classification
Title: words able to decode but unable to encode in GB18030
Type:
Stage:
Components: Unicode
Versions: Python 2.5
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder:
Assigned To: hyeshik.chang
Nosy List: hyeshik.chang, nnorwitz, zaex
Priority: normal
Keywords:

Created on 2007-08-09 01:34 by zaex, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: python25_GB18030_cant_encode
Uploaded: zaex, 2007-08-09 01:34
Description: The file containing the words able to decode but unable to encode in GB18030
Messages (4)
msg32611 - Author: Z-flagship (zaex) Date: 2007-08-09 01:34
Here is a list of Chinese characters that can be read from a GB18030-encoded file but cannot be encoded back to GB18030.

Details:
I used codecs.open(r'file name', encoding='GB18030') to read the characters from a file, then tried to encode them one by one into GB18030 with word.encode('GB18030'). The encode call raised an exception with the message 'illegal multibyte sequence'.

The attachment contains the same list.

List:
䎬䎱䅟䌷䦟䦷䲠㧏㭎㘚㘎㱮䴔䴖䴗䦆㧟䙡䙌䴕䁖䎬䴙䥽䝼䞍䓖䲡䥇䦂䦅䴓㩳㧐㳠䲢䴘㖞䜣䥺䶮䜩䥺䲟䲣䦛䦶㑳㑇㥮㤘䏝䦃
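
The round-trip check described in this report can be sketched roughly as follows. This is an illustration only, written for Python 3 rather than the Python 2.5 used in the report, and it uses the built-in open() in place of codecs.open(). The helper names find_unencodable and check_file are hypothetical and do not come from the original report.

```python
def find_unencodable(text, encoding="gb18030"):
    """Return the characters of `text` that cannot be encoded with `encoding`."""
    failures = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # The report describes an 'illegal multibyte sequence' error
            # at this step on the affected Python 2.5 build.
            failures.append(ch)
    return failures


def check_file(path, encoding="gb18030"):
    """Decode a file with `encoding`, then list characters that fail to re-encode."""
    with open(path, encoding=encoding) as handle:
        return find_unencodable(handle.read(), encoding)
```

Calling check_file() on a local copy of the attached file (the path is an assumption) returns an empty list when every decoded character encodes back to GB18030, and otherwise returns the offending characters.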
msg32612 - Author: Z-flagship (zaex) Date: 2007-08-09 01:37
The Python version is 2.5, and my OS is Windows XP Professional SP2, version 2002.
msg32613 - Author: Neal Norwitz (nnorwitz) (Python committer) Date: 2007-08-10 03:35
This seems like a CJK problem. Hye-Shik, could you take a look?
msg32614 - Author: Hyeshik Chang (hyeshik.chang) (Python committer) Date: 2007-08-12 15:18
The problem was fixed about a week ago (r56727-8).
The fix will be included in the forthcoming Python releases. Thank you for reporting!
History
Date User Action Args
2022-04-11 14:56:25 admin set github: 45292
2007-08-09 01:34:16 zaex create