
Author vstinner
Recipients amaury.forgeotdarc, loewis, ocean-city, vstinner
Date 2011-06-10.13:44:49
Message-id <1307713489.98.0.651161906104.issue12281@psf.upfronthosting.co.za>
In-reply-to
Content
Example on Windows Vista with ANSI=cp932:

>>> import codecs
>>> codecs.code_page_encode(1252, '\xe9')
(b'\xe9', 1)
>>> codecs.mbcs_encode('\xe9')
...
UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character
>>> codecs.code_page_encode(932, '\xe9')
...
UnicodeEncodeError: 'cp932' codec can't encode characters in position 0--1: invalid character
>>> codecs.code_page_encode(932, '\xe9', 'replace')
(b'e', 1)
>>> codecs.code_page_encode(932, '\xe9', 'ignore')
(b'', 8)
>>> codecs.code_page_encode(932, '\xe9', 'backslashreplace')
(b'\\xe9', 8)

You can use a code page other than the ANSI code page, as with 1252 in the first call above.

The encoding name is generated from the code page number: "cp%u" % code_page, or "mbcs" if code_page == CP_ACP.
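
A minimal Python sketch of that naming rule, for illustration only. The patch itself does this in C; the helper name encoding_name is hypothetical, and CP_ACP is taken to be 0, as in the Windows API headers.

```python
# Windows API constant that selects the system ANSI code page.
CP_ACP = 0


def encoding_name(code_page):
    """Return the codec name reported in error messages for a code page.

    Hypothetical helper mirroring the rule described above; it is not
    part of the codecs module.
    """
    if code_page == CP_ACP:
        return "mbcs"
    # %u formats the number as a plain decimal integer, same as %d.
    return "cp%u" % code_page


if __name__ == "__main__":
    for cp in (CP_ACP, 932, 1252):
        print(cp, encoding_name(cp))
```

This is why the errors above mention 'cp932' for code_page_encode(932, ...) but 'mbcs' for mbcs_encode().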

(Oops, I left a printf() in mbcs2.patch by mistake.)