classification
Title: raw_bytes.decode('cp932') -- spurious mappings
Type: behavior Stage: resolved
Components: Unicode Versions: Python 3.1, Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, loewis, sjmachin
Priority: normal Keywords:

Created on 2010-04-03 23:40 by sjmachin, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg102308 - Author: John Machin (sjmachin) Date: 2010-04-03 23:40
According to the following references, the bytes 80, A0, FD, FE, and FF are not defined in cp932:

http://msdn.microsoft.com/en-au/goglobal/cc305152.aspx
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
http://demo.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL

However CPython 3.1.2 does this:

 >>> print(ascii(b'\x80\xa0\xfd\xfe\xff'.decode('cp932')))
 '\x80\uf8f0\uf8f1\uf8f2\uf8f3'

(as do 2.5, 2.6, and 2.7 with the appropriate syntax)

This maps 80 to U+0080 (not very useful) and maps the other four bytes (A0, FD, FE, FF) into the Private Use Area ("PUA") at U+F8F0 through U+F8F3. Each case should be treated as undefined/unexpected/error/...
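For anyone who wants to detect or reject these results today, here is a minimal sketch. The names `SPURIOUS`, `find_spurious` and `decode_cp932_strict` are hypothetical helpers, not part of the codecs API. The sketch checks the decoded text rather than the raw bytes, because 0x80 and 0xA0 are also legal trail bytes in double-byte sequences. It assumes that no other cp932 byte sequence decodes to these five code points.

```python
# Code points that CPython's cp932 decoder produces for the single bytes
# 0x80, 0xA0, 0xFD, 0xFE and 0xFF, as reported in this issue.
SPURIOUS = frozenset("\x80\uf8f0\uf8f1\uf8f2\uf8f3")


def find_spurious(text):
    """Return (index, 'U+XXXX') pairs for disputed characters in decoded text."""
    return [(i, "U+%04X" % ord(ch)) for i, ch in enumerate(text) if ch in SPURIOUS]


def decode_cp932_strict(raw):
    """Hypothetical helper: decode cp932 but reject the five disputed mappings.

    Assumption: U+0080 and U+F8F0..U+F8F3 only ever come from the single
    bytes discussed above, so finding them in the output means one of those
    bytes was present in the input.
    """
    text = raw.decode("cp932")
    hits = find_spurious(text)
    if hits:
        raise ValueError("undefined cp932 byte(s) decoded to %r" % (hits,))
    return text
```

On the interpreters discussed here, `find_spurious(b'\x80\xa0\xfd\xfe\xff'.decode('cp932'))` should flag all five characters, and `decode_cp932_strict` raises `ValueError` for the same input.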
msg102321 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-04 06:59
This mapping conforms to the de facto standard for that encoding, Microsoft Windows; see:

http://www.autumn.org/etc/unidif.html
http://mail.python.org/pipermail/i18n-sig/2003-June/001598.html
http://homepage1.nifty.com/nomenclator/perl/ShiftJIS-CP932-MapUTF.html
msg102334 - Author: John Machin (sjmachin) Date: 2010-04-04 11:56
Thanks, Martin. Issue closed as far as I'm concerned.
History
Date                 User            Action  Args
2022-04-11 14:56:59  admin           set     github: 52555
2010-04-04 13:21:54  r.david.murray  set     status: open -> closed; resolution: wont fix; stage: test needed -> resolved
2010-04-04 11:56:40  sjmachin        set     messages: + msg102334
2010-04-04 06:59:13  loewis          set     nosy: + loewis; messages: + msg102321
2010-04-03 23:44:02  ezio.melotti    set     priority: normal; nosy: + ezio.melotti; stage: test needed
2010-04-03 23:40:16  sjmachin        create