Author adamhj
Recipients Dmitry.Jemerov, Roman.Evstifeev, Vladimir Iofik, aclover, adamhj, brian.curtin, eric.araujo, frankoid, kaizhu, r.david.murray, tim.golden, vldmit, vstinner
Date 2013-11-14.13:26:38
Message-id <1384435598.62.0.337865541782.issue9291@psf.upfronthosting.co.za>
In-reply-to
Content
> The encoding is wrong. We should read the registry using Unicode, or at least use the correct encoding. The correct encoding is the ANSI code page: sys.getfilesystemencoding().

> Can you please try with: default_encoding = sys.getfilesystemencoding() ?

This does not work. In fact, the value of default_encoding does not matter. The variable ctype, which is returned by _winreg.EnumKey(), is a byte string (b'blahblah'), at least on my computer (win2k3sp2, Python 2.7.6). Because the interpreter is asked to encode a byte string, it first tries to convert the byte string to a unicode string by implicitly calling decode with the 'ascii' encoding, hence the UnicodeDecodeError.
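To illustrate the implicit step, here is a minimal sketch that emulates the Python 2 behaviour in code that also runs on Python 3. The helper name py2_style_encode and the sample key names are made up for the illustration; this is not the mimetypes code itself.

```python
def py2_style_encode(raw, encoding):
    """Emulate Python 2's str.encode(): implicit ASCII decode, then encode."""
    # Python 2 performs this decode implicitly when .encode() is called
    # on a byte string; it fails for any byte >= 0x80.
    text = raw.decode("ascii")
    return text.encode(encoding)


ascii_name = b"text/plain"
non_ascii_name = b"text/\xe4\xf6"  # hypothetical non-ASCII registry key name

# Pure ASCII input survives the round trip unchanged.
print(py2_style_encode(ascii_name, "utf-8"))

# The target encoding is never reached: the ASCII decode fails first,
# so the error is the same whatever encoding is requested.
for encoding in ("utf-8", "latin-1"):
    try:
        py2_style_encode(non_ascii_name, encoding)
    except UnicodeDecodeError as exc:
        print(encoding, "->", type(exc).__name__)
```

The failure comes from the decode step, which is why changing default_encoding makes no difference.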

The variable ctype, which is read from a registry key name, can be decoded correctly with sys.getfilesystemencoding() (which returns 'mbcs'), but what we actually need is a byte string, so there should be neither encoding nor decoding here.

If there is a case where _winreg.EnumKey() returns a unicode string, then a type check should be added before the encode. Or maybe the return type of _winreg.EnumKey() differs between 2.x and 3.x?
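A type check of that kind could look roughly like the sketch below. The helper name as_byte_string is hypothetical, and falling back to sys.getfilesystemencoding() is an assumption taken from the suggestion quoted above, not something confirmed as the right fix.

```python
import sys


def as_byte_string(name, encoding=None):
    """Return name as a byte string, encoding only when it is text."""
    if isinstance(name, bytes):
        # Already a byte string (str in Python 2): neither encode nor decode.
        return name
    # Text (unicode) input: encode explicitly with the chosen encoding.
    return name.encode(encoding or sys.getfilesystemencoding())


# Byte strings pass through untouched, including non-ASCII bytes.
print(as_byte_string(b"text/\xe4\xf6"))

# Text input is encoded with the encoding given by the caller.
print(as_byte_string(u"text/plain", "utf-8"))
```

Because bytes is an alias for str in Python 2.7, the same check covers both cases there as well.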
History
Date User Action Args
2013-11-14 13:26:38 adamhj set recipients: + adamhj, vstinner, tim.golden, eric.araujo, kaizhu, aclover, r.david.murray, brian.curtin, frankoid, Dmitry.Jemerov, vldmit, Vladimir Iofik, Roman.Evstifeev
2013-11-14 13:26:38 adamhj set messageid: <1384435598.62.0.337865541782.issue9291@psf.upfronthosting.co.za>
2013-11-14 13:26:38 adamhj link issue9291 messages
2013-11-14 13:26:38 adamhj create