Message280720
Hi John, thanks for your contribution,
Looks like your implementation is missing some codepoints, like "\t":
>>> print("\t".encode(encoding='iso6937'))
[...]
UnicodeError: encoding with 'iso6937' codec failed (UnicodeError: Unacceptable utf-8 character)
This is probably due to the `range(0x20, …`; why `0x20`?
You also have problems decoding multibyte sequences, because you don't have the `else: … result += chr(c[0])` in this case. So typically, decoding `\xc2\x20` will raise a `KeyError`, as `\x20` is _not_ in your decoding table.
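To illustrate the kind of fallback meant here, this is a minimal sketch and not the code from the patch. The two tables are hypothetical and deliberately tiny, and it assumes, for illustration only, that bytes below 0x80 (including controls such as `\t`) decode to the same code points. An unmapped sequence ends in an explicit `UnicodeDecodeError` carrying the offset, rather than a bare `KeyError`:

```python
# Hypothetical, deliberately tiny tables; a real ISO 6937 mapping is much larger.
MULTIBYTE = {
    b"\xc2e": "\u00e9",  # acute accent followed by "e" -> "é"
}
# Assumption for this sketch: bytes 0x00-0x7F map to the same code points.
SINGLE_BYTE = {i: chr(i) for i in range(0x80)}


def decode_sketch(data: bytes) -> str:
    """Decode bytes, trying a two-byte lookup before a one-byte fallback."""
    result = []
    i = 0
    while i < len(data):
        pair = data[i:i + 2]
        if pair in MULTIBYTE:
            # Known two-byte sequence: consume both bytes.
            result.append(MULTIBYTE[pair])
            i += 2
        elif data[i] in SINGLE_BYTE:
            # Fallback branch: treat the byte on its own.
            result.append(SINGLE_BYTE[data[i]])
            i += 1
        else:
            # Report the failing offset instead of leaking a KeyError.
            raise UnicodeDecodeError(
                "iso6937-sketch", data, i, i + 1, "unmapped byte sequence"
            )
    return "".join(result)
```

With these tables, `decode_sketch(b"a\tb")` passes the tab through unchanged, and `decode_sketch(b"\xc2\x20")` fails with a `UnicodeDecodeError` at offset 0 because the sketch table has no entry for that pair.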
Also, please make your contribution conform to PEP 8: you're missing spaces after commas, and you sometimes indent with 8 spaces instead of 4.
I implemented a simple checker based on glibc localedata; it clearly shows your decoding problems step by step, and should be easy to extend to check your encoding function too (see attachment). It uses the ISO6937 file typically found in the Debian locales package or via 'apt-get source glibc'.
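For readers without the attachment, the idea can be sketched roughly as follows. This is not the attached checker: `parse_charmap`, `find_mismatches`, and `SAMPLE` are hypothetical names, and the sample lines are written in the style of a glibc charmap (`<Uxxxx>`, then `/x..` bytes, then a name) rather than copied from the real file.

```python
import re

# Matches lines such as "<U0041>     /x41         LATIN CAPITAL LETTER A".
CHARMAP_LINE = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+((?:/x[0-9A-Fa-f]{2})+)")

# Illustrative lines in charmap style (an assumption, not the real file content).
SAMPLE = [
    "CHARMAP",
    "<U0009>     /x09         CHARACTER TABULATION",
    "<U0041>     /x41         LATIN CAPITAL LETTER A",
    "<U00E9>     /xc2/x65     LATIN SMALL LETTER E WITH ACUTE",
    "END CHARMAP",
]


def parse_charmap(lines):
    """Return a {bytes: str} table built from charmap-style lines."""
    table = {}
    for line in lines:
        match = CHARMAP_LINE.match(line)
        if match is None:
            continue  # Skip headers, comments, and anything unrecognised.
        code_point = int(match.group(1), 16)
        raw = bytes.fromhex(match.group(2).replace("/x", ""))
        table[raw] = chr(code_point)
    return table


def find_mismatches(table, decode):
    """List (raw, expected, actual) wherever decode(raw) differs or raises."""
    mismatches = []
    for raw, expected in sorted(table.items()):
        try:
            actual = decode(raw)
        except Exception as exc:  # Keep going so every failure is reported.
            actual = exc
        if actual != expected:
            mismatches.append((raw, expected, actual))
    return mismatches
```

Passing the codec's decode function as `decode` would list each byte sequence it gets wrong. The same table, inverted, could drive a check of the encoding direction.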
Date | User | Action | Args
2016-11-13 22:11:01 | mdk | set | recipients: + mdk, lemburg, loewis, serhiy.storchaka, John Helour
2016-11-13 22:11:01 | mdk | set | messageid: <1479075061.19.0.714591067396.issue24339@psf.upfronthosting.co.za>
2016-11-13 22:11:01 | mdk | link | issue24339 messages
2016-11-13 22:11:01 | mdk | create |