Message144583
The test at Lib/test/test_multibytecodec.py:178 checks whether len('\U00012345') == 2, and with PEP 393 this is always False.
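As a quick illustration of why that check can no longer pass, here is a minimal sketch, assuming CPython 3.3 or later (where PEP 393 applies). The variable names are only for illustration and are not taken from the test file:

```python
# Assumption: CPython 3.3+ (PEP 393), where str is indexed by code point.
s = '\U00012345'

# Number of code points in the string; no longer depends on the build.
code_points = len(s)

# Number of UTF-16 code units (a surrogate pair), which is what len()
# reported on the old narrow builds.
utf16_units = len(s.encode('utf-16-le')) // 2

print(code_points, utf16_units)

# Indexing yields the whole character, not a lone surrogate half.
print(s[0] == s)
```

If the test needs a two-unit view of the character, it has to ask for it explicitly (for example through a UTF-16 encoding), since len() no longer exposes it.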
I tried to run the tests with a few changes and they seem to work, but the code doesn't raise any exception on c.reset():
---->8-------->8-------->8-------->8----
import io, codecs
s = io.BytesIO()
c = codecs.getwriter('gb18030')(s)
c.write('123'); s.getvalue()
c.write('\U00012345'); s.getvalue()
c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
c.write('\uac00'); s.getvalue()
c.reset()
s.getvalue()
---->8-------->8-------->8-------->8----
Result:
>>> import io, codecs
>>> s = io.BytesIO()
>>> c = codecs.getwriter('gb18030')(s)
>>> c.write('123'); s.getvalue()
b'123'
>>> c.write('\U00012345'); s.getvalue()
b'123\x907\x959'
>>> # '\U00012345'[0] is the same as '\U00012345' now
>>> c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851'
>>> c.write('\uac00'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'
>>> c.reset() # is this supposed to raise an error?
>>> s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'
Victor suggested waiting until multibytecodec gets ported to the new API before fixing this.
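For reference, the session above could be turned into an assertion-based check along the following lines. This is only a sketch, assuming CPython 3.3 or later; the helper name stream_encode is hypothetical and is not part of the test suite. It only covers the case where every write contains complete characters, so it says nothing about whether reset() should raise when a partial character is pending:

```python
import codecs
import io


def stream_encode(chunks, encoding='gb18030'):
    """Write chunks through a StreamWriter (hypothetical helper).

    Returns the buffer contents before and after calling reset().
    """
    buf = io.BytesIO()
    writer = codecs.getwriter(encoding)(buf)
    for chunk in chunks:
        writer.write(chunk)
    before = buf.getvalue()
    # Every chunk holds complete characters, so nothing is pending here.
    writer.reset()
    return before, buf.getvalue()


# Same writes as in the interactive session.
chunks = ['123', '\U00012345', '\U00012345' + '\uac00\u00ac', '\uac00']
before, after = stream_encode(chunks)

# reset() neither raises nor emits extra bytes in this scenario, and the
# streamed output matches a one-shot encode of the concatenated input.
assert before == after
assert before == ''.join(chunks).encode('gb18030')
```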
Date | User | Action | Args
2011-09-29 01:38:20 | ezio.melotti | set | recipients: + ezio.melotti, loewis, vstinner
2011-09-29 01:38:20 | ezio.melotti | set | messageid: <1317260300.62.0.299015504355.issue13056@psf.upfronthosting.co.za>
2011-09-29 01:38:19 | ezio.melotti | link | issue13056 messages
2011-09-29 01:38:19 | ezio.melotti | create |