
Author ezio.melotti
Recipients ezio.melotti, loewis, vstinner
Date 2011-09-29.01:38:19
Message-id <1317260300.62.0.299015504355.issue13056@psf.upfronthosting.co.za>
In-reply-to
Content
The test at Lib/test/test_multibytecodec.py:178 checks whether len('\U00012345') == 2, and with PEP 393 this is always False.
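As a rough illustration of why that check can no longer pass, here is a minimal sketch. It assumes Python 3.3 or later, where strings are sequences of code points rather than UTF-16 code units; the variable names are made up for the example.

```python
# Assumption: Python 3.3+ (PEP 393 strings), so there are no narrow builds.
ch = '\U00012345'

# A non-BMP character is a single code point, not a surrogate pair.
length = len(ch)

# Indexing therefore returns the whole character, not a high surrogate.
first = ch[0]

# The gb18030 encoding of the character is four bytes.
encoded = ch.encode('gb18030')

print(length, first == ch, encoded)
```

On a narrow build before PEP 393, the same string had length 2 and `ch[0]` was a lone high surrogate, which is what the original test relied on.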
I tried to run the tests with a few changes and they seem to work, but the code doesn't raise any exception on c.reset():

---->8-------->8-------->8-------->8----
import io, codecs
s = io.BytesIO()
c = codecs.getwriter('gb18030')(s)
c.write('123'); s.getvalue()
c.write('\U00012345'); s.getvalue()
c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
c.write('\uac00'); s.getvalue()
c.reset()
s.getvalue()
---->8-------->8-------->8-------->8----

Result:
>>> import io, codecs
>>> s = io.BytesIO()
>>> c = codecs.getwriter('gb18030')(s)
>>> c.write('123'); s.getvalue()
b'123'
>>> c.write('\U00012345'); s.getvalue()
b'123\x907\x959'
>>> # '\U00012345'[0] is the same as '\U00012345' now
>>> c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851'
>>> c.write('\uac00'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'
>>> c.reset()  # is this supposed to raise an error?
>>> s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'

Victor suggested waiting until multibytecodec is ported to the new API before fixing this.
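In the meantime, one way to sanity-check the writer without depending on surrogate pairs might look like the sketch below. It is not the fix that was applied to the test; `write_chunks` is a hypothetical helper, and the sketch assumes Python 3.3+ and that the gb18030 writer keeps no pending state between writes of whole characters.

```python
import codecs
import io


def write_chunks(chunks, encoding='gb18030'):
    """Hypothetical helper: push each chunk through a StreamWriter,
    call reset(), and return everything written to the stream."""
    stream = io.BytesIO()
    writer = codecs.getwriter(encoding)(stream)
    for chunk in chunks:
        writer.write(chunk)
    # With whole code points only, reset() is not expected to raise here.
    writer.reset()
    return stream.getvalue()


# The same strings as in the session above.
chunks = ['123', '\U00012345', '\U00012345' + '\uac00\u00ac', '\uac00']

streamed = write_chunks(chunks)
one_shot = ''.join(chunks).encode('gb18030')

# Incremental output should match encoding the joined string in one go.
print(streamed == one_shot)
```

Comparing against a one-shot encode avoids hard-coding byte strings, at the cost of no longer exercising the incomplete-surrogate path that the original test was written for.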
History
Date                 User          Action  Args
2011-09-29 01:38:20  ezio.melotti  set     recipients: + ezio.melotti, loewis, vstinner
2011-09-29 01:38:20  ezio.melotti  set     messageid: <1317260300.62.0.299015504355.issue13056@psf.upfronthosting.co.za>
2011-09-29 01:38:19  ezio.melotti  link    issue13056 messages
2011-09-29 01:38:19  ezio.melotti  create