classification
Title: test_multibytecodec.py:TestStreamWriter is skipped after PEP393
Type: behavior
Stage: resolved
Components: Tests, Unicode
Versions: Python 3.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, loewis, python-dev, serhiy.storchaka, vstinner
Priority: normal
Keywords: 3.3regression

Created on 2011-09-29 01:38 by ezio.melotti, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg144583 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-29 01:38
The test at Lib/test/test_multibytecodec.py:178 checks for len('\U00012345') == 2, and with PEP393 this is always False.
I tried to run the tests with a few changes and they seem to work, but the code doesn't raise any exception on c.reset():

---->8-------->8-------->8-------->8----
import io, codecs
s = io.BytesIO()
c = codecs.getwriter('gb18030')(s)
c.write('123'); s.getvalue()
c.write('\U00012345'); s.getvalue()
c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
c.write('\uac00'); s.getvalue()
c.reset()
s.getvalue()
---->8-------->8-------->8-------->8----

Result:
>>> import io, codecs
>>> s = io.BytesIO()
>>> c = codecs.getwriter('gb18030')(s)
>>> c.write('123'); s.getvalue()
b'123'
>>> c.write('\U00012345'); s.getvalue()
b'123\x907\x959'
>>> # '\U00012345'[0] is the same as '\U00012345' now
>>> c.write('\U00012345' + '\uac00\u00ac'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851'
>>> c.write('\uac00'); s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'
>>> c.reset()  # is this supposed to raise an error?
>>> s.getvalue()
b'123\x907\x959\x907\x959\x827\xcf5\x810\x851\x827\xcf5'

Victor suggested waiting until multibytecodec is ported to the new API before fixing this.
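
A minimal sketch of why the skipped check can no longer be true, assuming CPython 3.3 or later (PEP 393). The variable name `ch` is only for illustration:

```python
# Assumes CPython 3.3+ (PEP 393): a non-BMP character is a single code point,
# not a surrogate pair, so the old "len(...) == 2" guard never passes.
ch = '\U00012345'

print(len(ch))                # 1 on PEP 393 builds (2 on old narrow builds)
print(ch[0] == ch)            # indexing returns the whole character
print(ch.encode('gb18030'))   # same four bytes the StreamWriter emitted above
```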
msg171346 - Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-26 16:47
Victor, do you know if multibytecodec has been ported to the new API yet?
If I remove the "if", I still get a failure.

test test_multibytecodec failed -- Traceback (most recent call last):
  File "/home/wolf/dev/py/py3k/Lib/test/test_multibytecodec.py", line 187, in test_gb18030
    self.assertEqual(s.getvalue(), b'123\x907\x959')
AssertionError: b'123\x907\x959\x907\x959' != b'123\x907\x959'
msg171347 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-09-26 16:57
> Victor, do you know if multibytecodec has been ported to the new API yet?

No, it has not. CJK codecs still use the legacy API (Py_UNICODE).
msg184186 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-03-14 20:14
I think these tests make no sense after PEP393. They test that StreamWriter works when a non-BMP character is split inside its surrogate pair, i.e. that c.write(s[:i]); c.write(s[i:]) is always the same as c.write(s), even if i splits s inside a surrogate pair. This case is impossible after PEP393.
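
A rough sketch of the property described above, assuming CPython 3.3 or later. The helper `written` and the sample string are hypothetical and not taken from the test suite:

```python
import codecs
import io

def written(chunks, encoding='gb18030'):
    """Hypothetical helper: write each chunk through a StreamWriter
    and return all bytes that reached the underlying stream."""
    buf = io.BytesIO()
    writer = codecs.getwriter(encoding)(buf)
    for chunk in chunks:
        writer.write(chunk)
    writer.reset()
    return buf.getvalue()

s = '123\U00012345\uac00\u00ac'
whole = written([s])

# After PEP 393 every index falls on a code point boundary, so no split
# can land inside a surrogate pair; each split should match the whole write.
all_equal = all(written([s[:i], s[i:]]) == whole for i in range(1, len(s)))
print(all_equal)
```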
msg186591 - Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-11 20:41
New changeset 78cd09d2f908 by Victor Stinner in branch 'default':
Issue #13056: Reenable test_multibytecodec.Test_StreamWriter tests
http://hg.python.org/cpython/rev/78cd09d2f908
msg186592 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-04-11 20:45
CJK decoders have used the new Unicode API since changeset bcecf3910162.

"I think these tests have no sense after PEP393. They tests that StreamWriter works with non-BMP characters broken inside surrogate pair. I.e. c.write(s[:i]); c.write(s[i:]) always is same as c.write(s), even if i breaks s inside a surrogate pair. This case is impossible after PEP393."

I re-enabled the tests, but simplified them to remove the parts related to surrogate pairs.

The tests are shorter than before, but that is better than no tests at all.

Can I close the issue, or does someone want to improve these tests?
History
Date                 User              Action  Args
2022-04-11 14:57:22  admin             set     github: 57265
2013-04-11 20:58:25  ezio.melotti      set     status: open -> closed; stage: needs patch -> resolved; resolution: fixed; versions: - Python 3.3
2013-04-11 20:45:48  vstinner          set     messages: + msg186592
2013-04-11 20:41:22  python-dev        set     nosy: + python-dev; messages: + msg186591
2013-03-14 20:14:47  serhiy.storchaka  set     messages: + msg184186
2013-03-14 03:49:04  ezio.melotti      set     nosy: + serhiy.storchaka; versions: + Python 3.4
2012-09-26 16:57:10  vstinner          set     messages: + msg171347
2012-09-26 16:47:21  ezio.melotti      set     keywords: + 3.3regression; messages: + msg171346
2011-09-29 01:47:23  vstinner          set     components: + Unicode
2011-09-29 01:38:20  ezio.melotti      create