
classification
Title: _pyio.StringIO doesn't work with lone surrogates
Type: behavior Stage: resolved
Components: IO, Unicode Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: benjamin.peterson, ezio.melotti, hynek, pitrou, python-dev, serhiy.storchaka, stutzbach, vstinner
Priority: normal Keywords: patch

Created on 2014-01-28 20:26 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
stringio_lone_surrogates.patch serhiy.storchaka, 2014-01-28 20:26 review
Messages (5)
msg209583 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-28 20:26
Unlike io.StringIO, _pyio.StringIO doesn't work with lone surrogates.

>>> import io, _pyio
>>> io.StringIO('\ud880')
<_io.StringIO object at 0xb71426ec>
>>> _pyio.StringIO('\ud880')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/_pyio.py", line 2065, in __init__
    self.write(initial_value)
  File "/home/serhiy/py/cpython/Lib/_pyio.py", line 1629, in write
    b = encoder.encode(s)
  File "/home/serhiy/py/cpython/Lib/encodings/utf_8.py", line 20, in encode
    return codecs.utf_8_encode(input, self.errors)[0]
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud880' in position 0: surrogates not allowed

The proposed patch adds support for lone surrogates to _pyio.StringIO.
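
The traceback above comes from the strict UTF-8 encoder that _pyio.StringIO uses internally. As a minimal sketch of the underlying behaviour (not the patch itself), the snippet below shows that the standard 'surrogatepass' error handler lets UTF-8 round-trip a lone surrogate, and that a TextIOWrapper over BytesIO configured with it accepts the same string. Treating this as the mechanism behind the fix is an assumption; the variable names are illustrative only.

```python
import io

lone = '\ud880'

# Strict UTF-8 encoding rejects a lone surrogate, as in the traceback.
try:
    lone.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The 'surrogatepass' error handler encodes and decodes it unchanged.
encoded = lone.encode('utf-8', 'surrogatepass')
roundtrip = encoded.decode('utf-8', 'surrogatepass')

# Assumption: a text wrapper over an in-memory byte buffer, configured
# with the same handler, stands in for a pure-Python StringIO here.
buf = io.TextIOWrapper(io.BytesIO(), encoding='utf-8',
                       errors='surrogatepass', newline='')
buf.write(lone)
buf.seek(0)
wrapped = buf.read()
```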
msg209588 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-01-28 21:31
I agree that StringIO should accept lone surrogates, since str += str accepts them.

The patch looks good, but please mention the issue number in the unit test. And add an empty line between the two parts of the test (reader, writer).
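
A check along the lines suggested here might look like the sketch below. It is not the committed test: the helper name check_lone_surrogate and the results dictionary are hypothetical, and the code only illustrates the requested layout (issue number in a comment, reader and writer parts separated by an empty line). On an interpreter that predates the fix, the _pyio call would raise UnicodeEncodeError instead.

```python
import io
import _pyio


def check_lone_surrogate(stringio_cls):
    # Issue #20424: lone surrogates must survive reading and writing.
    surrogate = '\ud800'

    # Reader part: the initial value is read back unchanged.
    reader = stringio_cls(surrogate)
    read_back = reader.read()

    # Writer part: a written value is returned by getvalue().
    writer = stringio_cls()
    writer.write(surrogate)
    written = writer.getvalue()

    return read_back == surrogate and written == surrogate


# Run the same check against the C and the pure-Python implementation.
results = {cls.__module__: check_lone_surrogate(cls)
           for cls in (io.StringIO, _pyio.StringIO)}
```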
msg209626 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-29 09:26
Thanks Victor.
msg209628 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-01-29 09:49
New changeset 6ca9ba9eb76b by Serhiy Storchaka in branch '3.3':
Issue #20424: Python implementation of io.StringIO now supports lone surrogates.
http://hg.python.org/cpython/rev/6ca9ba9eb76b

New changeset 483096ef1cf6 by Serhiy Storchaka in branch 'default':
Issue #20424: Python implementation of io.StringIO now supports lone surrogates.
http://hg.python.org/cpython/rev/483096ef1cf6
msg209629 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-29 09:52
The test is backported to 2.7 in 3971e1b07af4.
History
Date User Action Args
2022-04-11 14:57:57 admin set github: 64623
2014-01-29 09:52:37 serhiy.storchaka set status: open -> closed
resolution: fixed
messages: + msg209629
stage: patch review -> resolved
2014-01-29 09:49:31 python-dev set nosy: + python-dev
messages: + msg209628
2014-01-29 09:26:06 serhiy.storchaka set assignee: serhiy.storchaka
messages: + msg209626
2014-01-28 21:31:31 vstinner set messages: + msg209588
2014-01-28 20:26:55 serhiy.storchaka create