Message 184045 - Python tracker


Author	amaury.forgeotdarc
Recipients	amaury.forgeotdarc, benjamin.peterson, hynek, pitrou, rbcollins, stutzbach
Date	2013-03-12.19:30:00
Message-id	<1363116600.72.0.586538004487.issue17404@psf.upfronthosting.co.za>

Content

> But that will still be within the TextIOWrapper itself, right?

Yes. And I just noticed that the _io module (the C version) will also buffer encoded bytes, up to f._CHUNK_SIZE.

On the other hand, TextIOWrapper is broken for buffering codecs: encode() is never called with final=True, so the text the encoder is still holding back never reaches the underlying buffer:

    >>> import io
    >>> buffer = io.BytesIO()      # <-- not really buffered, right?
    >>> output = io.TextIOWrapper(buffer, encoding='idna')
    >>> output.write("www.somesite.com")
    16
    >>> print(buffer.getvalue())
    b''                            # <-- ok, _CHUNK_SIZE buffering
    >>> output.flush()
    >>> print(buffer.getvalue())
    b'www.somesite.'               # <-- the last word is missing!
    >>> output.close()
    >>> print(buffer.getvalue())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: I/O operation on closed file.
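
The missing "com" can be traced to the codec itself. As a minimal sketch, the snippet below drives the idna incremental encoder by hand via codecs.getincrementalencoder instead of going through TextIOWrapper; the variable names are illustrative only. It shows that the encoder holds back the last label until it is called with final=True, which is the call the session above never triggers.

```python
import codecs

# Drive the idna incremental encoder directly (variable names are illustrative).
encoder = codecs.getincrementalencoder("idna")()

# Without final=True the encoder keeps the last label back, because more
# characters belonging to that label could still arrive.
first_chunk = encoder.encode("www.somesite.com")
print(first_chunk)   # b'www.somesite.'

# Only a call with final=True releases the label still held by the encoder.
final_chunk = encoder.encode("", final=True)
print(final_chunk)   # b'com'
```

So a wrapper around such a codec presumably has to issue one last encode call with final=True before it closes the underlying stream; otherwise the tail of the text is dropped.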


And it's even worse with Python 2.7, where write() rejects a plain str literal outright::

    >>> import io
    >>> buffer = io.BytesIO()
    >>> output = io.TextIOWrapper(buffer, encoding='idna')
    >>> output.write("www.somesite.com")
    Traceback (most recent call last):
      File "<stdin>", line 3, in <module>
    TypeError: must be unicode, not str
