classification
Title: an assertion failure in io.TextIOWrapper.write
Type: crash Stage: resolved
Components: IO Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Oren Milman, benjamin.peterson, ncoghlan, serhiy.storchaka, veky, vstinner
Priority: normal Keywords: patch

Created on 2017-08-24 18:06 by Oren Milman, last changed 2017-11-07 00:19 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 3201 merged Oren Milman, 2017-08-24 19:08
PR 3209 merged Oren Milman, 2017-08-26 09:00
PR 3548 closed python-dev, 2017-09-13 17:16
PR 3611 closed python-dev, 2017-09-16 03:01
PR 3951 merged Oren Milman, 2017-10-11 11:53
Messages (17)
msg300795 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-24 18:06
currently, the following causes an assertion in Modules/_io/textio.c in
_io_TextIOWrapper_write_impl() to fail:
import codecs
import io

class BadEncoder():
    def encode(self, dummy):
        return 42
def _get_bad_encoder(dummy):
    return BadEncoder()

quopri = codecs.lookup("quopri")
quopri._is_text_encoding = True
quopri.incrementalencoder = _get_bad_encoder
t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="quopri")
t.write('bar')


this is because _io_TextIOWrapper_write_impl() doesn't check whether the value
returned by encoder's encode() is a bytes object.

(I would open a PR to fix that soon.)
msg300801 - (view) Author: Vedran Čačić (veky) * Date: 2017-08-24 20:11
I can't reproduce this on 3.6.0. Is this on 3.7 only?
msg300805 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-24 21:20
Just checked on current 3.6 on my Windows 10.
The assertion failes, and it is in line 1337.
oh my.
msg300829 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-25 07:48
As Serhiy pointed out on github, the assertion failure can be easily reproduced
by the following:

import codecs
import io

rot13 = codecs.lookup("rot13")
rot13._is_text_encoding = True
t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="rot13")
t.write('bar')
msg300837 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-25 13:20
Nick can be interested in this.
msg300838 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-25 13:36
The proposed fix looks good to me, but it did make me wonder if we might have a missing check in the other direction as well. However, it looks like that case is already fine:


```
>>> hex_codec = codecs.lookup("hex")
>>> hex_codec._is_text_encoding = True
>>> t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="hex")
>>> t.buffer.write(b'abcd')
4
>>> t.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: decoder should return a string result, not 'bytes'
```
msg300840 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-25 14:14
May be make an error message more symmetric to the one in the decoder case?
msg300853 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-25 18:14
New changeset a5b4ea15b61e3f3985f4f0748a18f8b888a63532 by Serhiy Storchaka (Oren Milman) in branch 'master':
bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (#3201)
https://github.com/python/cpython/commit/a5b4ea15b61e3f3985f4f0748a18f8b888a63532
msg300863 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-08-26 08:09
all three versions do 'self->pending_bytes_count += PyBytes_GET_SIZE(b);',
while 'b' is the object the encoder returned.

in 3.6 and 3.7, the implementation of PyBytes_GET_SIZE() includes
'assert(PyBytes_Check(op))', but in 2.7, the implementation is 'PyString_GET_SIZE',
which is just 'Py_SIZE(op)'.

and so, in 2.7 there isn't an assertion failure. Moreover,
'self->pending_bytes_count' is used only to determine whether a flush is needed.
so ISTM that probably the bug's only risk is not flushing automatically after
writing.
note that whenever _textiowrapper_writeflush() is finally called (when the
encoder returned a non-string object), it would raise a TypeError by calling
string_join() on a non-string object.

do you still think we should backport to 2.7?
msg300884 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-26 15:47
> do you still think we should backport to 2.7?

This is not trivial question. On one side, using PyString_GET_SIZE() with non-bytes object definitely is a bug. It is better to catch it earlier rather than hope on failing in the following code. On other side, adding a check for bytes can break existing user code that is passed now by accident. _PyBytes_Join() in 2.7 supports unicode objects.

I think that the fix should be backported, and the proper fix should allow bytes and unicode objects. But I left the decision on Benjamin.
msg300887 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-08-26 17:29
New changeset 9bcbc6cba385a83cac8f1ff430cad7617184d2bc by Serhiy Storchaka (Oren Milman) in branch '3.6':
[3.6] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) (#3209)
https://github.com/python/cpython/commit/9bcbc6cba385a83cac8f1ff430cad7617184d2bc
msg303548 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-02 16:07
Oren, do you mind to create a backport to 2.7? miss-islington can not handle it.
msg303558 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-10-02 18:50
sure
msg303579 - (view) Author: Oren Milman (Oren Milman) * Date: 2017-10-03 06:57
I am not sure, but ISTM that it isn't possible for the encoder to return a
unicode and not fail later.
This is because _textiowrapper_writeflush() would call _io.BytesIO.write()
(after it called _PyBytes_Join()), and bytesio_write() calls PyObject_GetBuffer(),
which would raise "TypeError: 'unicode' does not have the buffer interface".
msg303899 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-08 07:30
You are right. But in any case the test should be fixed for 2.7.
msg305695 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-07 00:17
New changeset 30537698b607d53fa9ce18522abb88469d5814b6 by Victor Stinner (Oren Milman) in branch '2.7':
[2.7] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) (#3951)
https://github.com/python/cpython/commit/30537698b607d53fa9ce18522abb88469d5814b6
msg305696 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-11-07 00:19
Serhiy: "You are right. But in any case the test should be fixed for 2.7."

Done: I merged Oren's PR 3951.

It seems like the bug was fixed in all (maintained) branches, so I close the issue.

Thank you again Oren Milman for *reporting* and *fixing* the bug!
History
Date User Action Args
2017-11-07 00:19:47vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg305696

stage: patch review -> resolved
2017-11-07 00:17:56vstinnersetnosy: + vstinner
messages: + msg305695
2017-10-11 11:53:30Oren Milmansetstage: backport needed -> patch review
pull_requests: + pull_request3926
2017-10-08 07:30:51serhiy.storchakasetmessages: + msg303899
2017-10-03 06:57:58Oren Milmansetmessages: + msg303579
2017-10-02 18:50:13Oren Milmansetmessages: + msg303558
2017-10-02 16:07:18serhiy.storchakasetmessages: + msg303548
stage: patch review -> backport needed
2017-09-16 03:01:08python-devsetpull_requests: + pull_request3601
2017-09-13 17:16:57python-devsetkeywords: + patch
stage: backport needed -> patch review
pull_requests: + pull_request3540
2017-08-26 17:29:42serhiy.storchakasetmessages: + msg300887
2017-08-26 15:47:24serhiy.storchakasetnosy: + benjamin.peterson
messages: + msg300884
2017-08-26 09:00:40Oren Milmansetpull_requests: + pull_request3247
2017-08-26 08:09:28Oren Milmansetmessages: + msg300863
2017-08-25 18:33:03serhiy.storchakasetstage: needs patch -> backport needed
2017-08-25 18:14:56serhiy.storchakasetmessages: + msg300853
2017-08-25 14:14:36serhiy.storchakasetmessages: + msg300840
2017-08-25 13:36:54ncoghlansetmessages: + msg300838
2017-08-25 13:29:45ncoghlansetversions: + Python 2.7, Python 3.6
2017-08-25 13:20:52serhiy.storchakasetnosy: + ncoghlan
messages: + msg300837
2017-08-25 07:48:40Oren Milmansetmessages: + msg300829
2017-08-24 21:20:05Oren Milmansetmessages: + msg300805
2017-08-24 20:11:35vekysetnosy: + veky
messages: + msg300801
2017-08-24 19:08:54Oren Milmansetpull_requests: + pull_request3240
2017-08-24 18:30:23serhiy.storchakasetnosy: + serhiy.storchaka

stage: needs patch
2017-08-24 18:06:29Oren Milmancreate