Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

an assertion failure in io.TextIOWrapper.write #75454

Closed
orenmn mannequin opened this issue Aug 24, 2017 · 17 comments
Closed

an assertion failure in io.TextIOWrapper.write #75454

orenmn mannequin opened this issue Aug 24, 2017 · 17 comments
Labels
3.7 (EOL) end of life topic-IO type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@orenmn
Copy link
Mannequin

orenmn mannequin commented Aug 24, 2017

BPO 31271
Nosy @ncoghlan, @vstinner, @benjaminp, @serhiy-storchaka, @vedgar, @orenmn
PRs
  • bpo-31271: fix an assertion failure in io.TextIOWrapper.write #3201
  • [3.6] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. … #3209
  • [2.7] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) #3548
  • [2.7] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) #3611
  • [2.7] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) #3951
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-11-07.00:19:47.812>
    created_at = <Date 2017-08-24.18:06:29.191>
    labels = ['3.7', 'expert-IO', 'type-crash']
    title = 'an assertion failure in io.TextIOWrapper.write'
    updated_at = <Date 2017-11-07.00:19:47.811>
    user = 'https://github.com/orenmn'

    bugs.python.org fields:

    activity = <Date 2017-11-07.00:19:47.811>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-11-07.00:19:47.812>
    closer = 'vstinner'
    components = ['IO']
    creation = <Date 2017-08-24.18:06:29.191>
    creator = 'Oren Milman'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31271
    keywords = ['patch']
    message_count = 17.0
    messages = ['300795', '300801', '300805', '300829', '300837', '300838', '300840', '300853', '300863', '300884', '300887', '303548', '303558', '303579', '303899', '305695', '305696']
    nosy_count = 6.0
    nosy_names = ['ncoghlan', 'vstinner', 'benjamin.peterson', 'serhiy.storchaka', 'veky', 'Oren Milman']
    pr_nums = ['3201', '3209', '3548', '3611', '3951']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue31271'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7']

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Aug 24, 2017

    currently, the following causes an assertion in Modules/_io/textio.c in
    _io_TextIOWrapper_write_impl() to fail:
    import codecs
    import io

    class BadEncoder():
        def encode(self, dummy):
            return 42
    def _get_bad_encoder(dummy):
        return BadEncoder()
    
    quopri = codecs.lookup("quopri")
    quopri._is_text_encoding = True
    quopri.incrementalencoder = _get_bad_encoder
    t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="quopri")
    t.write('bar')

    this is because _io_TextIOWrapper_write_impl() doesn't check whether the value
    returned by encoder's encode() is a bytes object.

    (I would open a PR to fix that soon.)

    @orenmn orenmn mannequin added 3.7 (EOL) end of life topic-IO type-crash A hard crash of the interpreter, possibly with a core dump labels Aug 24, 2017
    @vedgar
    Copy link
    Mannequin

    vedgar mannequin commented Aug 24, 2017

    I can't reproduce this on 3.6.0. Is this on 3.7 only?

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Aug 24, 2017

    Just checked on current 3.6 on my Windows 10.
    The assertion failes, and it is in line 1337.
    oh my.

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Aug 25, 2017

    As Serhiy pointed out on github, the assertion failure can be easily reproduced
    by the following:

    import codecs
    import io
    
    rot13 = codecs.lookup("rot13")
    rot13._is_text_encoding = True
    t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="rot13")
    t.write('bar')

    @serhiy-storchaka
    Copy link
    Member

    Nick can be interested in this.

    @ncoghlan
    Copy link
    Contributor

    The proposed fix looks good to me, but it did make me wonder if we might have a missing check in the other direction as well. However, it looks like that case is already fine:

    >>> hex_codec = codecs.lookup("hex")
    >>> hex_codec._is_text_encoding = True
    >>> t = io.TextIOWrapper(io.BytesIO(b'foo'), encoding="hex")
    >>> t.buffer.write(b'abcd')
    4
    >>> t.read()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: decoder should return a string result, not 'bytes'
    

    @serhiy-storchaka
    Copy link
    Member

    May be make an error message more symmetric to the one in the decoder case?

    @serhiy-storchaka
    Copy link
    Member

    New changeset a5b4ea1 by Serhiy Storchaka (Oren Milman) in branch 'master':
    bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (bpo-3201)
    a5b4ea1

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Aug 26, 2017

    all three versions do 'self->pending_bytes_count += PyBytes_GET_SIZE(b);',
    while 'b' is the object the encoder returned.

    in 3.6 and 3.7, the implementation of PyBytes_GET_SIZE() includes
    'assert(PyBytes_Check(op))', but in 2.7, the implementation is 'PyString_GET_SIZE',
    which is just 'Py_SIZE(op)'.

    and so, in 2.7 there isn't an assertion failure. Moreover,
    'self->pending_bytes_count' is used only to determine whether a flush is needed.
    so ISTM that probably the bug's only risk is not flushing automatically after
    writing.
    note that whenever _textiowrapper_writeflush() is finally called (when the
    encoder returned a non-string object), it would raise a TypeError by calling
    string_join() on a non-string object.

    do you still think we should backport to 2.7?

    @serhiy-storchaka
    Copy link
    Member

    do you still think we should backport to 2.7?

    This is not trivial question. On one side, using PyString_GET_SIZE() with non-bytes object definitely is a bug. It is better to catch it earlier rather than hope on failing in the following code. On other side, adding a check for bytes can break existing user code that is passed now by accident. _PyBytes_Join() in 2.7 supports unicode objects.

    I think that the fix should be backported, and the proper fix should allow bytes and unicode objects. But I left the decision on Benjamin.

    @serhiy-storchaka
    Copy link
    Member

    New changeset 9bcbc6c by Serhiy Storchaka (Oren Milman) in branch '3.6':
    [3.6] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) (bpo-3209)
    9bcbc6c

    @serhiy-storchaka
    Copy link
    Member

    Oren, do you mind to create a backport to 2.7? miss-islington can not handle it.

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Oct 2, 2017

    sure

    @orenmn
    Copy link
    Mannequin Author

    orenmn mannequin commented Oct 3, 2017

    I am not sure, but ISTM that it isn't possible for the encoder to return a
    unicode and not fail later.
    This is because _textiowrapper_writeflush() would call _io.BytesIO.write()
    (after it called _PyBytes_Join()), and bytesio_write() calls PyObject_GetBuffer(),
    which would raise "TypeError: 'unicode' does not have the buffer interface".

    @serhiy-storchaka
    Copy link
    Member

    You are right. But in any case the test should be fixed for 2.7.

    @vstinner
    Copy link
    Member

    vstinner commented Nov 7, 2017

    New changeset 3053769 by Victor Stinner (Oren Milman) in branch '2.7':
    [2.7] bpo-31271: Fix an assertion failure in io.TextIOWrapper.write. (GH-3201) (bpo-3951)
    3053769

    @vstinner
    Copy link
    Member

    vstinner commented Nov 7, 2017

    Serhiy: "You are right. But in any case the test should be fixed for 2.7."

    Done: I merged Oren's PR 3951.

    It seems like the bug was fixed in all (maintained) branches, so I close the issue.

    Thank you again Oren Milman for *reporting* and *fixing* the bug!

    @vstinner vstinner closed this as completed Nov 7, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life topic-IO type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants