WindowsConsoleIO readall() fails if first line starts with Ctrl+Z #72349

eryksun · 2016-09-15T03:25:18Z

BPO	28162
Nosy	@pfmoore, @tjguk, @zware, @eryksun, @zooba
PRs	[Do Not Merge] Convert `Misc/NEWS` so that it is managed by towncrier #552
Files	issue_28162_01.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/zooba'
closed_at = <Date 2016-10-08.19:44:31.832>
created_at = <Date 2016-09-15.03:25:17.633>
labels = ['3.7', 'type-bug', 'library', 'expert-IO', 'OS-windows']
title = 'WindowsConsoleIO readall() fails if first line starts with Ctrl+Z'
updated_at = <Date 2017-03-31.16:36:32.838>
user = 'https://github.com/eryksun'

bugs.python.org fields:

activity = <Date 2017-03-31.16:36:32.838>
actor = 'dstufft'
assignee = 'steve.dower'
closed = True
closed_date = <Date 2016-10-08.19:44:31.832>
closer = 'steve.dower'
components = ['Library (Lib)', 'Windows', 'IO']
creation = <Date 2016-09-15.03:25:17.633>
creator = 'eryksun'
dependencies = []
files = ['44766']
hgrepos = []
issue_num = 28162
keywords = ['patch']
message_count = 7.0
messages = ['276508', '276841', '276845', '276848', '277093', '278320', '278321']
nosy_count = 6.0
nosy_names = ['paul.moore', 'tim.golden', 'python-dev', 'zach.ware', 'eryksun', 'steve.dower']
pr_nums = ['552']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue28162'
versions = ['Python 3.6', 'Python 3.7']

The text was updated successfully, but these errors were encountered:

eryksun · 2016-09-15T03:25:17Z

For a console readall(), if the first line starts with '\x1a' (i.e. Ctrl+Z), it breaks out of its read loop before incrementing len. Thus the input isn't handled properly as EOF, for which the check requires len > 0. Instead it ends up calling WideCharToMultiByte with len == 0, which fails as follows:

    >>> sys.stdin.buffer.raw.read()
    ^Z
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [WinError 87] The parameter is incorrect

zooba · 2016-09-17T20:38:02Z

Unfortunately, very hard to test this because it requires interacting with an actual console. But it's an easy fix.

python-dev · 2016-09-17T20:51:44Z

New changeset d0bab9fda568 by Steve Dower in branch '3.6':
Issue bpo-28161: Opening CON for write access fails
https://hg.python.org/cpython/rev/d0bab9fda568

New changeset 187a114b9ef4 by Steve Dower in branch 'default':
Issue bpo-28161: Opening CON for write access fails
https://hg.python.org/cpython/rev/187a114b9ef4

zooba · 2016-09-17T20:54:16Z

The fix here was changing "n > 0 && buf[len] == '\x1a'" into "n == 0 || buf[len] == '\x1a'" when returning an empty bytes. Previously we were falling into the conversion code which wasn't happy with n == 0.

eryksun · 2016-09-21T05:46:31Z

For breaking out of the readall while loop, you only need to check if the current read is empty:

    /* when the read is empty we break */
    if (n == 0)
        break;

Also, the logic is wrong here:

    if (len == 0 || buf[0] == '\x1a' && _buflen(self) == 0) {
        /* when the result starts with ^Z we return an empty buffer */
        PyMem_Free(buf);
        return PyBytes_FromStringAndSize(NULL, 0);
    }

This is true when len is 0 or when buf[0] is Ctrl+Z and _buflen(self) is 0. Since buf[0] shouldn't ever be Ctrl+Z here (low-level EOF handling is abstracted in read_console_w), it's never checking the internal buffer. We can easily see this going wrong here:

    >>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
    Ā^Z
    >>> a
    b'\xc4'
    >>> b
    b''

It misses the remaining byte in the internal buffer.

This check can be simplified as follows:

    rn = _buflen(self);

    if (len == 0 && rn == 0) {
        /* return an empty buffer */
        PyMem_Free(buf);
        return PyBytes_FromStringAndSize(NULL, 0);
    }

After this the code assumes that len isn't 0, which leads to more WideCharToMultiByte failure cases.

In the last conversion it's overwrite bytes_size without including rn.

I'm not sure what's going on with _PyBytes_Resize(&bytes, n * sizeof(wchar_t)). ISTM, it should be resized to bytes_size, and make sure this includes rn.

Finally, _copyfrombuf is repeatedly overwriting buf[0] instead of writing to buf[n].

With the attached patch, the behavior seems correct now:

    >>> sys.stdin.buffer.raw.read()
    ^Z
    b''

    >>> sys.stdin.buffer.raw.read()
    abc^Z
    ^Z
    b'abc\x1a\r\n'

Split U+0100:

    >>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
    Ā^Z
    >>> a
    b'\xc4'
    >>> b
b'\x80'

Split U+1234:

    >>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
    ሴ^Z
    >>> a
    b'\xe1'
    >>> b
    b'\x88\xb4'

The buffer still can't handle splitting an initial non-BMP character, stored as a surrogate pair. Both codes end up as replacement characters because they aren't transcoded as a unit.

Split U+00010000:

    >>> a = sys.stdin.buffer.raw.read(1); b = sys.stdin.buffer.raw.read()
    𐀀^Z
    ^Z
    >>> a
    b'\xef'
    >>> b
    b'\xbf\xbd\xef\xbf\xbd\x1a\r\n'

python-dev · 2016-10-08T19:38:09Z

New changeset 4d4aefa52f49 by Steve Dower in branch '3.6':
Issue bpo-28162: Fixes Ctrl+Z handling in console readall()
https://hg.python.org/cpython/rev/4d4aefa52f49

New changeset 947fa496ca6f by Steve Dower in branch 'default':
Issue bpo-28162: Fixes Ctrl+Z handling in console readall()
https://hg.python.org/cpython/rev/947fa496ca6f

zooba · 2016-10-08T19:44:32Z

I didn't see your patch, but I made basically the same fix and added a test.

eryksun added 3.7 (EOL) end of life stdlib Python modules in the Lib dir OS-windows topic-IO type-bug An unexpected behavior, bug, or error labels Sep 15, 2016

zooba self-assigned this Sep 17, 2016

zooba closed this as completed Sep 17, 2016

eryksun reopened this Sep 21, 2016

zooba closed this as completed Oct 8, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WindowsConsoleIO readall() fails if first line starts with Ctrl+Z #72349

WindowsConsoleIO readall() fails if first line starts with Ctrl+Z #72349

eryksun commented Sep 15, 2016

eryksun commented Sep 15, 2016

zooba commented Sep 17, 2016

python-dev mannequin commented Sep 17, 2016

zooba commented Sep 17, 2016

eryksun commented Sep 21, 2016

python-dev mannequin commented Oct 8, 2016

zooba commented Oct 8, 2016

WindowsConsoleIO readall() fails if first line starts with Ctrl+Z #72349

WindowsConsoleIO readall() fails if first line starts with Ctrl+Z #72349

Comments

eryksun commented Sep 15, 2016

eryksun commented Sep 15, 2016

zooba commented Sep 17, 2016

python-dev mannequin commented Sep 17, 2016

zooba commented Sep 17, 2016

eryksun commented Sep 21, 2016

python-dev mannequin commented Oct 8, 2016

zooba commented Oct 8, 2016