
Author mjacob
Recipients mjacob
Date 2020-07-06.17:33:53
Message-id <1594056833.64.0.146611581422.issue41221@roundup.psfhosted.org>
Content
Without unbuffered mode, it works as expected:

% python -c "import sys; sys.stdout.write('x'*4294967296)" | wc -c        
4294967296

% python -c "import sys; print('x'*4294967296)" | wc -c 
4294967297

With unbuffered mode, writes get truncated to 2147479552 bytes on my Linux machine:

% python -u -c "import sys; sys.stdout.write('x'*4294967296)" | wc -c           
2147479552

% python -u -c "import sys; print('x'*4294967296)" | wc -c 
2147479553
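
The truncated size is not arbitrary. It equals 0x7ffff000, which the Linux write(2) manual page gives as the most a single `write()` call will transfer. The arithmetic below is only a sanity check on the numbers from the transcripts above; the variable names are illustrative, and the reading that exactly one system call succeeded is an inference, not something the transcripts prove.

```python
# Numbers taken from the shell transcripts above.
requested = 4294967296   # characters passed to sys.stdout.write()
observed = 2147479552    # bytes counted by `wc -c` under `python -u`

# The request is exactly 4 GiB.
assert requested == 2**32

# The observed count matches the Linux per-call write limit (0x7ffff000),
# which is 2 GiB minus one 4096-byte page.
assert observed == 0x7FFFF000 == 2**31 - 4096

# Bytes that never reached the pipe.
missing = requested - observed
print(missing)
```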

I didn’t try, but it’s probably an even bigger problem on Windows, where writes might be limited to 32767 bytes: https://github.com/python/cpython/blob/v3.9.0b4/Python/fileutils.c#L1585

Without unbuffered mode, `sys.stdout.buffer` is a `io.BufferedWriter` object.

% python -c 'import sys; print(sys.stdout.buffer)'
<_io.BufferedWriter name='<stdout>'>

With unbuffered mode, `sys.stdout.buffer` is a `io.FileIO` object.

% python -u -c 'import sys; print(sys.stdout.buffer)' 
<_io.FileIO name='<stdout>' mode='wb' closefd=False>
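
The same distinction can be checked from code rather than by reading the repr. The helper below is a minimal sketch; `write_contract` is a hypothetical name, and the strings it returns simply paraphrase the documented behaviour of the two `io` base classes.

```python
import io
import os


def write_contract(binary_stream):
    """Classify a binary stream by the write() guarantee of its io base class."""
    if isinstance(binary_stream, io.BufferedIOBase):
        return "buffered: write() accepts all bytes"
    if isinstance(binary_stream, io.RawIOBase):
        return "raw: write() may accept fewer bytes"
    return "unknown"


# io.BytesIO derives from io.BufferedIOBase.
buffered_kind = write_contract(io.BytesIO())

# io.FileIO derives from io.RawIOBase; os.devnull is used as a harmless target.
with io.FileIO(os.devnull, "wb") as raw:
    raw_kind = write_contract(raw)

print(buffered_kind)
print(raw_kind)
```

Calling `write_contract(sys.stdout.buffer)` should report the buffered case by default and the raw case under `python -u`, in line with the two reprs shown above.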

`io.BufferedWriter` implements the `io.BufferedIOBase` interface, and `io.BufferedIOBase.write()` is documented to write all bytes it is passed. `io.FileIO` implements the `io.RawIOBase` interface, and `io.RawIOBase.write()` is documented to be allowed to write fewer bytes than it is passed.
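
The difference between the two contracts can be reproduced at small scale without a 4 GiB string. The sketch below uses a hypothetical `ShortWriteRaw` class that accepts at most `limit` bytes per call, standing in for the operating system limit; it is an illustration of the documented interfaces, not the code path CPython uses for `sys.stdout`.

```python
import io


class ShortWriteRaw(io.RawIOBase):
    """Hypothetical raw stream that accepts at most `limit` bytes per write()."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        # b may be bytes or a memoryview; keep only the first `limit` bytes.
        chunk = bytes(b[:self.limit])
        self.data += chunk
        return len(chunk)  # may be smaller than len(b)


payload = b"x" * 10

# Raw contract: a single call may write only part of the payload.
raw = ShortWriteRaw(limit=4)
raw_written = raw.write(payload)

# Buffered contract: BufferedWriter keeps calling raw.write() until done.
raw2 = ShortWriteRaw(limit=4)
buffered = io.BufferedWriter(raw2)
buffered_written = buffered.write(payload)
buffered.flush()

print(raw_written, len(raw.data))
print(buffered_written, len(raw2.data))
```

A caller that ignores the return value of the raw `write()` loses the remaining bytes silently, which is the pattern the truncated output above suggests.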

`io.TextIOWrapper.write()` is not documented to write every character it is passed, yet callers such as `print()` rely on it doing so.

To fix the problem, one of the following must hold:
* `sys.stdout.buffer` is an object that guarantees that all bytes passed to its `write()` method are written (e.g. deriving from `io.BufferedIOBase`), or
* `io.TextIOWrapper` calls the `write()` method of its underlying binary stream until all bytes have been written, or
* users of `io.TextIOWrapper` call `write()` until all characters have been written.
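
The second option amounts to a retry loop around the raw `write()` call. The sketch below shows that loop at the bytes level; `write_all` and `ChunkedRaw` are hypothetical names introduced for illustration, and this is not a proposed patch to `io.TextIOWrapper`.

```python
import io


class ChunkedRaw(io.RawIOBase):
    """Hypothetical stand-in for a raw stream with a per-call write limit."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.calls = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        chunk = bytes(b[:self.limit])
        self.data += chunk
        return len(chunk)


def write_all(raw, data):
    """Call raw.write() repeatedly until every byte of data is accepted."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        n = raw.write(view[total:])
        if n is None:
            # RawIOBase.write() returns None when a non-blocking stream
            # cannot accept any data; this sketch does not try to wait.
            raise BlockingIOError("raw stream cannot accept data right now")
        total += n
    return total


sink = ChunkedRaw(limit=3)
total = write_all(sink, b"x" * 10)
print(total, sink.calls)
```

The third option would place an equivalent loop in every caller, which is why the first two options look more practical.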

With either of the first two options, it probably makes sense to tighten the contract of `io.TextIOBase.write` so that it guarantees all passed characters are written.