classification
Title: print(s) fails on Windows with long strings
Type: crash Stage:
Components: Windows Versions: Python 3.3, Python 3.2, Python 3.1, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, casevh, davidsarah, eryksun, haypo, neologix, pitrou, python-dev, santa4nt, terry.reedy
Priority: high Keywords: patch

Created on 2011-03-04 05:47 by casevh, last changed 2014-08-03 06:18 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
test_writeconsole.patch santa4nt, 2011-03-05 04:35
wconsole_large.patch santa4nt, 2011-03-06 00:20 review
test_wconsole_binlarge.patch santa4nt, 2011-03-06 00:52 review
winconsole_large_py33.patch santa4nt, 2011-03-06 22:58
winconsole_large_py33_direct.patch santa4nt, 2011-03-07 00:45 review
io_write.patch haypo, 2011-03-07 11:26 review
Messages (34)
msg130029 - (view) Author: Case Van Horsen (casevh) Date: 2011-03-04 05:47
Python 3.2 fails when printing long strings.

C:\Python32>python
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print("a"*66000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 12] Not enough space
>>>

Some observations:

1) 3.2 on Linux prints just fine.
2) 2.7.1 and 3.1.3 on Windows x64 are fine
3) The 32-bit interpreter for 3.2 also fails.
4) On 32-bit Windows, a length of 62733 works correctly but 62734, and higher, fail.
5) On 64-bit Windows, the output is visibly corrupted when the length reaches 62801 but the error does not occur until the length reaches 65536.
6) While experimenting with various lengths, I was able to crash the interpreter once.
msg130033 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 08:15
Likewise, this fails with 3.2::
   import os
   os.write(1, b"a" * 66000)
msg130034 - (view) Author: Charles-Fran├žois Natali (neologix) * (Python committer) Date: 2011-03-04 08:24
It's probably a Windows limitation regarding the number of bytes that can be written to stdout in one write.
As for the difference between python versions, what does
python -c "import sys; print(sys.getsizeof('a'))" return ?
msg130036 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 09:20
no, it works with 3.2b2 (r32b2:87398), and fails with 3.2 final (r32:88445)
msg130037 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-04 09:32
Extract of issue #1602:
<< WriteConsoleW has one bug that I know of, which is that it <a href="http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1232">fails when writing more than 26608 characters at once</a>. That's easy to work around by limiting the amount of data passed in a single call. >>

I suppose that os.write(1) does indirectly write into the Windows console which has this limit.
msg130038 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-04 09:33
Extract of the WriteConsole Function:

<< The storage for this buffer is allocated from a shared heap for the process that is 64 KB in size. The maximum size of the buffer will depend on heap usage. >>
http://msdn.microsoft.com/en-us/library/ms687401(VS.85).aspx

Ah ah, that's funny, "depend on heap usage" :-)
msg130039 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 09:35
This changed with r87824
msg130041 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-04 09:39
> This changed with r87824

Yes, I changed Python to open all files in binary mode. With Python < 3.2, you can open sys.std* streams in binary mode using -u command line option (u like unbuffered, not Unicode ;-)).
msg130042 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 09:50
Indeed, Python3.1 fails with the -u option.

Before r87824, the C call to write() performed CRLF conversion.  In the implementation of MSVCRT, a local buffer is used (1025 chars in vs8.0, 5*1024 in vs10.0), so WriteFile is called with small sizes.
Since r87824 (or with -u), no such conversion occurs, and WriteFile is called with the full buffer.
msg130043 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-04 10:00
Anyway, use os.write() to write unicode into the Windows console is not the right thing to do. We should use WriteConsoleW(): #1602 is the correct fix for this issue.
msg130044 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 10:06
I'm writing bytes here: os.write(1, b"b" * 66000)
And WriteConsole has the same issue.
msg130045 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-04 10:11
> And WriteConsole has the same issue.

print() (sys.stdout and sys.stderr) should use WriteConsoleW() and use small chunks (smaller than 64 KB, I don't know the safest size).
msg130047 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-03-04 11:47
IIUC, this is a Windows bug? Is there any easy workaround for us?
msg130048 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-04 12:31
It may be a windows bug, but it's also an a python regression!
A fix is to limit the number of chars:

===================================================================
--- D:/py3k/Modules/_io/fileio.c   (revision 87824)
+++ D:/py3k/Modules/_io/fileio.c   (copie de travail)
@@ -712,6 +712,8 @@
         errno = 0;
         len = pbuf.len;
 #if defined(MS_WIN64) || defined(MS_WINDOWS)
+        if (len > 32000 && isatty(self->fd))
+            len = 32000;
         if (len > INT_MAX)
             len = INT_MAX;
         n = write(self->fd, pbuf.buf, (int)len);

On my system, errors start at ~52200 (why?). I hope that 32K is low enough... MSVCRT's write() (version vs10.0) uses a buffer of 5K.
msg130095 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-03-05 00:45
print("a"*66000)

works (after some delay) running from an IDLE edit window (but see #144249). Works means that I get a working prompt back with no errors.

Unlike IDLE, the Command Prompt Windows keeps a limited number of lines in its buffers (default: 4x50), and may have a total size limit.
msg130100 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-05 04:35
I'm adding a test that will reproduce the crash.
msg130133 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-06 00:20
And a patch for the test + fix.
msg130135 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-06 00:52
> Indeed, Python3.1 fails with the -u option.

I'm also attaching another test to reproduce the crash with '-u' option.
msg130158 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-06 10:08
I did some tests: os.write(1, b'X'*length) does always fail with length >= 63842. It does sometimes fail with length > 35000. The maximum looks completly random: as written in Microsoft documentation, "The maximum size of the buffer will depend on heap usage"...

32000 looks arbitrary. I would prefer 2^15-1 (32767) because it looks less magical :-)
msg130186 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-06 20:13
Remarks about test_wconsole_binlarge.patch:
 - I don't know if isatty() is cheap or not. Is it a system call? If it might be slow, it should be only be called once in the constructor. On Windows, I don't think that isatty(fd) evoles.
 - I don't want to commit the tests because they write 66000 * 2 characters to the test output, which floods the test output. I don't know how to create a fake stdout which is a TTY but not the real stdout, especially on Windows. I think that manual tests only once should be enough. Or does anyone know how to create a fake TTY output?
msg130187 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-06 20:15
Other remarks about test_wconsole_binlarge.patch:
 - the patch doesn't apply on Python 3.3
 - I would prefer 32767 instead of 32000 for the maximum length

Suggestion for the comment in fileio.c:

* Issue #11395: not enough space error (errno 12) on writing
  into stdout in a Windows console if the length is greater than
  66000 bytes. */
msg130189 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-06 20:46
Thanks for the comment. It's my first patch. :-)

> - the patch doesn't apply on Python 3.3

That latest patch file I generated against the tip of 3.1 branch. Should I create two separate patches for 3.1 and 3.2+ (which will apply on 3.3, as well)? Actually, this crash will reproduce on (from my testing) 2.7 with "-u" option on, as well...

>  - I don't want to commit the tests because they write 66000 * 2 characters to the test output, which floods the test output. I don't know how to create a fake stdout which is a TTY but not the real stdout, especially on Windows. I think that manual tests only once should be enough. Or does anyone know how to create a fake TTY output?

I have a few ideas to work around this and still have a unit test...
msg130197 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-06 22:58
Attached a modified patch that should work against 3.2+ heads:

- Added `isatty` bit field in isatty that's evaluated during its
  construction. This should eliminate the need to call `isatty()` on 
  every write.
- Cap buffer length to 32767 (4 * 1024 - 1) when writing to a tty.
- Test this by supplying `CREATE_NEW_CONSOLE` to `subprocess.call`, so 
  we do not flood regrtest's console output.

These changes are conditionally compiled on Windows only.

Should a similar patch be made for 2.7+ (maybe earlier)?
msg130202 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-06 23:41
On Windows, isatty() is a cheap call: a simple lookup in the _ioinfo structure.  And dup2() can still change the destination of a file descriptor, so the new attribute can be out of sync...
I suggest to call isatty() on every write.
msg130212 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-07 00:31
FWIW, here's the Microsoft's source for isatty (in VC\crt\src\isatty.c):

/***
*int _isatty(handle) - check if handle is a device
*
*Purpose:
*       Checks if the given handle is associated with a character device
*       (terminal, console, printer, serial port)
*
*Entry:
*       int handle - handle of file to be tested
*
*Exit:
*       returns non-0 if handle refers to character device,
*       returns 0 otherwise
*
*Exceptions:
*
*******************************************************************************/

int __cdecl _isatty (
        int fh
        )
{
#if defined (_DEBUG) && !defined (_SYSCRT)
        /* make sure we ask debugger only once and cache the answer */
        static int knownHandle = -1;
#endif  /* defined (_DEBUG) && !defined (_SYSCRT) */

        /* see if file handle is valid, otherwise return FALSE */
        _CHECK_FH_RETURN(fh, EBADF, 0);
        _VALIDATE_RETURN((fh >= 0 && (unsigned)fh < (unsigned)_nhandle), EBADF, 0);

#if defined (_DEBUG) && !defined (_SYSCRT)
        if (knownHandle == -1) {
            knownHandle = DebuggerKnownHandle();
        }

        if (knownHandle) {
            return TRUE;
        }
#endif  /* defined (_DEBUG) && !defined (_SYSCRT) */

        /* check file handle database to see if device bit set */
        return (int)(_osfile(fh) & FDEV);
}
msg130214 - (view) Author: Santoso Wijaya (santa4nt) * Date: 2011-03-07 00:45
Attached a version of the last patch without `.isatty` caching.
msg130249 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-07 11:26
I tried to commit io_write.patch, but I had problems with Mercurial :-) I will commit it later.
msg130251 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-03-07 11:59
This last patch looks good, except that the comments "if stdout mode is binary (python -u)" are incorrect: since r87824, all files are opened in binary mode.
msg130252 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-07 12:11
> This last patch looks good, except that the comments "if stdout mode 
> is binary (python -u)" are incorrect: since r87824, all files are 
> opened in binary mode.

I plan to commit the patch to 3.1 and then forward port to 3.2 and 3.3. Yes, I will adapt the comment (remove "(python -u)") for 3.2 and 3.3.
msg131558 - (view) Author: Roundup Robot (python-dev) Date: 2011-03-20 22:38
New changeset 8939a21bdb94 by Victor Stinner in branch '3.2':
Issue #11395: io.FileIO().write() clamps the data length to 32,767 bytes on
http://hg.python.org/cpython/rev/8939a21bdb94

New changeset 4b3472169493 by Victor Stinner in branch 'default':
(merge) Issue #11395: io.FileIO().write() clamps the data length to 32,767
http://hg.python.org/cpython/rev/4b3472169493
msg131561 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-03-20 22:50
I realized that it was a little more difficult to port the fix on 3.1 because 3.1 doesn't have the fix for Windows 64 bits. So I only fixed Python 3.2 and 3.3, also because nobody reported failure for Python 3.1 on Windows with -u flag.

I tested my fix: the test fails without the fix, and it pass correctly with the fix. So let's close the last regression that I introduced in Python 3.2!
msg132284 - (view) Author: David-Sarah Hopwood (davidsarah) Date: 2011-03-27 01:31
If I understand the bug in the Windows console functions correctly, a limit of 32767 bytes might not always be small enough. The problem is that if two or more threads are concurrently using any console functions (which all use the same 64 KiB heap), they could try to allocate up to 32767 bytes plus overhead at the same time, which will fail.

I wasn't able to provoke this by writing to sys.stdout.buffer (maybe there is locking that prevents concurrent writes), but the following code that calls WriteFile directly, does provoke it. GetLastError() returns 8 (ERROR_NOT_ENOUGH_MEMORY; see http://msdn.microsoft.com/en-us/library/ms681382%28v=vs.85%29.aspx), indicating that it's the same bug.


# Warning: this test may DoS your system.

from threading import Thread
import sys
from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
from ctypes.wintypes import BOOL, HANDLE, DWORD, LPVOID, LPCVOID

GetStdHandle = WINFUNCTYPE(HANDLE, DWORD)(("GetStdHandle", windll.kernel32))
WriteFile = WINFUNCTYPE(BOOL, HANDLE, LPCVOID, DWORD, POINTER(DWORD), LPVOID) \
                        (("WriteFile", windll.kernel32))
GetLastError = WINFUNCTYPE(DWORD)(("GetLastError", windll.kernel32))
STD_OUTPUT_HANDLE = DWORD(-11)
INVALID_HANDLE_VALUE = DWORD(-1).value

hStdout = GetStdHandle(STD_OUTPUT_HANDLE)
assert hStdout is not None and hStdout != INVALID_HANDLE_VALUE

L = 32760
data = b'a'*L

def run():
    n = DWORD(0)
    while True:
        ret = WriteFile(hStdout, data, L, byref(n), None)
        if ret == 0 or n.value != L:
            print(ret, n.value, GetLastError())
            sys.exit(1)

[Thread(target=run).start() for i in range(10)]
msg224606 - (view) Author: eryksun (eryksun) Date: 2014-08-03 03:36
The buffer size only needs to be capped if WINVER < 0x602. This issue doesn't apply to Windows 8 since it uses the ConDrv device driver instead of LPC. 

Prior to Windows 8, WriteFile redirects to WriteConsoleA when passed a console handle. This makes an LPC call to conhost.exe (csrss.exe before Windows 7), which copies the buffer to a shared heap. But a Windows 8 console process instead has actual File handles provided by the ConDrv device:

    stdin     \Device\ConDrv\Input
    stdout    \Device\ConDrv\Output
    stderr    \Device\ConDrv\Output

For File handles, ReadFile and WriteFile simply call the NT system functions NtReadFile and NtWriteFile. The buffer size is only limited by available memory.
msg224608 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2014-08-03 06:18
This issue is closed. You should reopen it or open a new one.
History
Date User Action Args
2014-08-03 06:18:10hayposetmessages: + msg224608
2014-08-03 03:36:05eryksunsetnosy: + eryksun
messages: + msg224606
2011-03-27 01:31:40davidsarahsetnosy: + davidsarah
messages: + msg132284
2011-03-20 22:50:41hayposetstatus: open -> closed

messages: + msg131561
resolution: fixed
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt, python-dev
2011-03-20 22:38:14python-devsetnosy: + python-dev
messages: + msg131558
2011-03-07 12:11:47hayposetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130252
2011-03-07 11:59:27amaury.forgeotdarcsetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130251
2011-03-07 11:26:22hayposetfiles: + io_write.patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130249
2011-03-07 00:45:32santa4ntsetfiles: + winconsole_large_py33_direct.patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130214
2011-03-07 00:31:02santa4ntsetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130212
2011-03-06 23:41:31amaury.forgeotdarcsetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130202
2011-03-06 22:58:36santa4ntsetfiles: + winconsole_large_py33.patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130197
2011-03-06 20:46:07santa4ntsetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130189
versions: + Python 3.1, Python 2.7
2011-03-06 20:15:10hayposetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130187
2011-03-06 20:13:10hayposetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130186
2011-03-06 10:08:06hayposetnosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130158
2011-03-06 00:52:41santa4ntsetfiles: + test_wconsole_binlarge.patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130135
2011-03-06 00:20:56santa4ntsetfiles: + wconsole_large.patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130133
2011-03-05 04:35:27santa4ntsetfiles: + test_writeconsole.patch

messages: + msg130100
keywords: + patch
nosy: terry.reedy, amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
2011-03-05 00:45:17terry.reedysetnosy: + terry.reedy
messages: + msg130095
2011-03-04 17:49:35santa4ntsetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
components: + Windows
2011-03-04 12:31:40amaury.forgeotdarcsetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130048
2011-03-04 11:47:47pitrousetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130047
versions: + Python 3.3
2011-03-04 10:11:52hayposetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130045
2011-03-04 10:06:33amaury.forgeotdarcsetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130044
2011-03-04 10:00:51hayposetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130043
2011-03-04 09:50:52amaury.forgeotdarcsetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130042
2011-03-04 09:39:31hayposetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130041
2011-03-04 09:35:14amaury.forgeotdarcsetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130039
2011-03-04 09:33:46hayposetnosy: amaury.forgeotdarc, pitrou, haypo, casevh, neologix, santa4nt
messages: + msg130038
2011-03-04 09:32:13hayposetnosy: + haypo
messages: + msg130037
2011-03-04 09:20:05amaury.forgeotdarcsetnosy: amaury.forgeotdarc, pitrou, casevh, neologix, santa4nt
messages: + msg130036
2011-03-04 08:24:40neologixsetnosy: + neologix
messages: + msg130034
2011-03-04 08:21:51santa4ntsetnosy: + santa4nt
2011-03-04 08:15:12amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg130033
2011-03-04 07:57:30amaury.forgeotdarcsetpriority: normal -> high
nosy: + pitrou
2011-03-04 05:47:42casevhcreate