print(s) fails on Windows with long strings #55604

Closed
casevh mannequin opened this issue Mar 4, 2011 · 36 comments
Labels
OS-windows type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@casevh
Mannequin

casevh mannequin commented Mar 4, 2011

BPO 11395
Nosy @terryjreedy, @amauryfa, @pitrou, @eryksun
Files
  • test_writeconsole.patch
  • wconsole_large.patch
  • test_wconsole_binlarge.patch
  • winconsole_large_py33.patch
  • winconsole_large_py33_direct.patch
  • io_write.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2011-03-20.22:50:41.369>
    created_at = <Date 2011-03-04.05:47:42.980>
    labels = ['OS-windows', 'type-crash']
    title = 'print(s) fails on Windows with long strings'
    updated_at = <Date 2020-04-13.22:30:55.729>
    user = 'https://bugs.python.org/casevh'

    bugs.python.org fields:

    activity = <Date 2020-04-13.22:30:55.729>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-03-20.22:50:41.369>
    closer = 'vstinner'
    components = ['Windows']
    creation = <Date 2011-03-04.05:47:42.980>
    creator = 'casevh'
    dependencies = []
    files = ['21003', '21012', '21013', '21021', '21025', '21030']
    hgrepos = []
    issue_num = 11395
    keywords = ['patch']
    message_count = 36.0
    messages = ['130029', '130033', '130034', '130036', '130037', '130038', '130039', '130041', '130042', '130043', '130044', '130045', '130047', '130048', '130095', '130100', '130133', '130135', '130158', '130186', '130187', '130189', '130197', '130202', '130212', '130214', '130249', '130251', '130252', '131558', '131561', '132284', '224606', '224608', '366237', '366243']
    nosy_count = 10.0
    nosy_names = ['terry.reedy', 'amaury.forgeotdarc', 'pitrou', 'casevh', 'davidsarah', 'neologix', 'santoso.wijaya', 'python-dev', 'eryksun', 'Drekin']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue11395'
    versions = ['Python 3.1', 'Python 2.7', 'Python 3.2', 'Python 3.3']

    @casevh
    Mannequin Author

    casevh mannequin commented Mar 4, 2011

    Python 3.2 fails when printing long strings.

    C:\Python32>python
    Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print("a"*66000)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IOError: [Errno 12] Not enough space
    >>>

    Some observations:

    1. 3.2 on Linux prints just fine.
    2. 2.7.1 and 3.1.3 on Windows x64 are fine
    3. The 32-bit interpreter for 3.2 also fails.
    4. On 32-bit Windows, a length of 62733 works correctly but 62734, and higher, fail.
    5. On 64-bit Windows, the output is visibly corrupted when the length reaches 62801 but the error does not occur until the length reaches 65536.
    6. While experimenting with various lengths, I was able to crash the interpreter once.

    @casevh casevh mannequin added the type-crash label Mar 4, 2011
    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    Likewise, this fails with 3.2::
    import os
    os.write(1, b"a" * 66000)

    @neologix
    Mannequin

    neologix mannequin commented Mar 4, 2011

    It's probably a Windows limitation regarding the number of bytes that can be written to stdout in one write.
    As for the difference between python versions, what does
    python -c "import sys; print(sys.getsizeof('a'))" return ?

    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    no, it works with 3.2b2 (r32b2:87398), and fails with 3.2 final (r32:88445)

    @vstinner
    Member

    vstinner commented Mar 4, 2011

    Extract of issue bpo-1602:
    << WriteConsoleW has one bug that I know of, which is that it fails when writing more than 26608 characters at once (http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1232). That's easy to work around by limiting the amount of data passed in a single call. >>

    I suppose that os.write(1) does indirectly write into the Windows console which has this limit.

    @vstinner
    Member

    vstinner commented Mar 4, 2011

    Extract of the WriteConsole Function:

    << The storage for this buffer is allocated from a shared heap for the process that is 64 KB in size. The maximum size of the buffer will depend on heap usage. >>
    http://msdn.microsoft.com/en-us/library/ms687401(VS.85).aspx

    Ah ah, that's funny, "depend on heap usage" :-)

    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    This changed with r87824

    @vstinner
    Member

    vstinner commented Mar 4, 2011

    This changed with r87824

    Yes, I changed Python to open all files in binary mode. With Python < 3.2, you can open sys.std* streams in binary mode using -u command line option (u like unbuffered, not Unicode ;-)).

    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    Indeed, Python3.1 fails with the -u option.

    Before r87824, the C call to write() performed CRLF conversion. In the implementation of MSVCRT, a local buffer is used (1025 chars in vs8.0, 5*1024 in vs10.0), so WriteFile is called with small sizes.
    Since r87824 (or with -u), no such conversion occurs, and WriteFile is called with the full buffer.
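
To illustrate why text-mode output never reached the limit, here is a rough Python model of a CRLF-translating write that stages its output in a small fixed buffer. It is not MSVCRT's code: the function name is made up, and the 1025-byte default only mirrors the vs8.0 figure quoted above.

```python
def translated_writes(data: bytes, bufsize: int = 1025):
    """Yield the chunks a CRLF-translating write would hand to the OS.

    Hypothetical model: LF becomes CR LF, and the staging buffer is
    flushed whenever it cannot hold the next translated byte sequence.
    """
    buf = bytearray()
    for byte in data:
        piece = b"\r\n" if byte == 0x0A else bytes([byte])
        if len(buf) + len(piece) > bufsize:
            yield bytes(buf)
            buf.clear()
        buf += piece
    if buf:
        yield bytes(buf)


chunks = list(translated_writes(b"a" * 66000 + b"\n"))
largest = max(len(c) for c in chunks)  # never exceeds bufsize
```

Under this model a 66000-character print is delivered as many small writes, which matches the observation that the failure only appears once the translation step is gone.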

    @vstinner
    Member

    vstinner commented Mar 4, 2011

    Anyway, using os.write() to write Unicode into the Windows console is not the right thing to do. We should use WriteConsoleW(): bpo-1602 is the correct fix for this issue.

    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    I'm writing bytes here: os.write(1, b"b" * 66000)
    And WriteConsole has the same issue.

    @vstinner
    Member

    vstinner commented Mar 4, 2011

    And WriteConsole has the same issue.

    print() (sys.stdout and sys.stderr) should use WriteConsoleW() and use small chunks (smaller than 64 KB, I don't know the safest size).
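
A minimal user-level sketch of the "small chunks" idea, using os.write() on bytes rather than WriteConsoleW(). The helper name is hypothetical, and the 32767 default is an assumption taken from the value discussed later in this thread, since the safest size is not known.

```python
import os


def write_in_chunks(fd: int, data: bytes, chunk_size: int = 32767) -> int:
    """Write all of data to fd, passing at most chunk_size bytes per call."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write may accept fewer bytes than requested, so advance by
        # the count it actually returns.
        total += os.write(fd, view[total:total + chunk_size])
    return total
```

For example, write_in_chunks(1, b"a" * 66000) would issue several writes to stdout instead of a single 66000-byte one.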

    @pitrou
    Member

    pitrou commented Mar 4, 2011

    IIUC, this is a Windows bug? Is there any easy workaround for us?

    @amauryfa
    Member

    amauryfa commented Mar 4, 2011

    It may be a Windows bug, but it's also a Python regression!
    A fix is to limit the number of chars:

    ===================================================================

    --- D:/py3k/Modules/_io/fileio.c   (revision 87824)
    +++ D:/py3k/Modules/_io/fileio.c   (copie de travail)
    @@ -712,6 +712,8 @@
             errno = 0;
             len = pbuf.len;
     #if defined(MS_WIN64) || defined(MS_WINDOWS)
    +        if (len > 32000 && isatty(self->fd))
    +            len = 32000;
             if (len > INT_MAX)
                 len = INT_MAX;
             n = write(self->fd, pbuf.buf, (int)len);

    On my system, errors start at ~52200 (why?). I hope that 32K is low enough... MSVCRT's write() (version vs10.0) uses a buffer of 5K.
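
Clamping the length inside the raw write is safe because a raw write may report a short count, and the buffered layer above it retries with the remainder. The sketch below shows that behaviour with a stand-in raw stream; ClampedRaw is a hypothetical class for illustration, not the patched FileIO.

```python
import io


class ClampedRaw(io.RawIOBase):
    """Hypothetical raw stream that accepts at most `limit` bytes per write."""

    def __init__(self, limit=32000):
        super().__init__()
        self.limit = limit
        self.calls = []          # size of each accepted write
        self.sink = bytearray()  # everything written so far

    def writable(self):
        return True

    def write(self, b):
        n = min(len(b), self.limit)  # same idea as clamping len in fileio.c
        self.sink += bytes(b[:n])
        self.calls.append(n)
        return n                     # a short count makes the caller retry


raw = ClampedRaw()
buffered = io.BufferedWriter(raw)
buffered.write(b"a" * 66000)
buffered.flush()
```

After the flush, raw.sink holds all 66000 bytes even though no single call accepted more than 32000.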

    @santosowijaya santosowijaya mannequin added the OS-windows label Mar 4, 2011
    @terryjreedy
    Member

    print("a"*66000)

    works (after some delay) running from an IDLE edit window (but see bpo-144249). Works means that I get a working prompt back with no errors.

    Unlike IDLE, the Command Prompt window keeps a limited number of lines in its buffers (default: 4x50), and may have a total size limit.

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 5, 2011

    I'm adding a test that will reproduce the crash.

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 6, 2011

    And a patch for the test + fix.

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 6, 2011

    Indeed, Python3.1 fails with the -u option.

    I'm also attaching another test to reproduce the crash with '-u' option.

    @vstinner
    Member

    vstinner commented Mar 6, 2011

    I did some tests: os.write(1, b'X'*length) always fails with length >= 63842. It sometimes fails with length > 35000. The maximum looks completely random: as written in the Microsoft documentation, "The maximum size of the buffer will depend on heap usage"...

    32000 looks arbitrary. I would prefer 2^15-1 (32767) because it looks less magical :-)

    @vstinner
    Member

    vstinner commented Mar 6, 2011

    Remarks about test_wconsole_binlarge.patch:

    • I don't know if isatty() is cheap or not. Is it a system call? If it might be slow, it should be called only once, in the constructor. On Windows, I don't think that isatty(fd) evolves.
    • I don't want to commit the tests because they write 66000 * 2 characters to the test output, which floods the test output. I don't know how to create a fake stdout which is a TTY but not the real stdout, especially on Windows. I think that manual tests only once should be enough. Or does anyone know how to create a fake TTY output?

    @vstinner
    Member

    vstinner commented Mar 6, 2011

    Other remarks about test_wconsole_binlarge.patch:

    • the patch doesn't apply on Python 3.3
    • I would prefer 32767 instead of 32000 for the maximum length

    Suggestion for the comment in fileio.c:

    /* Issue bpo-11395: not enough space error (errno 12) on writing
       into stdout in a Windows console if the length is greater than
       66000 bytes. */

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 6, 2011

    Thanks for the comment. It's my first patch. :-)

    • the patch doesn't apply on Python 3.3

    That latest patch file I generated against the tip of 3.1 branch. Should I create two separate patches for 3.1 and 3.2+ (which will apply on 3.3, as well)? Actually, this crash will reproduce on (from my testing) 2.7 with "-u" option on, as well...

    • I don't want to commit the tests because they write 66000 * 2 characters to the test output, which floods the test output. I don't know how to create a fake stdout which is a TTY but not the real stdout, especially on Windows. I think that manual tests only once should be enough. Or does anyone know how to create a fake TTY output?

    I have a few ideas to work around this and still have a unit test...

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 6, 2011

    Attached a modified patch that should work against 3.2+ heads:

    • Added an isatty bit field to the fileio object that's evaluated during
      its construction. This should eliminate the need to call isatty() on
      every write.
    • Cap buffer length to 32767 (32 * 1024 - 1) when writing to a tty.
    • Test this by supplying CREATE_NEW_CONSOLE to subprocess.call, so
      we do not flood regrtest's console output.

    These changes are conditionally compiled on Windows only.

    Should a similar patch be made for 2.7+ (maybe earlier)?
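
The CREATE_NEW_CONSOLE approach described in the list above can be sketched roughly as follows. This is not the test from the attached patch. The helper name is made up, and the function is a no-op outside Windows, where the flag does not exist.

```python
import subprocess
import sys


def run_in_new_console(code: str):
    """Run `code` in a child interpreter that owns a fresh console.

    Returns the child's exit status, or None on non-Windows platforms.
    """
    flag = getattr(subprocess, "CREATE_NEW_CONSOLE", None)
    if sys.platform != "win32" or flag is None:
        return None
    # stdout is deliberately not captured: redirecting it to a pipe would
    # bypass the console and hide the bug under test.
    return subprocess.call([sys.executable, "-c", code], creationflags=flag)


status = run_in_new_console("print('a' * 66000)")
```

A non-zero status on Windows would indicate that the child failed while printing. The parent's own console output stays untouched either way.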

    @amauryfa
    Member

    amauryfa commented Mar 6, 2011

    On Windows, isatty() is a cheap call: a simple lookup in the _ioinfo structure. And dup2() can still change the destination of a file descriptor, so the new attribute can be out of sync...
    I suggest calling isatty() on every write.
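
The dup2() concern can be reproduced from Python. In this sketch (the function name is hypothetical) a value cached at "construction" time is compared with a fresh isatty() call after the descriptor has been retargeted at a regular file.

```python
import os
import tempfile


def cached_vs_live_isatty():
    """Return (cached, live) isatty results around a dup2() retarget."""
    # Work on a duplicate of stdout so the real descriptor is untouched.
    fd = os.dup(1)
    try:
        cached = os.isatty(fd)         # what a constructor-time cache stores
        with tempfile.TemporaryFile() as tmp:
            os.dup2(tmp.fileno(), fd)  # fd now refers to a regular file
            live = os.isatty(fd)       # a fresh check sees the new target
        return cached, live
    finally:
        os.close(fd)


cached, live = cached_vs_live_isatty()
```

When run from a console, cached is True while live is False, which is exactly the stale state a cached attribute would carry.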

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 7, 2011

    FWIW, here's the Microsoft's source for isatty (in VC\crt\src\isatty.c):

    /***
    *int _isatty(handle) - check if handle is a device
    *
    *Purpose:
    *       Checks if the given handle is associated with a character device
    *       (terminal, console, printer, serial port)
    *
    *Entry:
    *       int handle - handle of file to be tested
    *
    *Exit:
    *       returns non-0 if handle refers to character device,
    *       returns 0 otherwise
    *
    *Exceptions:
    *
    *******************************************************************************/

    int __cdecl _isatty (
            int fh
            )
    {
    #if defined (_DEBUG) && !defined (_SYSCRT)
            /* make sure we ask debugger only once and cache the answer */
            static int knownHandle = -1;
    #endif  /* defined (_DEBUG) && !defined (_SYSCRT) */

            /* see if file handle is valid, otherwise return FALSE */
            _CHECK_FH_RETURN(fh, EBADF, 0);
            _VALIDATE_RETURN((fh >= 0 && (unsigned)fh < (unsigned)_nhandle), EBADF, 0);

    #if defined (_DEBUG) && !defined (_SYSCRT)
            if (knownHandle == -1) {
                knownHandle = DebuggerKnownHandle();
            }

            if (knownHandle) {
                return TRUE;
            }
    #endif  /* defined (_DEBUG) && !defined (_SYSCRT) */

            /* check file handle database to see if device bit set */
            return (int)(_osfile(fh) & FDEV);
    }

    @santosowijaya
    Mannequin

    santosowijaya mannequin commented Mar 7, 2011

    Attached a version of the last patch without .isatty caching.

    @vstinner
    Member

    vstinner commented Mar 7, 2011

    I tried to commit io_write.patch, but I had problems with Mercurial :-) I will commit it later.

    @amauryfa
    Member

    amauryfa commented Mar 7, 2011

    This last patch looks good, except that the comments "if stdout mode is binary (python -u)" are incorrect: since r87824, all files are opened in binary mode.

    @vstinner
    Member

    vstinner commented Mar 7, 2011

    This last patch looks good, except that the comments "if stdout mode
    is binary (python -u)" are incorrect: since r87824, all files are
    opened in binary mode.

    I plan to commit the patch to 3.1 and then forward port to 3.2 and 3.3. Yes, I will adapt the comment (remove "(python -u)") for 3.2 and 3.3.

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 20, 2011

    New changeset 8939a21bdb94 by Victor Stinner in branch '3.2':
    Issue bpo-11395: io.FileIO().write() clamps the data length to 32,767 bytes on
    http://hg.python.org/cpython/rev/8939a21bdb94

    New changeset 4b3472169493 by Victor Stinner in branch 'default':
    (merge) Issue bpo-11395: io.FileIO().write() clamps the data length to 32,767
    http://hg.python.org/cpython/rev/4b3472169493

    @vstinner
    Member

    I realized that it was a little more difficult to port the fix to 3.1 because 3.1 doesn't have the fix for 64-bit Windows. So I only fixed Python 3.2 and 3.3, also because nobody reported a failure for Python 3.1 on Windows with the -u flag.

    I tested my fix: the test fails without the fix, and it passes with the fix. So let's close the last regression that I introduced in Python 3.2!

    @davidsarah
    Mannequin

    davidsarah mannequin commented Mar 27, 2011

    If I understand the bug in the Windows console functions correctly, a limit of 32767 bytes might not always be small enough. The problem is that if two or more threads are concurrently using any console functions (which all use the same 64 KiB heap), they could try to allocate up to 32767 bytes plus overhead at the same time, which will fail.

    I wasn't able to provoke this by writing to sys.stdout.buffer (maybe there is locking that prevents concurrent writes), but the following code that calls WriteFile directly, does provoke it. GetLastError() returns 8 (ERROR_NOT_ENOUGH_MEMORY; see http://msdn.microsoft.com/en-us/library/ms681382%28v=vs.85%29.aspx), indicating that it's the same bug.

    # Warning: this test may DoS your system.

    from threading import Thread
    import sys
    from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
    from ctypes.wintypes import BOOL, HANDLE, DWORD, LPVOID, LPCVOID
    
    GetStdHandle = WINFUNCTYPE(HANDLE, DWORD)(("GetStdHandle", windll.kernel32))
    WriteFile = WINFUNCTYPE(BOOL, HANDLE, LPCVOID, DWORD, POINTER(DWORD), LPVOID) \
                            (("WriteFile", windll.kernel32))
    GetLastError = WINFUNCTYPE(DWORD)(("GetLastError", windll.kernel32))
    STD_OUTPUT_HANDLE = DWORD(-11)
    INVALID_HANDLE_VALUE = DWORD(-1).value
    
    hStdout = GetStdHandle(STD_OUTPUT_HANDLE)
    assert hStdout is not None and hStdout != INVALID_HANDLE_VALUE
    
    L = 32760
    data = b'a'*L
    
    def run():
        n = DWORD(0)
        while True:
            ret = WriteFile(hStdout, data, L, byref(n), None)
            if ret == 0 or n.value != L:
                print(ret, n.value, GetLastError())
                sys.exit(1)

    [Thread(target=run).start() for i in range(10)]
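
The arithmetic behind the concurrency concern can be checked directly. This is only a back-of-the-envelope sketch: it assumes every writer's buffer is resident in the 64 KiB shared heap at the same moment and ignores the unknown per-allocation overhead.

```python
HEAP_SIZE = 64 * 1024  # shared console heap size quoted from the WriteConsole docs
CAP = 32767            # per-write clamp applied by the fix


def headroom(writers: int, bytes_per_write: int = CAP) -> int:
    """Bytes left in the heap if every writer's buffer were resident at once."""
    return HEAP_SIZE - writers * bytes_per_write


one = headroom(1)    # 32769
two = headroom(2)    # 2
three = headroom(3)  # -32765
```

Two concurrent writers at the cap leave only 2 bytes for overhead, and a third cannot fit at all, which is consistent with the failures the ten-thread script provokes.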

    @eryksun
    Contributor

    eryksun commented Aug 3, 2014

    The buffer size only needs to be capped if WINVER < 0x602. This issue doesn't apply to Windows 8 since it uses the ConDrv device driver instead of LPC.

    Prior to Windows 8, WriteFile redirects to WriteConsoleA when passed a console handle. This makes an LPC call to conhost.exe (csrss.exe before Windows 7), which copies the buffer to a shared heap. But a Windows 8 console process instead has actual File handles provided by the ConDrv device:

    stdin     \Device\ConDrv\Input
    stdout    \Device\ConDrv\Output
    stderr    \Device\ConDrv\Output
    

    For File handles, ReadFile and WriteFile simply call the NT system functions NtReadFile and NtWriteFile. The buffer size is only limited by available memory.
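
The WINVER test above is a compile-time check in C. A runtime analogue in Python might look like the sketch below, where both function names are hypothetical and Windows 8 is taken to be NT version 6.2.

```python
import sys


def console_write_needs_cap(version) -> bool:
    """Return True if (major, minor) predates Windows 8, i.e. NT 6.2."""
    return tuple(version[:2]) < (6, 2)


def current_platform_needs_cap() -> bool:
    """Apply the check to the running system; always False off Windows."""
    if sys.platform != "win32":
        return False
    v = sys.getwindowsversion()
    return console_write_needs_cap((v.major, v.minor))
```

For instance, console_write_needs_cap((6, 1)) is True for Windows 7, while (6, 2) and (10, 0) both give False.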

    @vstinner
    Member

    vstinner commented Aug 3, 2014

    This issue is closed. You should reopen it or open a new one.

    @Drekin
    Mannequin

    Drekin mannequin commented Apr 12, 2020

    I've been hit by this issue recently. On my configuration, print("a" * 10215) fails with an infinite loop of OSErrors (WinError 8). It cannot even be interrupted with Ctrl-C, nor can the exception be caught.

    • print("a" * 10214) is fine
    • print("a" * 10215) is fine when preceded by print("b" * 2701), but not when preceded by print("b" * 2700)
    • the problem (or at least with these numbers) occurs only when the code is saved in a script, and this is run by double-clicking the file (i.e. run by Windows ShellExecute I guess), not by "py test.py" or interactively.

    My configuration is Python 3.7.3 64 bit on Windows Vista 64 bit. I wonder if anyone is able to reproduce this on their configuration.

    @eryksun
    Contributor

    eryksun commented Apr 12, 2020

    the problem (or at least with these numbers) occurs only when
    the code is saved in a script, and this is run by double-
    clicking the file

    The .py file association is probably using a different version of Python, or it's associated with the py launcher and there's a shebang in the file that runs 2.7. Verify the version that's executing the script. If it's running in 3.7, follow up in a new issue or on python-list. Please do not follow up on this resolved issue. Also, please try to reproduce the issue on a supported, updated system. Windows Vista is not supported by 3.7, and the latest 3.7 release is 3.7.7.
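
One way to verify which interpreter a double-clicked script actually runs under is to have the script report it. A minimal sketch, with a hypothetical helper name:

```python
import sys


def describe_interpreter() -> str:
    """Return the path and version of the interpreter running this script."""
    return f"{sys.executable}\n{sys.version}"


print(describe_interpreter())
```

Placing this at the top of the script and comparing the output between a double-click run and a "py test.py" run would show whether the file association picks a different Python.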

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022