classification
Title: On Windows sys.stdin.readline() doesn't handle Ctrl-C properly
Type: behavior Stage: needs patch
Components: Windows Versions: Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: tim.golden Nosy List: Drekin, ebarry, eryksun, haypo, tim.golden, troyhirni
Priority: normal Keywords: patch

Created on 2013-07-30 09:13 by Drekin, last changed 2016-10-06 13:59 by Drekin.

Files
File name Uploaded Description Edit
issue18597_3_6_0.patch eryksun, 2016-03-11 12:58 review
Messages (13)
msg193918 - (view) Author: Adam Bartoš (Drekin) * Date: 2013-07-30 09:13
When I run sys.stdin.readline() interactively (on Windows, Python 3.3.2) and hit Ctrl-C, it sometimes returns an empty string just before KeyboardInterrupt is raised. Sometimes KeyboardInterrupt isn't raised at all, and instead, after hitting Return, a strange "SyntaxError: unknown decode error" (on line 0) occurs. It seems the propagation of KeyboardInterrupt is somehow out of sync. sys.stdin.read(n) has the same issue. This may be related to the recently fixed http://bugs.python.org/issue17619, where there was a similar situation with input().
msg193919 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2013-07-30 09:34
The Ctrl-C handling in Python on Windows is a bit strange in places. I'll add this to my list of things to look at. If you'd care to walk through the code to produce a patch, or at least to point to suspect code, that would make it more likely to be fixed.
msg193936 - (view) Author: Adam Bartoš (Drekin) * Date: 2013-07-30 15:17
I have no experience with Python's C code, but I tried to find some clues in it. First, for input(): it calls PyOS_Readline, which may call PyOS_StdioReadline > my_fgets > fgets in Parser/myreadline.c. There is a Windows-related comment on line 56:

“Ctrl-C anywhere on the line or Ctrl-Z if the only character on a line will set ERROR_OPERATION_ABORTED. Under normal circumstances Ctrl-C will also have caused the SIGINT handler to fire which will have set the event object returned by _PyOS_SigintEvent. This signal fires in another thread and is not guaranteed to have occurred before this point in the code. 
Therefore: check whether the event is set with a small timeout. If it is, assume this is a Ctrl-C and reset the event. If it isn't set assume that this is a Ctrl-Z on its own and drop through to check for EOF.”
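
The check described in that comment can be modelled in a few lines of Python. This is only a sketch of the pattern, not the code in Parser/myreadline.c: a threading.Event stands in for the object returned by _PyOS_SigintEvent, and the function name and timeout values are assumptions made for illustration.

```python
import threading


def classify_aborted_read(sigint_event, timeout=0.01):
    """Decide what an aborted console read means.

    Wait briefly for the SIGINT event. If it is set, treat the abort as
    Ctrl-C and reset the event; otherwise assume Ctrl-Z on its own line.
    """
    if sigint_event.wait(timeout):
        sigint_event.clear()  # reset so the next read starts clean
        return "interrupt"
    return "eof"


# Simulate the handler firing on another thread shortly after the read returns.
event = threading.Event()
threading.Timer(0.001, event.set).start()
print(classify_aborted_read(event, timeout=0.5))
print(classify_aborted_read(event, timeout=0.01))
```

The first call should report "interrupt" because the simulated handler fires within the timeout, and the second should report "eof" because the event has been reset and nothing else sets it.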

For sys.stdin.readline and .read: the call goes down the IO machinery from text IO through buffered IO to raw IO (in this case FileIO) in Modules/_io/fileio.c, where it ends up calling the function read(fd, buf, len), probably from <unistd.h>. I don't know how read is implemented on Windows.

I also tried calling ReadConsoleW from the Windows API via ctypes to read Unicode characters from the console (see http://bugs.python.org/issue1602), and a similar issue with Ctrl-C occurred. What seems to work here is to put time.sleep(0.01) after ReadConsoleW.

So the general pattern is as follows: when some low-level Windows function is called to read user input and the user hits Ctrl-C, the function returns and SIGINT is generated. However, it takes time for this signal to arrive. Because it may arrive anywhere in the following code, strange behaviour may occur. In the input() case, enough time has probably passed by the time PyOS_Readline returns, so the added PyErr_CheckSignals() caught that SIGINT/KeyboardInterrupt.

We can find out that Ctrl-C was pressed by calling the Windows API function GetLastError() and testing against ERROR_OPERATION_ABORTED. Then we should wait for the signal.
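
A minimal sketch of that idea using ctypes follows. The helper names settle_after_read and read_console_line, the buffer size, and the 0.01 s grace period are assumptions for illustration. Whether a short sleep is always long enough for the signal to arrive is not established in this thread.

```python
import sys
import time

ERROR_OPERATION_ABORTED = 995  # Win32 error code reported for an aborted read


def settle_after_read(last_error, grace=0.01, sleep=time.sleep):
    """If the read was aborted, pause briefly so a pending
    KeyboardInterrupt surfaces here rather than somewhere later."""
    if last_error == ERROR_OPERATION_ABORTED:
        sleep(grace)
        return True
    return False


def read_console_line(max_chars=1024):
    """Read from the console with ReadConsoleW (Windows only)."""
    if sys.platform != "win32":
        raise OSError("ReadConsoleW is only available on Windows")
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.ReadConsoleW.argtypes = (
        wintypes.HANDLE, wintypes.LPVOID, wintypes.DWORD,
        wintypes.LPDWORD, wintypes.LPVOID)

    handle = kernel32.GetStdHandle(-10)  # STD_INPUT_HANDLE
    buf = ctypes.create_unicode_buffer(max_chars)
    n = wintypes.DWORD()
    ctypes.set_last_error(0)
    ok = kernel32.ReadConsoleW(handle, buf, max_chars, ctypes.byref(n), None)
    error = ctypes.get_last_error()
    settle_after_read(error)  # give KeyboardInterrupt a chance to be raised
    if not ok:
        raise ctypes.WinError(error)
    return buf[:n.value]
```

The decision logic is kept in a separate function so it can be exercised without a console; only read_console_line touches the Windows API.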
msg193938 - (view) Author: Tim Golden (tim.golden) * (Python committer) Date: 2013-07-30 15:28
Thanks for doing the investigation. Yes, that comment was added by me
as part of the fix for issue1677. I'll try to have a look at the
codepath you describe to see if we can add a similar workaround. The
Ctrl-C / SIGINT handling on Windows is less than ideal, I admit.

There was a similar problem in issue18040 which I closed as "won't fix"
since the fix was arguably too intrusive for the extremely unlikely
problem it was fixing. It might be worth seeing if the same root cause
applies, though.
msg196120 - (view) Author: Adam Bartoš (Drekin) * Date: 2013-08-25 11:43
Why are there actually several code paths that can trigger this issue? My naive idea would be that input() is just a thin wrapper around sys.stdout.write() (for the prompt) and sys.stdin.readline(), which leads to sys.stdin.buffer.raw.read*, the place where some low-level OS-dependent function (unistd.read, GNU readline, or whatever) is called to actually get input from the user. That is also the place where the wait for KeyboardInterrupt on Windows should occur.
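
The mental model in this message can be written down as a short function. It illustrates the "thin wrapper" idea only and is not how the built-in input() is implemented; as noted earlier in the thread, input() goes through PyOS_Readline. The name naive_input and its stream parameters are hypothetical.

```python
import io
import sys


def naive_input(prompt="", stdin=None, stdout=None):
    """Write the prompt, then read one line from stdin."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    stdout.write(prompt)
    stdout.flush()
    line = stdin.readline()
    if not line:
        # readline() returns '' only at end of file
        raise EOFError("EOF when reading a line")
    return line[:-1] if line.endswith("\n") else line


out = io.StringIO()
print(naive_input("? ", stdin=io.StringIO("asdf\n"), stdout=out))
```

With a model like this, any Ctrl-C handling would live in a single place below readline(), which is the point the question is making.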
msg224402 - (view) Author: Adam Bartoš (Drekin) * Date: 2014-07-31 12:49
Shouldn't there be a Ctrl-C check in Modules/_io/fileio.c:fileio_read* functions? I think some of these are called by standard sys.stdin.readline().
msg246053 - (view) Author: Eryk Sun (eryksun) * Date: 2015-07-02 01:57
In Windows 10 ReadFile doesn't set ERROR_OPERATION_ABORTED (995) for Ctrl+C when reading console input, but ReadConsole does. 

    >>> from ctypes import *
    >>> kernel32 = WinDLL('kernel32', use_last_error=True)
    >>> buf = (c_char * 1)()
    >>> n = c_uint()
    >>> kernel32.ReadFile(kernel32.GetStdHandle(-10), buf, 1, byref(n), None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyboardInterrupt
    >>> get_last_error()
    0
    >>> kernel32.ReadConsoleA(kernel32.GetStdHandle(-10), buf, 1, byref(n), None)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyboardInterrupt
    >>> get_last_error()
    995

Add this to the list of reasons Python should be using the console API for interactive standard streams. As is, Ctrl+C kills the REPL, since it gets interpreted as EOF. This bug probably applies to Windows 8, too. Could someone check?

Background:
In Windows 7 reading from the console is implemented with a common code path to make an LPC call (NtRequestWaitReplyPort) to the console host process, conhost.exe. This was all completely redesigned for Windows 8, which instead uses the ConDrv device driver. Now ReadFile calls NtReadFile, and ReadConsole calls NtDeviceIoControlFile. When splitting this up they apparently forgot to set ERROR_OPERATION_ABORTED for Ctrl+C in ReadFile.
msg256134 - (view) Author: Troy Hirni (troyhirni) Date: 2015-12-09 03:19
I'm also experiencing this on Windows 8 and 10. In the bare example below, I can Ctrl-C to exit the loop. When I press Enter again, the exception at the bottom appears.

try:
  while True:
    input("? ")
except:
  pass




>>>
>>> try:
...     while True:
...             input("? ")
... except:
...     pass
...
? asdf
'asdf'
? qqwer
'qqwer'
? >>>
  File "<stdin>", line 0

    ^
SyntaxError: decoding with 'cp437' codec failed (KeyboardInterrupt: )
>>>
msg261564 - (view) Author: Eryk Sun (eryksun) * Date: 2016-03-11 12:58
This problem has come up in several issues now (see issue 25376 and issue 26531). I'm adding a patch for Python 3.6 to call ReadConsoleA instead of fgets in PyOS_Readline. This fixes Ctrl+C and EOF handling in Windows 10 for both the interactive shell and the built-in input() function. 

As noted previously, changes to the console in Windows 8 and 10 introduced a bug in ReadFile. It no longer sets the last error to ERROR_OPERATION_ABORTED (995) when a console read is interrupted by Ctrl+C or Ctrl+Break. This is a problem for the current implementation of PyOS_Readline, which calls ReadFile indirectly via C fgets. 

This bug can be avoided by calling ReadConsoleA instead of fgets when stdin is a console. Note that isatty() is insufficient to detect the console, since it's true for all character devices, such as NUL. I instead call GetConsoleMode to check for a console handle.
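
The same console check can be sketched from Python with ctypes. This illustrates the GetConsoleMode test described above and is not the code in the attached patch; the function name is_console is an assumption, and the function simply returns False on non-Windows platforms.

```python
import os
import sys


def is_console(fd):
    """Return True if `fd` refers to a Windows console, else False."""
    if sys.platform != "win32":
        return False  # GetConsoleMode exists only on Windows
    import ctypes
    import msvcrt
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetConsoleMode.argtypes = (wintypes.HANDLE, wintypes.LPDWORD)
    try:
        handle = msvcrt.get_osfhandle(fd)
    except OSError:
        return False  # not a valid file descriptor
    mode = wintypes.DWORD()
    # GetConsoleMode fails for handles that are not console handles
    return bool(kernel32.GetConsoleMode(handle, ctypes.byref(mode)))


with open(os.devnull) as nul:
    print(nul.isatty(), is_console(nul.fileno()))
```

On Windows, where os.devnull is the NUL device, the expectation from the description above is that isatty() reports True while the GetConsoleMode check reports False.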

I'm also looking into modifying Modules/signalmodule.c to set _PyOS_SigintEvent() when SIGBREAK is tripped, which matters when there's a non-default SIGBREAK handler. Also, PyErr_CheckSignals is a logical place to reset the event. Actually, it seems to me that everywhere in signalmodule.c where the signal flag is untripped should reset the event, in which case there should be an untrip_signal() to match trip_signal().
msg261569 - (view) Author: Eryk Sun (eryksun) * Date: 2016-03-11 13:31
Background Discussion

The Windows 10 console uses the condrv.sys device driver, which is set up as follows in the NT namespace:

    C:\>odir \ -r -n con;con*$;cond*
    Directory of \

    Device
        ConDrv <Device>
    Driver
        condrv <Driver>
    GLOBAL??
        CON -> \Device\ConDrv\Console
        CONIN$ -> \Device\ConDrv\CurrentIn
        CONOUT$ -> \Device\ConDrv\CurrentOut

Previously the base console API used an NT LPC port to communicate with the attached console process (i.e. an instance of conhost.exe). There wasn't a real console device. Instead, opening "CON", "CONIN$", or "CONOUT$" was special cased to call OpenConsoleW (undocumented). 

With the new console device driver, opening the DOS "CON" device gets translated to the NT path "\Device\ConDrv\Console", i.e. it opens the file named "Console" on the ConDrv device. 

Opening the Console file returns a handle for a regular kernel File object. To that end, you may have noticed that console handles in Windows 10 are no longer tagged for routing by setting the lower two bits (e.g. 3, 7, 11, etc). For example:

    >>> kernel32.GetStdHandle(STD_INPUT_HANDLE)
    32
    >>> kernel32.DebugBreak()
    (e1c.e20): Break instruction exception - code 80000003 (first chance)
    KERNELBASE!DebugBreak+0x2:
    00007ffa`60280262 cc              int     3

    0:000> !handle 32
    Handle 32
      Type          File

Previously, all operations on console handles were internally routed to special console functions, such as ReadFile => ReadConsoleA. Thus with the old LPC-based console, a ReadFile basically has the behavior of ReadConsoleA (with the addition of special casing input lines that start with Ctrl+Z). 
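
As a small illustration of the tagging mentioned above, the following checks whether the lower two bits of a handle value are set. The function name is hypothetical, and the check reflects only what is stated in this message, not the exact test used inside Windows.

```python
def looks_like_legacy_console_handle(handle):
    """True if the two low bits are set, as in the old values 3, 7, 11."""
    return (handle & 0b11) == 0b11


for h in (3, 7, 11, 32):
    print(h, looks_like_legacy_console_handle(h))
```

The values 3, 7 and 11 pass the check, while the Windows 10 handle value 32 shown above does not.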

The new design scraps a lot of the special-cased code. For example, reading from a console handle in Windows 10 uses a regular NtReadFile system call. So the error it sets, if any at all, depends on translating the NTSTATUS code that's returned by NtReadFile. Let's see what status the console sets here:

    C:\Temp>cdb -xi ld python ccbug.py

    [...]

    ntdll!LdrpDoDebuggerBreak+0x30:
    00007ffb`170de260 cc              int     3
    0:000> g
    3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:54:25)
    [MSC v.1900 64 bit (AMD64)]

    calling DebugBreak...
    (8d0.62c): Break instruction exception - code 80000003 (first chance)
    KERNELBASE!DebugBreak+0x2:
    00007ffb`13f40262 cc              int     3
    0:000> bp ntdll!NtReadFile
    0:000> g
    Breakpoint 0 hit
    ntdll!NtReadFile:
    00007ffb`170b35d0 4c8bd1          mov     r10,rcx
    0:000> pt
    ntdll!NtReadFile+0xa:
    00007ffb`170b35da c3              ret
    0:000> r rax
    rax=0000000000000101

The console weirdly returns a success code, STATUS_ALERTED (0x101, "the delay completed because the thread was alerted"), which is why ReadFile doesn't set an error. STATUS_ALERTED is normally returned when an NT wait function gets alerted by NtAlertThread (note that this is not the same as getting alerted by an asynchronous procedure call). For example:
    
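    import threading
    from ctypes import WinDLL, byref
    from ctypes.wintypes import LARGE_INTEGER
    kernel32, ntdll = WinDLL('kernel32'), WinDLL('ntdll')
    MAXIMUM_ALLOWED = 0x02000000

    tid = threading.get_ident()
    h = kernel32.OpenThread(MAXIMUM_ALLOWED, 0, tid)
    t = threading.Timer(5, ntdll.NtAlertThread, (h,))
    delay = LARGE_INTEGER(10 * -10**7) # 10 seconds (relative, 100 ns units)
    t.start()
    r = ntdll.NtDelayExecution(True, byref(delay))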

    >>> hex(r)
    '0x101'

NtAlertThread is rarely used because WinAPI wait functions (e.g. SleepEx) automatically restart a wait when the underlying NT wait returns STATUS_ALERTED. 

The ReadConsole implementation has always translated STATUS_ALERTED to ERROR_OPERATION_ABORTED. This still exists in the Windows 10 implementation of ReadConsole. However, the correct error status for this case is STATUS_CANCELLED (0xC0000120, "the I/O request was cancelled"): 

    >>> ntdll.RtlNtStatusToDosError(0xC0000120)
    995

Whoever reimplemented the console IPC using a device driver should have updated the console to return STATUS_CANCELLED when an I/O operation is interrupted by Ctrl+C or Ctrl+Break. Then nothing would need to be special cased.
msg278182 - (view) Author: Adam Bartoš (Drekin) * Date: 2016-10-06 10:26
Maybe this was fixed with the recent fix of #1602.
msg278193 - (view) Author: Eryk Sun (eryksun) * Date: 2016-10-06 13:53
Switching to ReadConsoleW in 3.6+ solves the problem with not seeing ERROR_OPERATION_ABORTED in Windows 8+, and with proper handling this potentially solves issues with Ctrl+C handling (when I last checked there were still bugs with this in the 3.6 beta). However, the problem still exists in 2.7 and 3.5, where the only possible solution is to switch to ReadConsoleA. Maybe once the new PyOS_StdioReadline code in 3.6 is stable, it can be backported to 3.5 using ReadConsoleA instead of ReadConsoleW. 2.7 will probably remain broken.
msg278194 - (view) Author: Adam Bartoš (Drekin) * Date: 2016-10-06 13:59
The main reason I have extended the support of win_unicode_console to Python 2.7 was that the related issues won't be fixed there, so using win_unicode_console may fix this as well.
History
Date | User | Action | Args
2016-10-06 13:59:30 | Drekin | set | messages: + msg278194
2016-10-06 13:53:44 | eryksun | set | messages: + msg278193; versions: - Python 3.3, Python 3.4, Python 3.6
2016-10-06 10:26:57 | Drekin | set | messages: + msg278182
2016-03-11 13:31:46 | eryksun | set | messages: + msg261569
2016-03-11 12:58:31 | eryksun | set | files: + issue18597_3_6_0.patch; keywords: + patch; messages: + msg261564
2015-12-09 03:26:07 | ebarry | set | versions: + Python 3.3, Python 3.4, Python 3.6
2015-12-09 03:19:39 | troyhirni | set | nosy: + ebarry, troyhirni; messages: + msg256134; versions: + Python 2.7, - Python 3.3, Python 3.4, Python 3.6
2015-07-02 01:57:02 | eryksun | set | nosy: + eryksun; messages: + msg246053; versions: + Python 3.4, Python 3.5, Python 3.6
2015-06-06 05:21:35 | martin.panter | link | issue14287 superseder
2014-07-31 12:49:19 | Drekin | set | messages: + msg224402
2014-07-30 23:44:55 | haypo | set | nosy: + haypo
2013-08-25 11:43:25 | Drekin | set | messages: + msg196120
2013-07-30 15:28:59 | tim.golden | set | messages: + msg193938
2013-07-30 15:17:58 | Drekin | set | messages: + msg193936
2013-07-30 09:34:05 | tim.golden | set | nosy: + tim.golden; messages: + msg193919; assignee: tim.golden; type: behavior; stage: needs patch
2013-07-30 09:13:12 | Drekin | create |