classification
Title: SIGINT blocked by socket operations like recv on Windows
Type: behavior
Stage:
Components: Windows
Versions: Python 3.8

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, paul.moore, steve.dower, tim.golden, zach.ware, zmwangx
Priority: normal
Keywords:

Created on 2020-07-29 18:13 by zmwangx, last changed 2020-07-29 22:20 by eryksun.

Files
File name Uploaded Description
socket_sigint_sigbreak.py zmwangx, 2020-07-29 18:13
Messages (2)
msg374580 - Author: Zhiming Wang (zmwangx) Date: 2020-07-29 18:13
I noticed that on Windows, socket operations like recv appear to always block SIGINT until they are done, so if a recv hangs, Ctrl+C cannot interrupt the program. (I'm a *nix developer investigating a behavioral problem of my program on Windows, so please excuse my limited knowledge of Windows.)

Consider the following example, where I spawn a TCP server in a separate thread that stalls each connection for 5 seconds, and use a client to connect to it on the main thread. I then try to interrupt the client with Ctrl+C.

    import socket
    import socketserver
    import time
    import threading


    interrupted = threading.Event()


    class HoneypotServer(socketserver.TCPServer):
        # Stall each connection for 5 seconds.
        def get_request(self):
            start = time.time()
            while time.time() - start < 5 and not interrupted.is_set():
                time.sleep(0.1)
            return self.socket.accept()


    class EchoHandler(socketserver.BaseRequestHandler):
        def handle(self):
            data = self.request.recv(1024)
            self.request.sendall(data)


    class HoneypotServerThread(threading.Thread):
        def __init__(self):
            super().__init__()
            self.server = HoneypotServer(("127.0.0.1", 0), EchoHandler)

        def run(self):
            self.server.serve_forever(poll_interval=0.1)


    def main():
        start = time.time()
        server_thread = HoneypotServerThread()
        server_thread.start()
        sock = socket.create_connection(server_thread.server.server_address)
        try:
            sock.sendall(b"hello")
            sock.recv(1024)
        except KeyboardInterrupt:
            print(f"processed SIGINT {time.time() - start:.3f}s into the program")
            interrupted.set()
        finally:
            sock.close()
            server_thread.server.shutdown()
            server_thread.join()


    if __name__ == "__main__":
        main()

On *nix systems the KeyboardInterrupt is processed immediately. On Windows, the KeyboardInterrupt is always processed more than 5 seconds into the program, when the recv is finished.

I suppose this is a fundamental limitation of Windows? Is there any workaround (other than going asyncio)?
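One commonly used workaround that stays with blocking-style code is to poll with a short socket timeout, so that control returns to the interpreter at regular intervals and a pending KeyboardInterrupt gets a chance to be raised. The following is a minimal sketch under that assumption; the helper name recv_with_polling and the poll_interval parameter are hypothetical and not part of the socket module, and the behavior on Windows is assumed here rather than confirmed in this report.

```python
import socket


def recv_with_polling(sock, bufsize, poll_interval=0.2):
    """Receive from sock, waking up every poll_interval seconds.

    Hypothetical helper, not part of the socket module. Each wake-up
    returns control to the interpreter, which gives a pending
    KeyboardInterrupt a chance to be raised in the main thread.
    """
    previous_timeout = sock.gettimeout()
    sock.settimeout(poll_interval)
    try:
        while True:
            try:
                return sock.recv(bufsize)
            except socket.timeout:
                # No data arrived within poll_interval; wait again.
                continue
    finally:
        # Restore whatever timeout the caller had configured.
        sock.settimeout(previous_timeout)
```

In the sample program this would replace the bare sock.recv(1024) call with recv_with_polling(sock, 1024). The trade-off is that Ctrl+C latency is bounded by poll_interval rather than being immediate.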

Btw, I learned about SIGBREAK, which when unhandled seems to kill the process immediately, but that means no chance of cleanup. I tried to handle SIGBREAK but whenever a signal handler is installed, the behavior reverts to that of SIGINT -- the handler is called only after 5 seconds have passed.

(I'm attaching a socket_sigint_sigbreak.py which is a slightly expanded version of my sample program above, showing my attempt at handling SIGBREAK. Both

    python .\socket_sigint_sigbreak.py --sigbreak-handler interrupt

and

    python .\socket_sigint_sigbreak.py --sigbreak-handler exit

stall for 5 seconds.)
msg374590 - Author: Eryk Sun (eryksun) (Python triager) Date: 2020-07-29 22:20
Winsock is inherently asynchronous. It implements synchronous functions by using an alertable wait for the completion of an asynchronous I/O request. Python doesn't implement anything for a console Ctrl+C event to alert the main thread when it's blocked in an alertable wait. NTAPI NtAlertThread will alert a thread in this case, but it won't help here because Winsock just rewaits when alerted.

You need a user-mode asynchronous procedure call (APC) to make the waiting thread cancel all of its pended I/O request packets (IRPs) for the given file (socket) handle. Specifically, open a handle to the thread, and call QueueUserAPC to queue an APC to the thread that calls WinAPI CancelIo on the file handle. (I don't suggest using the newer CancelIoEx function from an arbitrary thread context in this case. It would be simpler than queuing an APC to the target thread, but you don't have an OVERLAPPED record to cancel a specific IRP, so it would cancel IRPs for all threads.)

Here's a context manager that temporarily sets a Ctrl+C handler that implements the above suggestion:

    import ctypes
    import threading
    import contextlib

    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    CTRL_C_EVENT = 0
    THREAD_SET_CONTEXT = 0x0010

    @contextlib.contextmanager
    def ctrl_cancel_async_io(file_handle):
        apc_sync_event = threading.Event()
        hthread = kernel32.OpenThread(THREAD_SET_CONTEXT, False,
            kernel32.GetCurrentThreadId())
        if not hthread:
            raise ctypes.WinError(ctypes.get_last_error())

        @ctypes.WINFUNCTYPE(None, ctypes.c_void_p)
        def apc_cancel_io(ignored):
            kernel32.CancelIo(file_handle)
            apc_sync_event.set()

        @ctypes.WINFUNCTYPE(ctypes.c_uint, ctypes.c_uint)
        def ctrl_handler(ctrl_event):
            # For a Ctrl+C cancel event, queue an async procedure call
            # to the target thread that cancels pending async I/O for
            # the given file handle.
            if ctrl_event == CTRL_C_EVENT:
                kernel32.QueueUserAPC(apc_cancel_io, hthread, None)
                # Synchronize here in case the APC was queued to the
                # main thread, else apc_cancel_io might get interrupted
                # by a KeyboardInterrupt.
                apc_sync_event.wait()
            return False # chain to next handler

        try:
            kernel32.SetConsoleCtrlHandler(ctrl_handler, True)
            yield
        finally:
            kernel32.SetConsoleCtrlHandler(ctrl_handler, False)
            kernel32.CloseHandle(hthread)


Use it as follows in your sample code:

    with ctrl_cancel_async_io(sock.fileno()):
        sock.sendall(b"hello")
        sock.recv(1024)

Note that this requires the value of sock.fileno() to be an NT kernel handle for a file opened in asynchronous mode. This is the case for a socket.

HTH
History
Date                 User     Action  Args
2020-07-29 22:20:55  eryksun  set     nosy: + eryksun; messages: + msg374590
2020-07-29 18:13:12  zmwangx  create