
Author vstinner
Recipients njs, pitrou, vstinner
Date 2018-07-18.10:09:54
Message-id <1531908594.57.0.56676864532.issue34130@psf.upfronthosting.co.za>
In-reply-to
Content
On Windows 10, when I run "python -m test test_signal -v -F" in 3 terminals in parallel, sometimes test_socket() fails.

I debugged test_socket() and confirmed that, when the bug occurs, the C signal handler called send() and the signal byte (the signal number) was sent properly (send() returns 1). To check this, I added an internal "static size_t written = 0;" variable which is incremented when send() succeeds, and a signal.get_written() method to read the value after read.recv(1) fails in the unit test. Still, recv() fails with BlockingIOError, as if the single byte had not been transferred from the write end to the read end of the TCP socket pair.

On Windows, a socket pair is a pair of two TCP sockets connected over the loopback interface (IPv4 127.0.0.1). It's not a UNIX socket pair as on Linux.

Questions.

(*) Is the socket pair properly connected when the bug occurs? socket.socketpair() sets the socket to non-blocking mode before calling connect(), and ignores BlockingIOError and InterruptedError (EINTR) if connect() raises them. I modified the Windows implementation of socketpair() (in socket.py) to use a blocking call, and the bug still occurs when connect() is made in blocking mode. I also checked that csock.getpeername() == lsock.getsockname(). So the socket is connected.
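For reference, here is a minimal sketch of a TCP-based socket pair built with a blocking connect(), including the getpeername()/getsockname() check described above. It is a simplified illustration, not the actual socket.py code: the function name tcp_socketpair() is hypothetical, and only the variable names lsock and csock follow the message.

```python
import socket


def tcp_socketpair(host="127.0.0.1"):
    """Build a connected pair of TCP sockets over the loopback interface.

    Hypothetical helper for illustration; connect() is done in blocking mode.
    """
    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        lsock.bind((host, 0))        # port 0: let the OS pick a free port
        lsock.listen()
        addr = lsock.getsockname()
        csock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            csock.connect(addr)      # blocking connect, completes via the backlog
            ssock, _ = lsock.accept()
            # Same sanity check as in the message: the client must be
            # connected to the listening address.
            if csock.getpeername() != lsock.getsockname():
                ssock.close()
                raise ConnectionError("unexpected peer address")
        except BaseException:
            csock.close()
            raise
    finally:
        lsock.close()
    return ssock, csock


if __name__ == "__main__":
    a, b = tcp_socketpair()
    try:
        a.sendall(b"x")
        print(b.recv(1))
    finally:
        a.close()
        b.close()
```

With a blocking connect(), the connection is fully established before accept() returns, so any lost byte cannot be blamed on a half-finished connect.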

(*) Does the socket pair use buffering somewhere? Both sockets are set to non-blocking mode by test_signal.test_socket(). By default, SO_SNDBUF is 64 KiB (65 536 bytes) for the write end of the socket pair, and SO_RCVBUF is 64 KiB (65 536 bytes) for the read end of the socket pair.
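The buffer sizes can be read back with getsockopt(). The snippet below is a small sketch of how to do that on a socket pair; the helper name buffer_sizes() is hypothetical, and the values reported depend on the platform (the 64 KiB figures above are what I see on Windows 10).

```python
import socket


def buffer_sizes(sock):
    """Return (SO_SNDBUF, SO_RCVBUF) for a socket, in bytes."""
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    read, write = socket.socketpair()
    try:
        read.setblocking(False)
        write.setblocking(False)
        print("write end SO_SNDBUF:", buffer_sizes(write)[0])
        print("read end SO_RCVBUF:", buffer_sizes(read)[1])
    finally:
        read.close()
        write.close()
```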

It *seems* like adding the following call to test_socket() works around the bug:

write.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 0)

I'm not sure if it's "correct" to use SO_SNDBUF=0. I'm not sure that SO_SNDBUF=0 really *fixes* the bug: it's a race condition, so it's hard to check if it's really fixed or not.
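Since the failure is intermittent, one way to compare the two configurations is to repeat the wakeup round trip many times and count how often the byte is missing. The following is only a sketch of that idea, not the code of test_socket(): the function name count_lost_wakeups() is hypothetical, it assumes Python 3.8+ for signal.raise_signal() (the test itself uses a private helper), and it must run in the main thread.

```python
import signal
import socket


def count_lost_wakeups(iterations=1000, sndbuf=None, signum=signal.SIGINT):
    """Count iterations where the wakeup byte is not readable right after
    the signal was raised. Pass sndbuf=0 to try the SO_SNDBUF workaround."""
    read, write = socket.socketpair()
    read.setblocking(False)
    write.setblocking(False)
    if sndbuf is not None:
        write.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)

    # Install a no-op Python handler so SIGINT does not raise KeyboardInterrupt.
    old_handler = signal.signal(signum, lambda signum, frame: None)
    old_fd = signal.set_wakeup_fd(write.fileno())
    lost = 0
    try:
        for _ in range(iterations):
            signal.raise_signal(signum)  # C handler writes the signal number
            try:
                read.recv(1)
            except BlockingIOError:
                lost += 1                # byte not (yet) visible on the read end
    finally:
        signal.set_wakeup_fd(old_fd)
        signal.signal(signum, old_handler)
        read.close()
        write.close()
    return lost


if __name__ == "__main__":
    print("default SO_SNDBUF:", count_lost_wakeups())
    print("SO_SNDBUF=0:", count_lost_wakeups(sndbuf=0))
```

A count of zero with SO_SNDBUF=0 would not prove that the race is gone, only that it was not observed in that run; a byte that arrives late is also picked up by a later recv(), so the count is a rough indicator.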
History
Date                 User      Action  Args
2018-07-18 10:09:54  vstinner  set     recipients: + vstinner, pitrou, njs
2018-07-18 10:09:54  vstinner  set     messageid: <1531908594.57.0.56676864532.issue34130@psf.upfronthosting.co.za>
2018-07-18 10:09:54  vstinner  link    issue34130 messages
2018-07-18 10:09:54  vstinner  create