This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Socket state corrupts when original socket object goes out of scope in a different thread
Type: behavior Stage:
Components: Documentation Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: JoshN, docs@python, josh.r, martin.panter, pitrou
Priority: normal Keywords:

Created on 2016-04-06 18:07 by JoshN, last changed 2022-04-11 14:58 by admin.

Messages (10)
msg262951 - (view) Author: (JoshN) Date: 2016-04-06 18:07
Creating a socket in one thread and sharing it with another will cause the socket to corrupt as soon as the thread it was created in exits.

Example code:
import socket, threading, time, os

def start():
    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    a.bind(("", 8080))
    a.set_inheritable(True)

    thread = threading.Thread(target=abc, args=(a.fileno(),))
    thread.start()

    time.sleep(2)
    print("Main thread exiting, socket is still valid: " + str(a) + "\n")

def abc(b):
    sock = socket.socket(fileno=b)
    for _ in range(3):
        print("Passed as an argument:" + str(sock) + "\n=====================")

        time.sleep(1.1)

start()

Note that once the main thread exits, the socket object in the second thread does not report itself as closed, and its fd is not -1. Doing anything with this corrupted object raises WinError 10038 ('operation performed on something that is not a socket').

I should note that the main thread exiting doesn't seem to be the cause, it is the original object containing the socket going out of scope that causes the socket to become corrupted.

-JoshN
msg262957 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-04-06 19:30
You used the `fileno` based initialization in the child, which creates a wrapper around the same file descriptor without duplicating it, so when the first socket disappears, that file descriptor becomes invalid.

I think this is a doc bug more than a behavior bug. The docs say "Unlike socket.fromfd(), fileno will return the same socket and not a duplicate." That almost reads as if the Python level socket object it returns is cached in some way that allows it to be looked up by file descriptor (making mysock2 = socket.socket(fileno=mysock.fileno()) equivalent to mysock2 = mysock). What it really means is that there are two Python level socket objects referencing the same C level file descriptor. The normal cleanup behavior still applies, so the first Python level socket object to be destroyed also closes the file descriptor, leaving the other socket object in a broken state.

The correct approach to this would be to just pass the socket object to the thread directly, or pass along the address family and type and use socket.fromfd (which dups the underlying file descriptor).
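
A minimal sketch of the socket.fromfd() variant, for illustration only. The names share_with_fromfd and worker, the two Event objects, and the loopback address with port 0 are assumptions made for this example and are not part of the original report. The worker duplicates the descriptor before the original socket is closed, so its copy should stay usable afterwards.

```python
import socket
import threading


def worker(fd, family, type_, duplicated, original_closed, results):
    # fromfd() duplicates the descriptor, so this object owns its own copy.
    sock = socket.fromfd(fd, family, type_)
    duplicated.set()
    # Wait until the creator has closed its own socket object.
    original_closed.wait()
    try:
        results.append(sock.getsockname())
    finally:
        sock.close()


def share_with_fromfd():
    duplicated = threading.Event()
    original_closed = threading.Event()
    results = []

    a = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    a.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    expected = a.getsockname()

    thread = threading.Thread(
        target=worker,
        args=(a.fileno(), a.family, a.type, duplicated, original_closed, results),
    )
    thread.start()

    # The original must stay open until the worker has duplicated the fd.
    duplicated.wait()
    a.close()
    original_closed.set()
    thread.join()
    return expected, results[0]


expected, seen = share_with_fromfd()
```

Passing the socket object itself as the thread argument avoids the hand-off entirely and is usually the simpler choice.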
msg262958 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-04-06 19:32
For source reference, the behavior for this case is to just copy out the file descriptor and stick it in a new socket object ( https://hg.python.org/cpython/file/3.5/Modules/socketmodule.c#l4289 ); no work is being done to somehow collaboratively manage the file descriptor to ensure it remains alive for the life of the socket object you're creating.
msg262968 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-06 22:12
The documentation already says “Sockets are automatically closed when they are garbage-collected”. If for some reason you want to release a socket object but keep the file descriptor open, I suggest socket.detach(). Otherwise, pass the original socket, not the fileno.
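
As a rough illustration of the detach() suggestion (the variable names and the loopback address are assumptions made for this sketch): detach() hands back the file descriptor without closing it, and the original object stops owning it, so a second object can adopt the descriptor safely.

```python
import socket

original = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
original.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
address = original.getsockname()

# detach() returns the descriptor and leaves the OS socket open.
fd = original.detach()
detached_fileno = original.fileno()  # the old object no longer refers to the fd

# A new object adopts the descriptor; family and type are passed explicitly.
adopted = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=fd)
adopted_address = adopted.getsockname()
adopted.close()
```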

I think this is at best a documentation issue, if you have any suggestions.
msg262969 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-06 22:15
Also, if you enable warnings (e.g. python -Wall), you should see that the socket is being closed:

-c:22: ResourceWarning: unclosed <socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 8080)>
msg262973 - (view) Author: (JoshN) Date: 2016-04-07 00:32
I do understand that the docs are a bit strange on the issue. For example, actually testing the line you referenced ("...fileno will return the same socket and not a duplicate.") by creating 2 sockets and testing sameness with the 'is' operator returns false.

I tried to trim the example code as much as possible - I did test disabling the garbage collector, playing with inheritance, etc, but trimmed them out as they didn't have any effect on my system.


I think my main issue was, when this occurs, the socket 'breaks' as you mentioned instead of closing. Was almost sure it was a bug. Using detach works for this UDP example, but I wasn't sure if detaching the socket actually closes it (e.g. in a stream oriented connection).

So this is considered normal behavior then?
msg262976 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-04-07 02:24
Yes, I think this is the expected behaviour, and I can’t think of any improvements that could be made. If you call fileno(), you have to ensure that you don’t close the file descriptor until you have finished using it. It is a bit like accessing memory after it has been freed. Python doesn’t make raw memory addresses easily accessible, but it does make fileno() accessible without much protection.

Perhaps there is some confusion about the term socket. Normally (without using the fileno=... parameter), Python’s socket() constructor does two things. First, it creates a new OS socket using the socket() system call (or Winsock equivalent), which returns a file descriptor or handle (an integer). Then, it creates a Python socket object, which wraps the file descriptor.

When you use socket(fileno=...), only the second step is taken. You get a _new_ socket object, which wraps the given existing OS socket file descriptor. So when it says “the same socket”, I think it means the same OS-level socket. It still creates a new Python object.
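
A short sketch of that distinction, with variable names chosen for this example: the two Python objects are different, but they report the same descriptor. The cleanup at the end detaches one wrapper so that the descriptor is closed only once.

```python
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Only the "wrap" step happens here; no new OS socket is created.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=first.fileno())

same_object = first is second
same_fd = first.fileno() == second.fileno()

# Two wrappers share one descriptor, so only one of them may close it.
second.detach()
first.close()
```

Here same_object should be False and same_fd should be True.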
msg262980 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-04-07 06:44
The general answer here is you should avoid mixing calls to different abstraction layers. Either use only the file descriptor or only the socket object.

This is not limited to lifetime issues; other issues can occur. For example, setting a timeout on a socket puts the underlying file descriptor in non-blocking mode. So code using the file descriptor can fail with EAGAIN.
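
The timeout side effect can be observed with a sketch like the following. It assumes a POSIX system, since os.get_blocking() is used here to inspect the descriptor; the variable names are illustrative.

```python
import os
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# A freshly created socket has no timeout, so its descriptor is blocking.
blocking_before = os.get_blocking(sock.fileno())

# Setting a timeout switches the underlying descriptor to non-blocking mode.
sock.settimeout(5.0)
blocking_after = os.get_blocking(sock.fileno())

sock.close()
```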

If you really want to use *both* a file descriptor and a socket object, you can use os.dup() on the file descriptor, so that the OS resources are truly independent.
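
A minimal sketch of the os.dup() approach, assuming a POSIX system where socket descriptors are ordinary file descriptors (on Windows, socket.fromfd() or the socket object's dup() method would be the likely substitutes). Variable names are illustrative.

```python
import os
import socket

original = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
original.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
address = original.getsockname()

# The duplicate descriptor refers to the same OS socket but has its own lifetime.
dup_fd = os.dup(original.fileno())
independent = socket.socket(socket.AF_INET, socket.SOCK_STREAM, fileno=dup_fd)

# Closing the original no longer invalidates the second object.
original.close()
still_bound = independent.getsockname() == address
independent.close()
```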
msg263009 - (view) Author: (JoshN) Date: 2016-04-08 07:00
Josh/Martin/Antoine: Thank you for the tips - I was not aware of the underlying mechanics, especially the separate abstraction layers.

I did RTFM up and down before posting this, to be sure. My apologies for the inconvenience.
msg263010 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-04-08 08:58
No need to apologize :) Perhaps we need to make the docs a bit clearer.
History
Date User Action Args
2022-04-11 14:58:29 admin set github: 70890
2016-04-08 08:58:54 pitrou set messages: + msg263010
2016-04-08 07:00:16 JoshN set messages: + msg263009
2016-04-07 06:44:52 pitrou set messages: + msg262980
2016-04-07 02:24:45 martin.panter set messages: + msg262976
2016-04-07 00:32:35 JoshN set messages: + msg262973
2016-04-06 22:15:24 martin.panter set messages: + msg262969
2016-04-06 22:12:00 martin.panter set nosy: + docs@python, martin.panter
messages: + msg262968

assignee: docs@python
components: + Documentation, - IO
type: crash -> behavior
2016-04-06 19:32:57 josh.r set messages: + msg262958
2016-04-06 19:30:06 josh.r set nosy: + josh.r
messages: + msg262957
2016-04-06 18:29:02 SilentGhost set nosy: + pitrou
2016-04-06 18:11:14 JoshN set title: Socket state corrupts when original assignment goes out of scope -> Socket state corrupts when original socket object goes out of scope in a different thread
2016-04-06 18:07:05 JoshN create