classification
Title: Add socket.fdtype()
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: SilentGhost, christian.heimes, martin.panter, nascheme
Priority: normal Keywords:

Created on 2016-06-23 18:33 by nascheme, last changed 2017-04-28 20:40 by nascheme.

Files
File name Uploaded Description Edit
fromfd2.txt nascheme, 2016-06-23 19:44
fromfd2_v2.txt nascheme, 2016-06-24 17:17 review
fromfd2_v3.txt nascheme, 2016-06-28 21:02 revised patch review
fromfd2_v4.txt nascheme, 2016-07-07 17:27 review
fromfd2_v5.txt nascheme, 2016-07-14 17:31 review
Pull Requests
URL Status Linked Edit
PR 1333 closed nascheme, 2017-04-27 21:39
PR 1348 open nascheme, 2017-04-28 16:54
Messages (16)
msg269131 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-06-23 18:33
When implementing server software that inherits a socket via a file descriptor, it is useful to be able to query the descriptor and find out what kind of socket has been passed.  This can be done with getsockopt() and getsockname().  Python does not expose a clean way to do this.

One example use case is receiving an open socket from systemd.  For example, systemd will pass the open socket using file descriptor 3.  It would be handy if the Python server did not have to hard-code the type of the socket but instead could determine it.  This patch provides a function that, when given an integer file descriptor, returns a Python socket object with the correct family and kind/type attributes.  This seems like a useful, high-level interface.

This patch adds two new functions to the socket module.  I'm not so happy about the names.  Suggestions welcome.  The fromfd2() function does not duplicate the file descriptor.  I think this was a design mistake in fromfd().  If it wasn't for the duplication, I would change fromfd() to have None as default arguments and then use fdtype() to get the family, kind and protocol.  The fdtype() function returns (family, type, proto) for an integer file descriptor.

There does not appear to be any way to query the protocol of the passed socket.  I just assume it is zero.
msg269140 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-06-23 19:44
Add documentation for new functions.
msg269143 - (view) Author: SilentGhost (SilentGhost) * Date: 2016-06-23 20:13
The patch applies cleanly, not sure why the rietveld link doesn't appear. Perhaps because of the extension?

In test, I noticed, that you're not testing that OSError is raised, perhaps something to add?
msg269149 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-24 02:33
Maybe if the patch was regenerated with Mercurial off a public revision, it would work with Rietveld. I have seen non-Mercurial patches work too though. Perhaps this one doesn’t because the line numbers are off compared to the default branch?

See also Issue 18391 discussing extending the existing fromfd() function to automatically determine the socket parameters.

I agree that the name fromfd2() is a bit ugly, although there is precedence at least in Python 2: urllib2, os.popen2(), popen2 module, etc. Maybe other bike shed colours might be from_existing_fd(), using_fd(), fdopen(). Something that suggests creating an object from an FD without duplication.

I wonder if fdtype() is really needed as a public API.

The new fromfd2() API is missing when you do “from socket import * ”. There is a new function <https://docs.python.org/3.6/library/test.html#test.support.check__all__> that may be useful to automate checking for this error.

What prevents this from working on Windows? I understand there is less of a need, but I imagine it would be relatively easy to support.

It looks like some platforms support SO_PROTOCOL and/or SO_PROTOTYPE to get a socket’s protocol: <http://austingroupbugs.net/view.php?id=840>. Also, there might be a Windows-specific SO_PROTOCOL_INFO option.

>>> s
<socket.socket fd=3, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 0)>
>>> SO_PROTOCOL = 38  # On Linux x86-64; see Issue 18391
>>> protocol_struct = Struct("I")
>>> protocol = s.getsockopt(SOL_SOCKET, SO_PROTOCOL, protocol_struct.size)
>>> protocol_struct.unpack(protocol)
(6,)
>>> IPPROTO_TCP
6

IMO if Python grows the ability to determine a socket’s protocol, it might be better for socket.proto to use it automatically, at least if proto=0 was specified to the socket() function.

diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst
+.. function:: fromfd2(fd)
+
+   Create a socket object from a file descriptor.  The file descriptor is
+   not duplicated.  The family, type and protocol for the socket are
This needs fixing about how and if the protocol is determined.

+   determined from the file descriptor.  If the file descriptor is not a
+   socket, OSError is raised.  The socket is assumed to be in blocking mode.
:exc:`OSError`

+   The newly created socket is :ref:`non-inheritable <fd_inheritance>`.
This does not seem to be true, nor desirable:

>>> s = socket()
>>> s.get_inheritable()
False
>>> s.set_inheritable(True)
>>> ss = fromfd2(s.fileno())
>>> ss.get_inheritable()
True
>>> s.get_inheritable()
True

+   .. versionadded:: 3.6
+      The returned socket is now non-inheritable.
The whole function is new, not just process inheritance.

diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py
+            with s:
+                s2 = socket.fromfd2(s.fileno())
+                with s2:
Isn’t this incorrect due to closing the same file descriptor twice? I think you should use socket.detach(). See Issue 26685, whose changes should raise an exception for this bug in 3.6.

diff --git a/Modules/socketmodule.c b/Modules/socketmodule.c
+    if (!S_ISSOCK(st_fd.st_mode)) {
+        PyErr_SetString(PyExc_ValueError, "fdtype: "
+                        "file descriptor is not a socket");
This seems to be contrary to your documentation. Also, perhaps it is good enough to let getsockopt() later raising ENOTSOCK?

+    union sockaddr_union sockaddr = {};
Why not just use a struct sockaddr? All we need is the sa.sa_family field right?
msg269196 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-06-24 17:17
Thank you for the excellent review Martin.  I'm attaching a new patch which I think addresses your points.  I created it from hg, maybe that works nicer.

I've added constants for SO_DOMAIN, SO_PROTOCOL, SO_PASSCRED, SO_PEERSEC, and SO_PASSEC.

Using SO_PROTOCOL to get the protocol value seems to work fine on Linux.

I've dropped the union as you suggest.  I think the docs have been fixed.

I would be okay with making fdtype() a non-public function.  It seems possibly useful though.

Regarding the name, currently I think fromfd2 is best.  It really does what fromfd() should have done if SO_TYPE/SO_PROTOCOL was available and portable.  

Extending fromfd() seems a bad idea.  I don't want dup() and it seems better that application code can use hasattr() to test for new functionality.  You can work around not having fromfd2() by using getsockname() and then using some ugly text matching to work out what kind of socket you have.
msg269219 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-25 02:29
Okay, I guess the fdtype() API could be useful if we don’t alter fromfd(), i.e. as a workaround for Issue 18391.

IMO the specific SO_* constants should at least be listed in the documentation, so users know which versions of Python have which constants. Although exposing them is not necessary for the fromfd() implementation.

See also more comments on the code review.
msg269451 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-06-28 21:02
I've revised the patch based on the second round of comments from Martin.  I've removed the ifdef test for CO_TYPE and assumed it is always available.  That means fdtype() should be available on all platforms.

I did not change the test as suggested to call s.detach(), I think my test code should work okay.  The compare of 'protocol' has been removed from the test.

The duplicate definition of SO_PASSCRED has been removed.  I didn't add the constants to the documentation as other constants are only documented on a wildcard basis (e.g. SO_*).
msg269456 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-29 00:25
The s.detach() thing isn’t a big deal, but I thought it would simplify the code (eliminate the need for the “finally” clause).

I will try to propose a minor documentation update for SO_PASSCRED as a starting point for adding the others.
msg269953 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-07-07 17:27
Adding yet another revised patch.  I think this is ready to commit, if someone would like to do it.  The documentation for constants can be added as a separate commit, if Martin wants.  I think the generic SO_* style documentation is okay.

Changes in this version:
- fix documentation wording
- add tests for different protocol types

It would be really good(TM) if someone could test this on other platforms.  I only tested on vanilla Debian Linux 3.16.0.
msg270155 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-11 03:33
Left some minor code review nitpicks.

I opened Issue 27409 with an attempt at documenting exactly which SO_* etc symbols may be available. But so far I haven’t got any positive or negative feedback. If it were up to me, I would either commit everything except the new SO_* constants, or briefly list them in the documentation.

One more thing that occurred to me is maybe we should check for EINVAL from SO_PROTOCOL. That option was apparently added in Linux 2.6.32, and Free BSD 8.4, 9.1, 10.0. Even if you think these version numbers are too old to worry about, what happens if a Python package is compiled with a new (e.g.) OS X or Windows version that supports SO_PROTOCOL, and then run on an old (existing) OS version?
msg270426 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-07-14 17:19
I just tested on Windows.  fdtype() fails with:
OSError: [WinError 10022] An invalid argument was supplied

The getsockname() call fails with WSAGetLastError() == 10022.  getsockname() is used to find the address family.  Perhaps there is some other way to get it on Windows.

The easiest fix would be to #ifdef MS_WINDOWS out the whole socket_fdtype() function.  Passing file descriptors around is not a thing on Windows anyhow.

There is still the issue of unexpected errors getting returned inside fdtype() (e.g. as suggested by Martin).  One idea is to commit the code as is and fix errors as they appear.
msg270429 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-07-14 17:31
Updated patch, v5.  Disable fdtype() function on Windows.  Fix documentation nits as suggested by review of v4.
msg270432 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2016-07-14 18:31
Tested on FreeBSD 10.3.  getsockname() on a IPPROTO_SCTP protocol socket returns errno = FileNoFoundError.  We could just comment out that test I guess.

My theory is that on FreeBSD, getsockname() on an SCTP socket fails if it is not bound.  Indeed, adding a bind(('127.0.0.1', 11234)) call before the fdtype() call makes the test pass.
msg270669 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-07-17 23:52
The Windows problem, error 10022 = WSAEINVAL from getsockname(), seems to be documented at <https://msdn.microsoft.com/en-us/library/windows/desktop/ms738543%28v=vs.85%29.aspx>: “The socket has not been bound”.

Regarding the SCTP problem, raising an error seems like an OS bug to  me. However, according to POSIX <http://pubs.opengroup.org/onlinepubs/9699919799/functions/getsockname.html>, we cannot rely on getsockname() indicating any particular address family if the socket is unbound.

So on some platforms, it seems like it will only work with bound sockets, or you have to use less-standard options like SO_PROTOCOL_INFO or SO_DOMAIN. Although SO_DOMAIN won’t help on Free BSD. The simplest solution may be to document and test that it only works with bound sockets.

For the problem with SO_PROTOCOL being unsupported at run-time, I think it would be better to anticipate EINVAL now, rather than wait for someone to complain. But it is not a big deal.
msg276631 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-15 22:22
I opened #28134 because I was running into a similar issue with socket.socket(fileno). Since this new feature didn't make it into 3.6, can we at least change the constructor of socket.socket() to set a correct family, type and proto?

Any fix is better than no fix. It doesn't have to work for all socket types on all operating systems. The feature is mostly used for socket inheritance, systemd socket activation or fd transfer over Unix sockets of AF_INET, AF_INET6 and AF_UNIX file descriptors.
msg292549 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2017-04-28 20:40
Changing title, I think #28134 (socket.socket(fileno=fd) does not work as documented) is a cleaner solution.  However, exposing the ability to query a socket file descriptor seems useful so I'm still proposing to add socket.fdtype().

The github PR 1348 has been modified to remove fromfd2() and only add the function fdtype().
History
Date User Action Args
2017-04-28 20:40:40naschemesetmessages: + msg292549
title: Add smarter socket.fromfd() -> Add socket.fdtype()
2017-04-28 16:54:12naschemesetpull_requests: + pull_request1460
2017-04-27 21:39:56naschemesetpull_requests: + pull_request1441
2016-09-15 22:22:00christian.heimessetnosy: + christian.heimes

messages: + msg276631
versions: + Python 3.7
2016-07-17 23:52:53martin.pantersetmessages: + msg270669
2016-07-14 18:31:39naschemesetmessages: + msg270432
2016-07-14 17:31:43naschemesetfiles: + fromfd2_v5.txt

messages: + msg270429
2016-07-14 17:19:36naschemesetmessages: + msg270426
2016-07-11 03:33:29martin.pantersetmessages: + msg270155
2016-07-07 17:27:16naschemesetfiles: + fromfd2_v4.txt

messages: + msg269953
2016-06-29 00:25:07martin.pantersetmessages: + msg269456
2016-06-28 21:02:24naschemesetfiles: + fromfd2_v3.txt

messages: + msg269451
2016-06-25 02:29:45martin.pantersetmessages: + msg269219
2016-06-24 17:17:11naschemesetfiles: + fromfd2_v2.txt

messages: + msg269196
2016-06-24 02:33:57martin.pantersetnosy: + martin.panter
messages: + msg269149
2016-06-23 20:13:40SilentGhostsetnosy: + SilentGhost
messages: + msg269143
2016-06-23 19:44:31naschemesetfiles: - fromfd2.txt
2016-06-23 19:44:18naschemesetfiles: + fromfd2.txt

messages: + msg269140
2016-06-23 18:33:36naschemecreate