classification
Title: TCP listening sockets created without FD_CLOEXEC flag
Type: enhancement Stage: needs patch
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Christophe.Devriese, gregory.p.smith, haypo, nadeem.vawda, neologix, pitrou
Priority: normal Keywords: patch

Created on 2011-05-19 10:09 by Christophe.Devriese, last changed 2013-08-27 23:24 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
socketserver_close_on_exec.patch haypo, 2013-01-09 00:07 review
socketserver_close_on_exec-2.patch haypo, 2013-01-09 00:36 review
Messages (20)
msg136273 - (view) Author: Christophe Devriese (Christophe.Devriese) Date: 2011-05-19 10:09
The specific issue this is creating is that a malicious user could use this socket in a subprocess which is started from a library (ie. I'm using a .so, which calls fork/exec).

A second failure mode is starting a daemon from withing, say, a django application. Djano opens a TCP listening socket, then starts up a daemon to provide some sort of service in the background. The daemon keeps running and inherits the socket. Now you restart the django app.

It refuses to start ! Why ? Because the socket was inherited, the listening socket isn't actually closed, and this results in the socket being stuck in CLOSE_WAIT as long as the daemon is running.

It seems to me that it is almost never the case that you'd want a TCP listening socket to be preserved across exec, and not setting this flag should thus be considered a bug for 2 reasons :

1) it results in accidental disclosure of information that shouldn't be exposed in certain cases.
2) it can result in denial of service

Solution : 

update SocketServer.py :
  in the class TCPServer
  add the following 2 lines in __init__ after self.socket = socket( ...:
    flags = fcntl.fcntl(self.socket, fcntl.F_GETFD)
    fcntl.fcntl(self.socket, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
msg136341 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-19 22:55
Hello Christophe,

First and foremost, I think that the FD_CLOEXEC approach is terminally broken, as it should have been the default in Unix. Now, we're stuck with this bad design.
But we can't simply change the default to FD_CLOEXEC, for two reasons:
- we can't silently change the Unix semantics
- this is going to break some applications: for example, FD inherited across exec is used by super servers such as inetd, and there are others very legitimate uses

>  in the class TCPServer
>  add the following 2 lines in __init__ after self.socket = socket( ...:
>    flags = fcntl.fcntl(self.socket, fcntl.F_GETFD)
>    fcntl.fcntl(self.socket, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

There are at least two problems with this approach:
1) there's a race between the socket creation and the call to fcntl
2) accept doesn't necessarily inherit the FD_CLOEXEC flag

1) can be fixed on systems that support it through SOCK_CLOEXEC
2) can be fixed on systems that support it through accept4(), but it seems to break badly on some systems, see issue #10115

But I think this is a perfectly legitimate request, so one approach to tackle this problem could be:
- since accept4() seems to fail so badly in some configurations, the only portable and reliable choice left is probably to call accept() then fcntl(FD_CLOEXEC) (there's a race, but it's better than nothing). We might reconsider this syscall in a couple years when we're sure it's implemented correctly
- in the socketserver module, add a new set_socket_cloexec attribute to BaseServer, which would do the right thing (i.e. create the socket with SOCK_CLOEXEC if available, otherwise call fcntl(FD_CLOEXEC)), and in TCPServer, call fcntl(FD_CLOEXEC) after accept.

That way, this would at least fix the problem for people using the socketserver module. People using sockets directly of course have the option of using SOCK_CLOEXEC and fcntl(FD_CLOEXEC) explicitely in their code.

Gregory, any thoughts on this?
msg136369 - (view) Author: Christophe Devriese (Christophe.Devriese) Date: 2011-05-20 12:31
I realize this bugreport cannot fix 35 years of a bad design decision in
linux. That's not the intention (that's a gordian knot I *will* be keeping a
safe distance from). The intention is to create a saner default situation
for most python programs.

Christophe

2011/5/20 Charles-François Natali <report@bugs.python.org>

>
> Charles-François Natali <neologix@free.fr> added the comment:
>
> Hello Christophe,
>
> First and foremost, I think that the FD_CLOEXEC approach is terminally
> broken, as it should have been the default in Unix. Now, we're stuck with
> this bad design.
> But we can't simply change the default to FD_CLOEXEC, for two reasons:
> - we can't silently change the Unix semantics
> - this is going to break some applications: for example, FD inherited
> across exec is used by super servers such as inetd, and there are others
> very legitimate uses
>
> >  in the class TCPServer
> >  add the following 2 lines in __init__ after self.socket = socket( ...:
> >    flags = fcntl.fcntl(self.socket, fcntl.F_GETFD)
> >    fcntl.fcntl(self.socket, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
>
> There are at least two problems with this approach:
> 1) there's a race between the socket creation and the call to fcntl
> 2) accept doesn't necessarily inherit the FD_CLOEXEC flag
>
> 1) can be fixed on systems that support it through SOCK_CLOEXEC
> 2) can be fixed on systems that support it through accept4(), but it seems
> to break badly on some systems, see issue #10115
>
> But I think this is a perfectly legitimate request, so one approach to
> tackle this problem could be:
> - since accept4() seems to fail so badly in some configurations, the only
> portable and reliable choice left is probably to call accept() then
> fcntl(FD_CLOEXEC) (there's a race, but it's better than nothing). We might
> reconsider this syscall in a couple years when we're sure it's implemented
> correctly
> - in the socketserver module, add a new set_socket_cloexec attribute to
> BaseServer, which would do the right thing (i.e. create the socket with
> SOCK_CLOEXEC if available, otherwise call fcntl(FD_CLOEXEC)), and in
> TCPServer, call fcntl(FD_CLOEXEC) after accept.
>
> That way, this would at least fix the problem for people using the
> socketserver module. People using sockets directly of course have the option
> of using SOCK_CLOEXEC and fcntl(FD_CLOEXEC) explicitely in their code.
>
> Gregory, any thoughts on this?
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12107>
> _______________________________________
>
msg136375 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-20 14:19
> That's not the intention (that's a gordian knot I *will* be keeping
> a
> safe distance from). The intention is to create a saner default situation
> for most python programs.

I understand what you're saying, and I agree with you on the principle.
But the point I was making is that we can't make this the "default
situation", for the reasons explained above (we can't change the
default socket semantics, and it's going to break some applications,
which is unacceptable).
That's why I was suggesting to add this as an option to the
socketserver module (since that's the piece of code you were referring
to).
The Python interpreter is not a simple standalone library/application,
we can't just make those kind of changes.

P.S.: please avoid sending html messages, they result in a spurious
"unnamed" file being attached.
msg136445 - (view) Author: Christophe Devriese (Christophe.Devriese) Date: 2011-05-21 14:35
It would already be a nice piece of progress if you could request the
SO_CLOEXEC (with fallback to FD_CLOEXEC), say, in the constructor, or even
with a module variable. I hope at least this change can make it in, so that
we have a decent in-python solution that can be used from within, say,
django.

Christophe

On Sat, May 21, 2011 at 1:24 AM, Antoine Pitrou <report@bugs.python.org>wrote:

>
> Changes by Antoine Pitrou <pitrou@free.fr>:
>
>
> Removed file: http://bugs.python.org/file22038/unnamed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue12107>
> _______________________________________
>
msg136446 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-21 14:43
The request is reasonable, but the best bet is for someone interested to provide a patch. See http://docs.python.org/devguide/ if you aren't acquainted with the development process.
msg136447 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-21 14:50
$ cat test_sock.py 
import socket
import fcntl


with socket.socket(socket.AF_INET, socket.SOCK_STREAM|socket.SOCK_CLOEXEC) as s:
    print(bool(fcntl.fcntl(s, fcntl.F_GETFD) & fcntl.FD_CLOEXEC))

$ ./python test_sock.py 
True
msg136870 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-25 16:46
So, SOCK_CLOEXEC is available.
Note that I don't like the idea of falling back to FD_CLOEXEC since
it's not atomic, and some people might rely on this.
Can we close this issue?
msg136872 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-25 16:53
> So, SOCK_CLOEXEC is available.
> Note that I don't like the idea of falling back to FD_CLOEXEC since
> it's not atomic, and some people might rely on this.
> Can we close this issue?

Well, this is apparently a feature request for socketserver.TCPServer.
I don't see any problem in adding a best-effort option to add the cloexec flag, possibly atomically, and fall back on FD_CLOEXEC.

People who "rely on this" can only do it if their system supports it anyway.
msg136873 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-05-25 17:06
For the record, the best effort approach is already used in subprocess (check subprocess_cloexec_pipe() in Modules/_posixsubprocess.c).

Speaking of which, the os module could expose the pipe2() function.
msg136879 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-05-25 17:59
> Well, this is apparently a feature request for socketserver.TCPServer.

Honestly, I'm not sure what this request is about.
The original request seemed to imply this should be made the default. I don't agree, and think this should be made optional. Also, the title is much more general: "TCP listening sockets created without FD_CLOEXEC flag".

> I don't see any problem in adding a best-effort option to add the cloexec flag, possibly atomically, and fall back on FD_CLOEXEC.
> People who "rely on this" can only do it if their system supports it anyway.

I wasn't clear. I thought we were talking about the socket constructor. In that case, yes, we could do that for socketserver, but with an explicit warning (because with a cloexec flag to a constructor, people will unconsciously expect it to behave like SOCK_CLOEXEC and be atomic).

> Speaking of which, the os module could expose the pipe2() function.

I'll work on a patch.
msg179390 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-08 23:09
See also #5715 which asks for something similar.
msg179396 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-08 23:26
"Note that I don't like the idea of falling back to FD_CLOEXEC since
it's not atomic, and some people might rely on this."

There is a similar question (atomic or best effort) for the #16860 issue. It looks like more people prefer the best effort option.

IMO most people prefer to use the best available method to set the close-on-exec flag, but doesn't really care if the method is not atomic (and the OS doesn't provide an atomic function).

If someone cannot accept the best effort option: I guess that it is still possible to override TCPServer constructor to ensure that the action is atomic.

"It would already be a nice piece of progress if you could request the
SO_CLOEXEC (with fallback to FD_CLOEXEC), say, in the constructor, or even with a module variable."

Yes, close-on-exec flag should be configurable, as TCPServer.allow_reuse_address for example. I like the class attribute, something like TCPServer.close_on_exec, False by default, for example.

--

"I realize this bugreport cannot fix 35 years of a bad design decision in linux."

Well... Ruby made a brave choice :-) Ruby (2.0?) does set
close-on-exec flag on *ALL file descriptors (except 0, 1, 2) *by
default*:
http://bugs.ruby-lang.org/issues/5041

This change solves the problem of having to close all file descriptor
after a fork to run a new program (see closed Python issues #11284 and
#8052)... if you are not using C extensions creating file descriptors?

Ruby applications relying on passing FD to child processes have to
explicitly disable close-on-exec the flag: it was done in Unicorn for
example.

"But we can't simply change the default to FD_CLOEXEC, for two reasons:
- we can't silently change the Unix semantics
- this is going to break some applications: for example, FD inherited across exec is used by super servers such as inetd, and there are others very legitimate uses"

Yes, changing the default value of the flag (enable close-on-exec by default) will break some applications. But if we chose to do it, developers of these applications can still disable close-on-exec (because the option must be configurable) to fix their application for Python 3.4.
msg179397 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-08 23:29
Oh, by the way: I never understood why SimpleXMLRPCServer does set FD_CLOEXEC (since the issue #1222790), whereas TCPServer doesn't.
msg179402 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-09 00:07
socketserver_close_on_exec.patch:
 - Add TCPServer.close_on_exec class attribute: True by default if the fcntl module is present, False otherwise
 - Use SOCK_CLOEXEC if present, fcntl() + FD_CLOEXEC otherwise

Even if SOCK_CLOEXEC is present, fcntl() is used to check the flag "works" (if FD_CLOEXEC was set). If SOCK_CLOEXEC works, fcntl() will not be called anymore, otherwise we fall back to fcntl() + FD_CLOEXEC. I implemented this fallback for Linux older than 2.6.27. I don't know if SOCK_CLOEXEC is simply ignored, as Linux did for O_CLOEXEC, but I don't have such old Linux version to test.

I chose to set close-on-exec flag *by default*. If we chose to disable it by default, the following changes should be done on my patch:
 - TCPServer.close_on_exec : "close_on_exec = (fcntl is not None)" => "close_on_exec = False"
 - SimpleXMLRPCServer: add close_on_exec class attribute, "close_on_exec = (fcntl is not None)" (and restore try/except ImportError for fcntl)
msg179403 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-09 00:09
> There is a similar question (atomic or best effort) for the #16860 issue. It looks like more people prefer the best effort option.

Oops, it's the issue #16850: Add "e" mode to open(): close-and-exec (O_CLOEXEC) / O_NOINHERIT
msg179404 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-09 00:36
Oh, I completly forgot the client socket for TCP servers: updated patch. The patch version 2 uses fcntl()+FD_CLOEXEC after accept(). It is not atomic, but I prefer to leave to problem to issue #10115:

"can be fixed on systems that support it through accept4(), but it seems to break badly on some systems, see issue #10115"
msg179428 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-01-09 09:38
> "I realize this bugreport cannot fix 35 years of a bad design decision in linux."
>
> Well... Ruby made a brave choice :-) Ruby (2.0?) does set
> close-on-exec flag on *ALL file descriptors (except 0, 1, 2) *by
> default*:
> http://bugs.ruby-lang.org/issues/5041
>
> This change solves the problem of having to close all file descriptor
> after a fork to run a new program (see closed Python issues #11284 and
> #8052)... if you are not using C extensions creating file descriptors?

> "But we can't simply change the default to FD_CLOEXEC, for two reasons:
> - we can't silently change the Unix semantics
> - this is going to break some applications: for example, FD inherited across exec is used by super servers such as inetd, and there are others very legitimate uses"
>
> Yes, changing the default value of the flag (enable close-on-exec by default) will break some applications. But if we chose to do it, developers of these applications can still disable close-on-exec (because the option must be configurable) to fix their application for Python 3.4.

Changing all FDs to close-on-exec by default would be a bold move (and
even just for SocketServer).
It ought to be default, but it's going to break some applications.
That's definitely something which needs discussion (at not only on the
tracker, I would could bring it up on python-dev).
msg179436 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-01-09 12:50
> That's definitely something which needs discussion
> (at not only on the tracker, I would could bring it up on python-dev).

I agree, I started a thread on python-dev mailing list:
https://mail.python.org/pipermail/python-dev/2013-January/123552.html
msg196331 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-27 23:24
Sockets are now created non-inheritable in Python 3.4. See the PEP 446 for more information (PEP implemented in the issue #18571).
History
Date User Action Args
2013-08-27 23:24:06hayposetstatus: open -> closed
resolution: fixed
messages: + msg196331
2013-01-09 12:50:07hayposetmessages: + msg179436
2013-01-09 09:38:29neologixsetmessages: + msg179428
2013-01-09 00:36:58hayposetfiles: + socketserver_close_on_exec-2.patch

messages: + msg179404
2013-01-09 00:09:22hayposetmessages: + msg179403
2013-01-09 00:07:26hayposetversions: + Python 3.4, - Python 3.3
2013-01-09 00:07:17hayposetfiles: + socketserver_close_on_exec.patch
keywords: + patch
messages: + msg179402
2013-01-08 23:29:19hayposetmessages: + msg179397
2013-01-08 23:26:00hayposetmessages: + msg179396
2013-01-08 23:09:52hayposetnosy: + haypo
messages: + msg179390
2011-05-25 17:59:30neologixsetmessages: + msg136879
2011-05-25 17:06:13pitrousetmessages: + msg136873
2011-05-25 16:53:15pitrousetmessages: + msg136872
2011-05-25 16:46:35neologixsetmessages: + msg136870
2011-05-21 14:50:27neologixsetmessages: + msg136447
2011-05-21 14:43:28pitrousetfiles: - unnamed
2011-05-21 14:43:17pitrousettype: security -> enhancement
components: + Library (Lib)
versions: + Python 3.3
nosy: + pitrou

messages: + msg136446
stage: needs patch
2011-05-21 14:35:32Christophe.Devriesesetfiles: + unnamed

messages: + msg136445
2011-05-20 23:24:52pitrousetfiles: - unnamed
2011-05-20 14:19:48neologixsetmessages: + msg136375
2011-05-20 12:31:44Christophe.Devriesesetfiles: + unnamed

messages: + msg136369
2011-05-19 22:55:48neologixsetmessages: + msg136341
2011-05-19 20:23:50pitrousetnosy: + gregory.p.smith, neologix
2011-05-19 10:15:32nadeem.vawdasetnosy: + nadeem.vawda
2011-05-19 10:09:57Christophe.Devriesecreate