This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Problem with SocketIO for closing the socket
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.0
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: amaury.forgeotdarc, ggenellina, gregory.p.smith, jhylton, loewis, romkyns, vstinner, zanella
Priority: release blocker Keywords: patch

Created on 2008-09-09 21:21 by romkyns, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
breakage.py romkyns, 2008-09-09 21:21 Example reproducing the issue
issue3826_gps05.diff gregory.p.smith, 2008-11-29 23:06 fix for socket.SocketIO and unit tests
test_httpclose_py3k.py ggenellina, 2008-11-30 07:37
socket_real_close-5.patch vstinner, 2009-01-06 01:13
Messages (28)
msg72917 - (view) Author: (romkyns) Date: 2008-09-09 21:21
See example code attached. A very basic http server responds with an XML
document to any GET request. I think what happens is that the connection
remains open at the end of the request, leading the browser to believe
there's more data to come (and hence not displaying anything).

What's curious is that commenting out the single line "self.dummy =
self" fixes the problem. Also, the problem occurs in 3.0b2 but not in 2.5.2.

To reproduce:

1. Run the example code on 3.0b2
2. Navigate to http://localhost:8123/
   The browser shows "Transferring data" or something similar, until it
times out.
3. Comment out the line "self.dummy = self"
4. Navigate to http://localhost:8123/
   This time it loads instantly.

Repeat with 2.5.2 - both variants work.

I don't know much at all about python internals, but it seems that 1) a
circular reference prevents the request handler from being GC'd and 2)
http.server relies on the request being GC'd to finalise the connection.
msg72980 - (view) Author: (romkyns) Date: 2008-09-10 18:11
My initial description doesn't state the actual observable problem very
clearly: the problem is that the browser gets stuck instead of
displaying the page, writing something along the lines of "Transferring
data from localhost". After many seconds it times out. Aborting the
request in Firefox causes the page to be displayed.

Also forgot to mention that it's possible to work around this by
explicitly removing all circular references after the base class'
__init__ method - so for the attached example a work-around is
"self.dummy = None" as the last line of the __init__ method.
msg73248 - (view) Author: (romkyns) Date: 2008-09-15 07:00
Someone suggested I test this in 2.6rc1. The problem does not exist in
2.6rc1, only in 3.0b2.
msg73481 - (view) Author: Gabriel Genellina (ggenellina) Date: 2008-09-21 04:14
3.0rc1 still fails.
The diagnostic is correct, the connection should be closed after 
sending the response, but isn't.
The attached unittest reproduces the error without requiring a browser.
msg73774 - (view) Author: (romkyns) Date: 2008-09-25 09:39
So the GC behaviour in this case is such that Python 3.0 can't collect
the object whereas Python 2.6 can. Is this known / expected or should
this be recorded in a separate issue?
msg73777 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-25 11:51
The garbage collector does collect unreachable objects.

What happens is that with python 2, the socket is explicitly closed by
the HTTPServer, whereas with python 3, the explicit close() does not
work, and the socket is ultimately closed when the request has finished
and all objects are disposed.

The cause is in the socket.makefile() function: since python3, the
underlying socket uses a reference count, so that:
    s = <a connected socket>
    f = s.makefile()
    s.close()
does not close the socket! adding f.close() is not enough. A "del f" is
necessary to really close the underlying socket.
msg73834 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-09-26 05:57
The whole socket._io_refs thing looks like a real design flaw.  What is
its actual intended purpose?

When close is called on the socket object itself, the socket MSUT be
closed.  Why is our API otherwise?

Its up to the programming to ensure that no other references to that
socket wrapped in whatever layers will be used after that, if they are
they'll eventually get errors when an IO occurs.


I think this behavior started with this change:

------------------------------------------------------------------------
r56714 | jeremy.hylton | 2007-08-03 13:40:09 -0700 (Fri, 03 Aug 2007) |
21 lines

Make sure socket.close() doesn't interfere with socket.makefile().

If a makefile()-generated object is open and its parent socket is
closed, the parent socket should remain open until the child is
closed, too.  The code to support this is moderately complex and
requires one extra slots in the socket object.

This change fixes httplib so that several urllib2net test cases pass
again.

Add SocketCloser class to socket.py, which encapsulates the
refcounting logic for sockets after makefile() has been called.

Move SocketIO class from io module to socket module.  It's only use is
to implement the raw I/O methods on top of a socket to support
makefile().

Add unittests to test_socket to cover various patterns of close and
makefile.

...............
later modified by this (still the same effect):
...............

------------------------------------------------------------------------
r58970 | guido.van.rossum | 2007-11-14 14:32:02 -0800 (Wed, 14 Nov 2007)
| 6 lines

Patch 1439 by Bill Janssen.  I think this will work.
Tested on Windows by Christian Heimes.
I changed the code slightly, renaming decref_socketios() to
_decref_socketios(), and moving it closer to the close() method
that it calls.  Hopefully Bill can now submit his SSL port to 3.0.
msg73849 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-09-26 14:18
On Thu, Sep 25, 2008 at 10:57 PM, Gregory P. Smith
<report@bugs.python.org> wrote:
> The whole socket._io_refs thing looks like a real design flaw.  What is
> its actual intended purpose?

The original purpose was to support the original semantics of the
.makefile() method on Windows which doesn't have dup() (or at least
didn't have it when this hack was first invented, many years ago). We
almost got rid of it for Py3k, but then we realized that the same
issue (sort of) exists for ssl, and that's why it's still there and
used everywhere, not just on Windows.

> When close is called on the socket object itself, the socket MSUT be
> closed.  Why is our API otherwise?

See above -- the underlying connection must remain open until the
stream created with .makefile() is also closed. (These are the
original dup-based semantics.)

Why do you say MUST? Is there an Internet standard that requires this?

> Its up to the programming to ensure that no other references to that
> socket wrapped in whatever layers will be used after that, if they are
> they'll eventually get errors when an IO occurs.

No, the .makefile() API always promised that the connection remained
open until you closed (or lost) the stream.

> I think this behavior started with this change:
>
> ------------------------------------------------------------------------
> r56714 | jeremy.hylton | 2007-08-03 13:40:09 -0700 (Fri, 03 Aug 2007) |
> 21 lines
>
> Make sure socket.close() doesn't interfere with socket.makefile().

No, it's much, much older than that. That change was probably just a
refactoring -- it's been refactored a number of times.
msg76567 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-28 23:32
Alright I've taken another fresh look at this.  I understand the dup
semantics issue and don't want to break that.

Attached are two patches, either one of these will fix the problem and
breakage.py test code attached to this bug.

Personally I prefer the socket.py patch as I think it will solve this
problem in other places it could potentially pop up.  It calls
self._sock._decref_socketios() from within the SocketIO.close() method
so that after a SocketIO returned by socket.makefile() is closed, it
will no longer prevent the underlying socket from closing.

The socketserver.py patch also works but seems like more of a hack as it
is still relies on immediate reference counted destruction.

Please review.

I'm guessing its too late in the release candidate process for 3.0 to
get this in, but we should do it for 3.0.1.
msg76568 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-28 23:34
P.S.  Gabriel Genellina (gagenellina)  -  Your comment sounded like you
had a unit test for this but it never got attached.  Still have it?
msg76569 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-28 23:58
I agree that the patch on socket.py is the correct fix: the raw socket 
should be detached when the close() method is called.

I have one remark on the patch:
io.IOBase.__del__ already calls close(): could SocketIO.__del__ be 
removed completely? And no need for "_need_to_decref_sock": the closed 
property should be enough.
msg76570 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-29 00:04
Indeed IOBase does call close() from its __del__.  Excellent.  That
makes this simpler.  -gps03 attached.
msg76572 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-29 00:58
Patch issue3826_socket-gps03.diff is OK for me.
Here is a unit test for this new behavior.

Someone should review both patches.
msg76592 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-11-29 13:25
I don't think this is quite right yet. If I run

import socket
s=socket.socket()
s.connect(('www.python.org',80))
x=s.makefile()
del x
del s

and put a print statement into real_close, I don't see that real_close
is invoked.
msg76595 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-29 14:38
But real_close() is not called either in the trivial socket use:

import socket
s=socket.socket()
s.connect(('www.python.org',80))
del s

OTOH, I added some printf statements in socketmodule.c, near all usages 
of SOCKETCLOSE(). They show that the raw socket is closed as expected - 
except in the following case:

import socket
s=socket.socket()
s.connect(('www.python.org',80))
x=s.makefile()
x.close()
del s
msg76625 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-29 21:52
Martin: socket.socket has no destructor so a call to
socket.socket._real_close() is not guaranteed.  Thats fine as its parent
class from socketmodule.c _socket.socket does the right thing in its
destructor.

Amaury: The case you show doesn't call SOCKETCLOSE() because x still
exists and holds a reference to the socket object.

In order to fix that, SocketIO needs to drop its reference on close. 
Take a look at the attached -gps04 patch.  It fixes that.
msg76631 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-11-29 23:06
gps05.diff includes the fix and unittests.
msg76633 - (view) Author: Gabriel Genellina (ggenellina) Date: 2008-11-30 07:37
--- El vie 28-nov-08, Gregory P. Smith <report@bugs.python.org> escribió:

> P.S.  Gabriel Genellina (gagenellina)  -  Your comment
> sounded like you
> had a unit test for this but it never got attached.  Still
> have it?

I've found it; it uses BaseHTTPRequestHandler as in the original bug report. I'm not sure it is still relevant, but I'm attaching it here anyway.
msg79209 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-05 23:12
Issue #4791 is exaclty the same problem (io reference in SocketIO): I 
proposed another patch (decrement the io reference but keep the 
self._sock object unchanged).
msg79211 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-05 23:25
Comment about issue3826_gps05.diff:
 - I don't like "magical" attributes (sometimes it does exist, 
sometimes not) even if it's a protected attribute (_sock) => use 
self._sock=None?
 - fileno() returns a fixed value whereas the socket.fileno() 
returns -1 when the socket is closed => returns -1 or raise an error 
if the socket is closed?
 - I'm not sure that your test is really working: read a closed socket 
may raise an error because of the test on self._closed, not because 
the low level socket is closed => use fileno() to check if the socket 
is closed or not
msg79213 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-05 23:29
from #python-dev discussion.

agreed, magic attributes are annoying.  also, my gps05 patch continues
to return the old fileno even after the underlying socket has been
closed.  that is a problem.

I like your patch in #4791 but lets keep both sets of our unit tests.

Also, I think the SocketIO.fileno mathod should change to:

  def fileno(self):
    no = self._sock.fileno()
    if no == -1:
      raise IOError("no file descriptor, socket is closed.")

io.IOBase.fileno already raises IOError when no file descriptor is
associated with a file so this behavior seems reasonable and more
pythonic than returning a magic -1.

thoughts?
msg79214 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-05 23:34
> Also, I think the SocketIO.fileno mathod should change to: (...)

Since SocketIO.fileno() just calls self._sock.fileno(), it would be 
easier (and better) to fix socket.socket(). But since socket.fileno() 
calls socket._fileno() we can just fix it correctly once in 
socketmodule.c ;-) And instead IOError, I would prefer socket.error.
msg79220 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-06 01:09
I spent two hours on this issue and here are some notes...

(1) SocketIO.close() have to call self._sock._decref_socketios() to 
allow the socket to call _real_close()

s = socket...
f = s.makefile()
f.close()  # only close the "file view"
s.close() <= close the socket here

(2) SocketIO.close() have to break its reference to allow the socket 
to be closed

s = socket...
f = s.makefile()
del s      # there is still one reference (f._sock)
f.close() <= the garbage collector will close the socket here

(3) operations on a closed SocketIO should be blocked

s = socket...
f = s.makefile()
f.close()
f.fileno() => error!

So issue3826_gps05.diff have two bugs:
 - point (1)
 - point (3): can be fixed by adding self._check_closed() (and so we 
don't need the local copy of fileno)
msg79221 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-06 01:13
socket_real_close-5.patch: my own patch developed for the issue #4791 
(which is exactly the same problem) to fix SocketIO:
 - set self._sock=None on close
 - update the io reference count directly in close() (but also in 
destructor)
 - add a regression test

You may merge my patch with gps's patch (which has different tests).

See also issue #4853 to block all operations on closed socket in the 
low level API (socketmodule.c).
msg79645 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-12 04:51
Committed all of our tests and the actual code to fix the problem from 
socket_real_close-5.patch in py3k r68539.

This still needs back porting to release30-maint assuming no other 
issues are found with it.
msg79993 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-01-17 02:02
> This still needs back porting to release30-maint 
> assuming no other issues are found with it

How can we check if they are new issues with the changes?
msg79994 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-17 02:08
That mostly meant let the buildbots run it and/or see if anyone 
complains on a list.  Go ahead and backport. :)
msg80233 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-01-20 04:03
backported to release30-maint in r68796.
History
Date User Action Args
2022-04-11 14:56:38adminsetgithub: 48076
2009-01-20 04:03:01gregory.p.smithsetstatus: open -> closed
resolution: fixed
messages: + msg80233
2009-01-17 02:08:13gregory.p.smithsetmessages: + msg79994
2009-01-17 02:02:19vstinnersetmessages: + msg79993
2009-01-12 04:51:36gregory.p.smithsetkeywords: - needs review
messages: + msg79645
2009-01-06 13:38:21vstinnersetpriority: critical -> release blocker
title: BaseHTTPRequestHandler depends on GC to close connections -> Problem with SocketIO for closing the socket
2009-01-06 01:15:40vstinnersetfiles: - socket_real_close-5.patch
2009-01-06 01:13:31vstinnersetfiles: + socket_real_close-5.patch
messages: + msg79221
2009-01-06 01:09:23vstinnersetfiles: + socket_real_close-5.patch
messages: + msg79220
2009-01-05 23:48:31gvanrossumsetnosy: - gvanrossum
2009-01-05 23:34:20vstinnersetmessages: + msg79214
2009-01-05 23:29:47gregory.p.smithsetmessages: + msg79213
2009-01-05 23:25:14vstinnersetmessages: + msg79211
2009-01-05 23:12:37vstinnersetnosy: + vstinner
messages: + msg79209
2008-11-30 07:37:52ggenellinasetfiles: + test_httpclose_py3k.py
messages: + msg76633
2008-11-29 23:07:22gregory.p.smithsetfiles: - issue3826_socket-gps04.diff
2008-11-29 23:07:17gregory.p.smithsetfiles: - test_makefileclose.patch
2008-11-29 23:06:54gregory.p.smithsetfiles: - issue3826_socket-gps03.diff
2008-11-29 23:06:46gregory.p.smithsetfiles: + issue3826_gps05.diff
messages: + msg76631
2008-11-29 21:52:16gregory.p.smithsetfiles: + issue3826_socket-gps04.diff
messages: + msg76625
2008-11-29 14:38:45amaury.forgeotdarcsetmessages: + msg76595
2008-11-29 13:25:13loewissetnosy: + loewis
messages: + msg76592
2008-11-29 12:49:59loewissetfiles: - issue3826_socket-gps02.diff
2008-11-29 12:49:53loewissetfiles: - issue3826_socketserver-gps01.diff
2008-11-29 00:58:03amaury.forgeotdarcsetkeywords: + needs review
files: + test_makefileclose.patch
messages: + msg76572
2008-11-29 00:04:58gregory.p.smithsetfiles: - issue3826_socket-gps01.diff
2008-11-29 00:04:48gregory.p.smithsetfiles: + issue3826_socket-gps03.diff
messages: + msg76570
2008-11-28 23:58:13amaury.forgeotdarcsetmessages: + msg76569
2008-11-28 23:41:17gregory.p.smithsetfiles: + issue3826_socket-gps02.diff
2008-11-28 23:34:41gregory.p.smithsetmessages: + msg76568
stage: patch review
2008-11-28 23:32:04gregory.p.smithsetmessages: + msg76567
2008-11-28 23:27:31gregory.p.smithsetfiles: + issue3826_socket-gps01.diff
2008-11-28 23:27:02gregory.p.smithsetfiles: + issue3826_socketserver-gps01.diff
keywords: + patch
2008-09-26 14:18:18gvanrossumsetmessages: + msg73849
2008-09-26 05:57:45gregory.p.smithsetnosy: + gvanrossum, jhylton
messages: + msg73834
2008-09-25 11:51:40amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg73777
2008-09-25 09:39:50romkynssetmessages: + msg73774
2008-09-25 01:37:14zanellasetnosy: + zanella
2008-09-22 01:05:28gregory.p.smithsetpriority: critical
assignee: gregory.p.smith
nosy: + gregory.p.smith
title: Self-reference in BaseHTTPRequestHandler descendants causes stuck connections -> BaseHTTPRequestHandler depends on GC to close connections
2008-09-21 04:14:03ggenellinasetnosy: + ggenellina
messages: + msg73481
2008-09-15 07:00:58romkynssetmessages: + msg73248
2008-09-10 18:11:59romkynssetmessages: + msg72980
2008-09-09 21:21:11romkynscreate