classification
Title: socketserver.ForkingMixIn leaves zombies, also fails to reap all zombies in one pass
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.3
process
Status: open Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: giampaolo.rodola, gregory.p.smith, haypo, jwarkentin, orsenthil, python-dev
Priority: normal Keywords: patch

Created on 2011-02-03 22:01 by jwarkentin, last changed 2012-10-02 01:48 by python-dev.

Files
File name Uploaded Description Edit
collect_children.patch jwarkentin, 2011-02-03 22:01 Patch to collect children in serve_forever() instead of process_request()
loop_actions.patch jwarkentin, 2011-02-06 04:14 Defines a _loop_actions() method that can be overriden by a subclass. Implements collect_children() in ForkingMixIn._loop_actions().
loop_actions.patch jwarkentin, 2011-02-06 06:53 Same as previous patch, but renames _loop_actions() to loop_actions() and adds it to the list of overridable methods.
Messages (19)
msg127825 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-03 22:01
This is the same issue as was reported here: http://bugs.python.org/issue1954. It is still a problem in Python 3.1. I'm writing a server that will be receiving a massive number of requests and I'd like to eliminate the zombie problem. Once I figured out what was going on, I tried adding a call to collect_children() in the serve_forever() loop. It worked very well. I've included a patch of what I did, however, I obviously can't leave this change in my socketserver.py because we will be deploying this on a lot of servers.

Is there any reason not to collect_children() in the serve_forever() loop? It seems like the place to do it to me. The patch will only collect children if there are any, so it doesn't have to call it every time through the loop.
msg127972 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-05 04:42
I guess I didn't really explain the issue much. The problem is that if the server receives say, 10 requests at once, and then forks a process for each, after those processes finish they will sit as zombies until process_request() is called again, which calls collect_children(). So that patch simply moves the call to collect_children() into the serve_forever() loop so that it doesn't leave zombies sitting around.
msg128026 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-02-05 20:59
Rather than depending on the internal details of ForkingMixIn in your BaseServer.serve_forever modification I'd prefer to see that simply call self._cleanup()

Define a do-nothing _periodic_cleanup method in BaseServer.  ForkingMixIn should implement its own _periodic_cleanup method that does the active_children test and calls collect_children as appropriate.
msg128033 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-05 23:16
Good point. I was just writing up something quick that works. Here's another patch, is this acceptable?
msg128036 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-02-06 01:14
I believe that is good.  I'll commit it after the 3.2 release has been cut (we're in release candidate release blocker only lockdown right now).

Looking at ForkingMixIn.collect_children() there appears to be another buglet: it loops over self.active_children and calls self.active_children.remove(pid).  This modification of the list while looping over it will cause it to skip the next item in the list.  For every child waited on successfully, it skips checking one of the others.
msg128037 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-06 02:18
I noticed that ForkingMixIn also was overriding handle_timeout() trying to cleanup zombies after 300 seconds of inactivity, which is useless on a busy server. I'm replacing the patch with one that also removes handle_timeout().
msg128039 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-06 04:16
I hope I did that last patch right. I did a 'diff -u' instead of a 'diff -c'. If you need something different, let me know.
msg128045 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-02-06 06:53
Sorry I keep plaguing this with comments and files, but I got to thinking, anyone should be able to override _loop_actions() and implement what they need in the loop. While it's probably a bad idea in most cases, there may be legitimate needs and it's always good to allow the flexibility. So, I'm adding one more patch that changes the name to loop_actions() and adds it to the list of methods that can be overridden in the BaseServer docstring. If you like it, keep it, if not, just use the last one.
msg132441 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-03-28 22:43
Just following up on this. Now that 3.2 is out, has the patch been committed?
msg132519 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2011-03-29 20:15
not yet, thanks for the reminder.  if any other committers feel like jumping on this and doing it before I get around to it, feel free.
msg136392 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-05-20 18:42
Don't mean to nag. Just checking to see if anyone has taken it upon themselves to commit this yet since it's been a couple more months :P
msg136760 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-05-24 16:12
Justin,

The patch and logic is okay. We can have this is 3.3.

- I find that loop_actions as not appropriate name for the new method. It fails to give a intuitive meaning of what is supposed to do. request_action, request_action_continued or anything else which gives a meaning should be helpful.  I agree with your reasoning to provide some flexiblity for the user to override this.

- The patch lacks Documentation and tests should be added to Lib/test/test_socketserver.py.  If you can, please append the patch with these, otherwise I shall do it.

A suggestion for better method name is a must! :)

Thanks!
msg136866 - (view) Author: Roundup Robot (python-dev) Date: 2011-05-25 16:26
New changeset e2363b1c4bca by Senthil Kumaran in branch 'default':
Fix closes issue #11109 -  socketserver.ForkingMixIn leaves zombies, also fails to reap all zombies in one pass.
http://hg.python.org/cpython/rev/e2363b1c4bca

New changeset 3e3cd0ed82bb by Senthil Kumaran in branch 'default':
News entry for issue11109.
http://hg.python.org/cpython/rev/3e3cd0ed82bb
msg136867 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-05-25 16:31
The feature is in 3.3. I did not remove the handle_timeout method from the Mixin class, because it might be used in the production by the existing servers. It is not appropriate to remove methods all of sudden without deprecation warnings and also it is not required to remove in this case.

Added the Documentation and News entry too.
msg136869 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-05-25 16:44
This is fixed in 3.3 now. Keeping it open for test_socketserver update. After a ForkingServer service call, it should be asserted the collect_children routine is run.
msg136894 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2011-05-25 20:21
> New changeset 3e3cd0ed82bb by Senthil Kumaran in branch 'default':
> News entry for issue11109.
> http://hg.python.org/cpython/rev/3e3cd0ed82bb

Please try to format NEWS entries using "Issue #xxx: xxx" instead of "Issue #xxx - xxx".
msg136930 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-05-26 05:13
Victor - Sure, I understand Issue #xxx: desc must be useful while generation reST docs.
msg136933 - (view) Author: Justin Warkentin (jwarkentin) Date: 2011-05-26 07:37
Sorry, I haven't had a chance to look at this in a couple days. I've been very busy with work. I'm not sure exactly how to write the test for this, so I don't know that I'd be much help there.

One last thing, I was just looking through the commits and I noticed in the NEWS update (http://hg.python.org/cpython/rev/3e3cd0ed82bb/) you have my name as Justin Wark. The last name is actually Warkentin, I just didn't have anything showing that. That's my fault, sorry. I just updated my profile to show my full name.
msg171769 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-02 01:48
New changeset 991c24b8969d by R David Murray in branch '3.3':
#11109: clean up docs, add whatsnew entry, and fix Justin's last name.
http://hg.python.org/cpython/rev/991c24b8969d

New changeset 1234300bc056 by R David Murray in branch 'default':
Merge #11109: clean up docs, add whatsnew entry, and fix Justin's last name.
http://hg.python.org/cpython/rev/1234300bc056
History
Date User Action Args
2012-10-02 01:48:11python-devsetmessages: + msg171769
2011-05-26 07:37:59jwarkentinsetmessages: + msg136933
2011-05-26 05:13:14orsenthilsetmessages: + msg136930
2011-05-25 20:21:08hayposetnosy: + haypo
messages: + msg136894
2011-05-25 16:44:49orsenthilsetresolution: fixed
stage: patch review -> committed/rejected
messages: + msg136869
versions: - Python 2.7, Python 3.2
2011-05-25 16:31:08orsenthilsetmessages: + msg136867
2011-05-25 16:26:47python-devsetnosy: + python-dev
messages: + msg136866
2011-05-24 16:12:16orsenthilsetassignee: gregory.p.smith -> orsenthil

messages: + msg136760
nosy: + orsenthil
2011-05-20 18:42:26jwarkentinsetmessages: + msg136392
2011-03-29 20:15:27gregory.p.smithsetmessages: + msg132519
2011-03-28 22:43:26jwarkentinsetmessages: + msg132441
2011-02-06 06:53:38jwarkentinsetfiles: + loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128045
2011-02-06 04:16:43jwarkentinsetnosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128039
2011-02-06 04:14:39jwarkentinsetfiles: + loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
2011-02-06 04:14:05jwarkentinsetfiles: - loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
2011-02-06 02:18:44jwarkentinsetfiles: - loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
2011-02-06 02:18:27jwarkentinsetfiles: + loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128037
2011-02-06 01:14:47gregory.p.smithsetassignee: gregory.p.smith
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
2011-02-06 01:14:33gregory.p.smithsetnosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128036
title: socketserver.ForkingMixIn leaves zombies -> socketserver.ForkingMixIn leaves zombies, also fails to reap all zombies in one pass
2011-02-05 23:16:23jwarkentinsetfiles: + loop_actions.patch
nosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128033
2011-02-05 20:59:24gregory.p.smithsetnosy: gregory.p.smith, giampaolo.rodola, jwarkentin
messages: + msg128026
2011-02-05 18:53:46giampaolo.rodolasetnosy: + giampaolo.rodola
2011-02-05 08:17:43pitrousetnosy: + gregory.p.smith
stage: patch review

versions: + Python 2.7, Python 3.2, Python 3.3, - Python 3.1
2011-02-05 04:42:42jwarkentinsetmessages: + msg127972
2011-02-03 22:01:13jwarkentincreate