classification
Title: Python's HTTP server implementations hangs after 16.343 requests on MacOSX
Type: behavior Stage: resolved
Components: macOS Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: third party
Dependencies: Superseder:
Assigned To: ronaldoussoren Nosy List: chobeiry, ned.deily, neologix, pitrou, ronaldoussoren
Priority: normal Keywords:

Created on 2014-03-23 11:14 by chobeiry, last changed 2014-07-24 19:21 by neologix. This issue is now closed.

Messages (19)
msg214562 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-23 11:14
While deriving from Python's HTTP server implementation I think I found a bug (on MacOSX) which can be reproduced like this:

(1) Start the HTTP server:
python -m SimpleHTTPServer (or python -m http.server)

(2) Benchmark the HTTP server:
ab -n 20000 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
apr_socket_recv: Operation timed out (60)
Total of 16343 requests completed

This does not happen on Ubuntu.

What is the reason for timing out after 16.343 requests?
msg214569 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-03-23 12:26
Intuitively: no TCP ports available left?
Have you tried connecting from another client when the hang happens?
Have you tried tracing what happens in the server? (is the incoming connection accepted?)
msg214588 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-23 15:19
TCP ports are available - did the following while running "ab" against "python3.4 -m http.server":

 u1@~$ while true; do netstat -an|grep tcp|wc; sleep 1; done
      50     300    3950
      50     300    3950
      50     300    3950
      50     300    3950
      50     300    3950
      50     300    3950
      49     294    3871
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029
      50     300    3950
      50     300    3950
      51     306    4029
      51     306    4029
      51     306    4029
      51     306    4029

51 TCP sockets in use.

Connecting from another client then blocks for some time (I hit enter multiple times) before showing "connected to localhost", then the request is answered:

 u1@~$ telnet 127.0.0.1 8000
 Trying 127.0.0.1...
 GET /




 Connected to localhost.
 Escape character is '^]'.
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
 <html>
 <head>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 <title>Directory listing for /</title>
 </head>
 ...

How do I trace in MacOSX? No truss or strace available...
msg214590 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-03-23 15:28
> How do I trace in MacOSX? No truss or strace available...

Add print() statements to the Python stdlib code :-)
msg214605 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-23 16:51
I thought you were kidding concerning "print()" -- thus I started using "python -m trace -t httpd.py" >trace 2>&1. The hanging does not occur when tracing (however, things slow down immensely naturally)!

When breaking the httpd.py while it is not serving anymore, the stack trace looks always like this:

^CTraceback (most recent call last):
  File "httpd.py", line 3, in <module>
    s.serve_forever()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socketserver.py", line 237, in serve_forever
    poll_interval)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/socketserver.py", line 155, in _eintr_retry
    return func(*args)
KeyboardInterrupt

I will try to add some "print()" before the "return func(*args)"...
msg214608 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-23 16:58
I think there is no need for print() -- it is hanging <<HERE>>:

 def _eintr_retry(func, *args):
     """restart a system call interrupted by EINTR"""
     while True:
         try:
             return func(*args) <<HERE>>
         except OSError as e:
             if e.errno != errno.EINTR:
                 raise

This gets called <<HERE>>:

    def serve_forever(self, poll_interval=0.5):
        """Handle one request at a time until shutdown.

        Polls for shutdown every poll_interval seconds. Ignores
        self.timeout. If you need to do periodic tasks, do them in
        another thread.
        """
        self.__is_shut_down.clear()
        try:
            while not self.__shutdown_request:
                # XXX: Consider using another file descriptor or
                # connecting to the socket to wake this up instead of
                # polling. Polling reduces our responsiveness to a
                # shutdown request and wastes cpu at all other times.
                r, w, e = _eintr_retry(select.select, [self], [], [],
                                       poll_interval) <<HERE>>
                if self in r:
                    self._handle_request_noblock()

                self.service_actions()
        finally:
            self.__shutdown_request = False
            self.__is_shut_down.set()

So, the select.select is blocking or it does not find anything to "select" on... Did I conclude that correctly?
msg214610 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-03-23 17:02
If it blocks in the select() call, then it's probably an OS issue rather than a Python issue.
msg214622 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-23 17:58
Submitted a bug at bugreport.apple.com: #16401766
msg214630 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-03-23 20:03
You could use tcpdump to see what's going on (does the server reply to SYN?).

Note that it might very well be either a firewall setting, or a DoS protection (some sort of backoff when there are too many SYN within a short interval).
msg214632 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-03-23 20:19
By the way, could you test with the patch available in issue #21040 ?
It'll use poll() instead of select(): I don't think it'll fix your problem,
but it'll be a nice test anyway :-)
msg214689 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2014-03-24 15:12
You could trace using dtruss (as root), which behaves similarly to strace.

What happens when you first set "http_proxy" in the environment:

$ env http_proxy= python -m SimpleHTTPServer

If that doesn't cause problems the hang is caused by the _scproxy extension, which urllib uses to get the system proxy settings on OSX.
msg214910 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-26 20:25
@neologix: I tried to patch socketserver.py from versions 2.7, 3.3 and 3.4 -- it fails all the time, slightly differing error messages but similar to this:

 u1@~/Projects$ patch <../Downloads/socketserver_use_selectors-1.diff 
 can't find file to patch at input line 4
 Perhaps you should have used the -p or --strip option?
 The text leading up to this was:
 --------------------------
 |diff -r 3d0eacb12b5c Doc/library/socketserver.rst
 |--- a/Doc/library/socketserver.rst	Sun Mar 23 11:57:01 2014 +0000
 |+++ b/Doc/library/socketserver.rst	Sun Mar 23 20:23:41 2014 +0000
 --------------------------
 File to patch: 
 Skip this patch? [y] 
 Skipping patch.
 2 out of 2 hunks ignored
 patching file socketserver.py
 Hunk #3 FAILED at 129.
 Hunk #4 FAILED at 150.
 Hunk #5 succeeded at 169 (offset 1 line).
 Hunk #6 FAILED at 230.
 Hunk #7 succeeded at 254 with fuzz 2 (offset -10 lines).
 Hunk #8 succeeded at 277 (offset -10 lines).
 Hunk #9 succeeded at 381 (offset -10 lines).
 Hunk #10 succeeded at 463 (offset -10 lines).
 3 out of 10 hunks FAILED -- saving rejects to file socketserver.py.rej 

I made a local copy of socketserver.py into my homedir to be able to modify it...
msg214911 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-26 20:28
@ronaldoussoren: I tried the "env http_proxy= python ..." but the result is the same.

 u1@~/Projects$ ab -n 20000 http://127.0.0.1:8000/
 This is ApacheBench, Version 2.3 <$Revision: 655654 $>
 Copyright 1996 Adam Twiss, Zeus Technology Ltd,  http://www.zeustech.net/
 Licensed to The Apache Software Foundation, http://www.apache.org/

 Benchmarking 127.0.0.1 (be patient)
 Completed 2000 requests
 Completed 4000 requests
 Completed 6000 requests
 Completed 8000 requests
 Completed 10000 requests
 Completed 12000 requests
 Completed 14000 requests
 Completed 16000 requests
 apr_socket_recv: Operation timed out (60)
 Total of 16335 requests completed
msg214912 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-26 20:31
Got feedback from Apple :-(

 Engineering has determined that this is an issue for a third party to resolve based on the following:

 This happens in stock python, it's a python issue.

 We are now closing this bug report.

 If you have questions regarding the resolution of this issue, please update your bug report with that information.

 Please be sure to regularly check new Apple releases for any updates that might affect this issue.
msg214916 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2014-03-26 21:53
I believe Ronald's suggestion to disable calls to _scproxy doesn't quite work.  Please try this instead:

env no_proxy='*' python -m SimpleHTTPServer
msg215066 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-28 18:34
@ned.daily: I did as requested, similar outcome.

In terminal session #1:

 u1@~/Projects$ env no_proxy='*' python -m SimpleHTTPServer
 Serving HTTP on 0.0.0.0 port 8000 ...
 127.0.0.1 - - [28/Mar/2014 19:30:59] "GET / HTTP/1.0" 200 -
 127.0.0.1 - - [28/Mar/2014 19:30:59] "GET / HTTP/1.0" 200 -
 127.0.0.1 - - [28/Mar/2014 19:30:59] "GET / HTTP/1.0" 200 -
 ...

In terminal session #2:

 u1@~/Projects$ ab -n 20000 http://127.0.0.1:8000/
 This is ApacheBench, Version 2.3 <$Revision: 655654 $>
 Copyright 1996 Adam Twiss, Zeus Technology Ltd,  http://www.zeustech.net/
 Licensed to The Apache Software Foundation, http://www.apache.org/

 Benchmarking 127.0.0.1 (be patient)
 Completed 2000 requests
 Completed 4000 requests
 Completed 6000 requests
 Completed 8000 requests
 Completed 10000 requests
 Completed 12000 requests
 Completed 14000 requests
 Completed 16000 requests
 apr_socket_recv: Operation timed out (60)
 Total of 16321 requests completed

I am going do try a dtruss/ dtrace now...
msg215070 - (view) Author: Parto Chobeiry (chobeiry) Date: 2014-03-28 19:11
dtruss yields this when the pause at 16.300+ requests occurs:

 67677/0x23c1df:  12960942       3      1  stat64("/Users/u1/SparkleShare\0", 0x7FFF5C5ED540, 0x103713A60)		 = 0 0
 67677/0x23c1df:  12960946       3      1 lstat64("/Users/u1/SparkleShare\0", 0x7FFF5C5ED540, 0x103713A60)		 = 0 0
 67677/0x23c1df:  12960969       6      3 write_nocancel(0x2, "127.0.0.1 - - [28/Mar/2014 20:05:04] \"GET / HTTP/1.0\" 200 -\n\0", 0x3C)		 = 60 0
 67677/0x23c1df:  12960990      12      9 sendto(0x6, 0x103B61EC4, 0x11)		 = 17 0
 67677/0x23c1df:  12961005       7      5 sendto(0x6, 0x103B533C4, 0x25)		 = 37 0
 67677/0x23c1df:  12961025       6      4 sendto(0x6, 0x103B533C4, 0x25)		 = 37 0
 67677/0x23c1df:  12961034       4      2 sendto(0x6, 0x103B533C4, 0x28)		 = 40 0
 67677/0x23c1df:  12961047       8      6 sendto(0x6, 0x103B60554, 0x16)		 = 22 0
 67677/0x23c1df:  12961057       6      4 sendto(0x6, 0x1038D42FC, 0x2)		 = 2 0
 67677/0x23c1df:  12961072       7      5 sendto(0x6, 0x7F8902801824, 0x688)		 = 1672 0
 67677/0x23c1df:  12961092       6      3 shutdown(0x6, 0x1, 0x10364FE30)		 = 0 0
 67677/0x23c1df:  12961099       5      3 close(0x6)		 = 0 0
 dtrace: 55723 dynamic variable drops with non-empty dirty list
 67677/0x23c1df:  13631099  501078     11 select(0x4, 0x7FFF5C5E2900, 0x7FFF5C5E2880, 0x7FFF5C5E2800, 0x7FFF5C5E27F0)		 = 0 0
 67677/0x23c1df:  13631152  500202     17 select(0x4, 0x7FFF5C5E2900, 0x7FFF5C5E2880, 0x7FFF5C5E2800, 0x7FFF5C5E27F0)		 = 0 0
 67677/0x23c1df:  13631239  501182     42 select(0x4, 0x7FFF5C5E2900, 0x7FFF5C5E2880, 0x7FFF5C5E2800, 0x7FFF5C5E27F0)		 = 0 0
 67677/0x23c1df:  13631312  501123     29 select(0x4, 0x7FFF5C5E2900, 0x7FFF5C5E2880, 0x7FFF5C5E2800, 0x7FFF5C5E27F0)		 = 0 0
 67677/0x23c1df:  13631453  501291     56 select(0x4, 0x7FFF5C5E2900, 0x7FFF5C5E2880, 0x7FFF5C5E2800, 0x7FFF5C5E27F0)		 = 0 0

However, when dtrussing the python process, it pauses at 16.300+ requests and then *continues*!!

 u1@~/Projects$ ab -n 20000 http://127.0.0.1:8000/

 This is ApacheBench, Version 2.3 <$Revision: 655654 $>
 Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
 Licensed to The Apache Software Foundation, http://www.apache.org/

 Benchmarking 127.0.0.1 (be patient)
 Completed 2000 requests
 Completed 4000 requests
 Completed 6000 requests
 Completed 8000 requests
 Completed 10000 requests
 Completed 12000 requests
 Completed 14000 requests
 Completed 16000 requests
 Completed 18000 requests
 Completed 20000 requests
 Finished 20000 requests


 Server Software:        SimpleHTTP/0.6
 Server Hostname:        127.0.0.1
 Server Port:            8000

 Document Path:          /
 Document Length:        1672 bytes

 Concurrency Level:      1
 Time taken for tests:   42.930 seconds
 Complete requests:      20000
 Failed requests:        0
 Write errors:           0
 Total transferred:      36540000 bytes
 HTML transferred:       33440000 bytes
 Requests per second:    465.87 [#/sec] (mean)
 Time per request:       2.147 [ms] (mean)
 Time per request:       2.147 [ms] (mean, across all concurrent requests)
 Transfer rate:          831.20 [Kbytes/sec] received

 Connection Times (ms)
               min  mean[+/-sd] median   max
 Connect:        0    1 139.1      0   19666
 Processing:     1    1   4.2      1     538
 Waiting:        1    1   1.6      1     154
 Total:          1    2 139.1      1   19668

 Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      1
  95%      2
  98%      2
  99%      2
 100%  19668 (longest request)
 u1@~/Projects$
msg215831 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2014-04-09 18:14
This could be related to a problem described on this blog: <http://savanne.be/1035-benchmarking-on-osx-http-timeouts/>

As such this appears to be a platform issue and not a python one.
msg223875 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-07-24 19:21
I'm closing, since it's amlost certainly an OS-X bug (similar to time-wit socket exhaustion & Co).
History
Date User Action Args
2014-07-24 19:21:02neologixsetstatus: open -> closed
resolution: third party
messages: + msg223875

stage: resolved
2014-04-09 18:14:45ronaldoussorensetmessages: + msg215831
2014-03-28 19:11:23chobeirysetmessages: + msg215070
2014-03-28 18:34:04chobeirysetmessages: + msg215066
2014-03-26 21:53:20ned.deilysetmessages: + msg214916
2014-03-26 20:31:16chobeirysetmessages: + msg214912
2014-03-26 20:28:06chobeirysetmessages: + msg214911
2014-03-26 20:25:07chobeirysetmessages: + msg214910
2014-03-24 15:12:08ronaldoussorensetmessages: + msg214689
2014-03-23 20:19:24neologixsetmessages: + msg214632
2014-03-23 20:03:25neologixsetnosy: + neologix
messages: + msg214630
2014-03-23 17:58:23chobeirysetmessages: + msg214622
2014-03-23 17:02:47pitrousetmessages: + msg214610
2014-03-23 16:58:12chobeirysetmessages: + msg214608
2014-03-23 16:51:12chobeirysetmessages: + msg214605
2014-03-23 15:28:58pitrousetmessages: + msg214590
2014-03-23 15:19:47chobeirysetmessages: + msg214588
2014-03-23 12:26:33pitrousetnosy: + pitrou, ned.deily
messages: + msg214569
2014-03-23 12:22:22r.david.murraysetassignee: ronaldoussoren

components: + macOS
nosy: + ronaldoussoren
2014-03-23 11:14:12chobeirycreate