http.server and SimpleHTTPServer hang after a few requests #75820

Closed
mattpr mannequin opened this issue Sep 29, 2017 · 17 comments
Labels
OS-mac OS-windows type-bug An unexpected behavior, bug, or error

Comments

@mattpr
Mannequin

mattpr mannequin commented Sep 29, 2017

BPO 31639
Nosy @pfmoore, @ronaldoussoren, @tjguk, @ned-deily, @bitdancer, @vadmium, @zware, @zooba, @JulienPalard, @miss-islington
PRs
  • bpo-31639: Use threads in http.server #5018
  • [3.7] bpo-31639: Use threads in http.server module. (GH-5018) #6202
  • [3.6] bpo-31639: Use threads in http.server module. (GH-5018) #6203
  • bpo-31639: Change ThreadedHTTPServer to ThreadingHTTPServer class name #7195
  • [3.7] bpo-31639: Change ThreadedHTTPServer to ThreadingHTTPServer class name (GH-7195) #7219
Files
  • Archive.zip: domain{1,2,3}.html files for reproducing the issue

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2018-03-23.16:52:59.262>
    created_at = <Date 2017-09-29.16:19:04.586>
    labels = ['OS-mac', 'type-bug', 'OS-windows']
    title = 'http.server and SimpleHTTPServer hang after a few requests'
    updated_at = <Date 2019-09-11.16:25:49.639>
    user = 'https://bugs.python.org/mattpr'

    bugs.python.org fields:

    activity = <Date 2019-09-11.16:25:49.639>
    actor = 'benjamin.peterson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-03-23.16:52:59.262>
    closer = 'mdk'
    components = ['macOS', 'Windows']
    creation = <Date 2017-09-29.16:19:04.586>
    creator = 'mattpr'
    dependencies = []
    files = ['47177']
    hgrepos = []
    issue_num = 31639
    keywords = ['patch']
    message_count = 17.0
    messages = ['303335', '303352', '303437', '303439', '308518', '308521', '309076', '309082', '309083', '309094', '309105', '309110', '309120', '314314', '314325', '318082', '318141']
    nosy_count = 13.0
    nosy_names = ['paul.moore', 'ronaldoussoren', 'tim.golden', 'ned.deily', 'v+python', 'r.david.murray', 'martin.panter', 'zach.ware', 'steve.dower', 'mdk', 'mattpr', 'rogerwang', 'miss-islington']
    pr_nums = ['5018', '6202', '6203', '7195', '7219']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue31639'
    versions = ['Python 2.7', 'Python 3.6']

    @mattpr
    Mannequin Author

    mattpr mannequin commented Sep 29, 2017

    Doing a cross domain iframe test. domain1.html has iframe pointing at domain2.html which has iframe pointing at domain3.html.

    domain{1,2,3}.com are all configured to point at 127.0.0.1 in my /etc/hosts file.

    Loaded up http://domain1.com:8000/domain1.html in my browser and it spins waiting for domain3 to load. CTRL-C and then domain3 loads. CTRL-C again to quit.

    Google chrome: 61.0.3163.100 (Official Build) (64-bit)

    $ python --version
    Python 2.7.13
    $ uname -a
    Darwin [hostname-removed] 14.5.0 Darwin Kernel Version 14.5.0: Sun Jun  4 21:40:08 PDT 2017; root:xnu-2782.70.3~1/RELEASE_X86_64 x86_64
    $ brew info python
    ...
    /usr/local/Cellar/python/2.7.13 (3,571 files, 49MB) *
      Poured from bottle on 2017-01-30 at 16:56:40
    From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/python.rb
    
    $ python -m SimpleHTTPServer 8000
    Serving HTTP on 0.0.0.0 port 8000 ...
    127.0.0.1 - - [29/Sep/2017 17:14:22] "GET /domain1.html HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 17:14:22] "GET /style.css HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 17:14:23] "GET /domain2.html HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 17:14:23] "GET /style.css HTTP/1.1" 200 -
    ^C----------------------------------------
    Exception happened during processing of request from ('127.0.0.1', 64315)
    Traceback (most recent call last):
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
        self.process_request(request, client_address)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 318, in process_request
        self.finish_request(request, client_address)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 331, in finish_request
        self.RequestHandlerClass(request, client_address, self)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 652, in __init__
        self.handle()
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/BaseHTTPServer.py", line 340, in handle
        self.handle_one_request()
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/BaseHTTPServer.py", line 310, in handle_one_request
        self.raw_requestline = self.rfile.readline(65537)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 480, in readline
        data = self._sock.recv(self._rbufsize)
    KeyboardInterrupt
    ----------------------------------------
    127.0.0.1 - - [29/Sep/2017 17:14:26] "GET /domain3.html HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 17:14:26] "GET /style.css HTTP/1.1" 200 -
    ^CTraceback (most recent call last):
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SimpleHTTPServer.py", line 235, in <module>
        test()
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SimpleHTTPServer.py", line 231, in test
        BaseHTTPServer.test(HandlerClass, ServerClass)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/BaseHTTPServer.py", line 610, in test
        httpd.serve_forever()
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 231, in serve_forever
        poll_interval)
      File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 150, in _eintr_retry
        return func(*args)
    KeyboardInterrupt
    

    Same issue with python3

    $ python3 --version
    Python 3.6.0
    $ brew info python3
    ...
    /usr/local/Cellar/python3/3.6.0 (3,611 files, 55.9MB) *
      Poured from bottle on 2017-01-30 at 16:57:16
    From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/python3.rb
    

    Note that only one CTRL-C is needed to exit, but it didn't serve domain3.html, so it is stuck in the same place as before. With python2 it serves domain3.html after hitting the first CTRL-C. With python3 it never serves it; it just quits after the CTRL-C while the browser is still "spinning" waiting for the file.

    $ python3 -m http.server 8000
    Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
    127.0.0.1 - - [29/Sep/2017 18:04:38] "GET /domain1.html HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 18:04:38] "GET /style.css HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 18:04:39] "GET /domain2.html HTTP/1.1" 200 -
    127.0.0.1 - - [29/Sep/2017 18:04:39] "GET /style.css HTTP/1.1" 200 -
    ^C
    Keyboard interrupt received, exiting.
    

    I can reproduce this EVERY time, but I can't imagine everyone has this issue or it would have been fixed already, so tried to provide lots of info about my environment. Let me know if you need more.

    There seems to be a similar issue on windows: microsoft/WSL#1906

    @mattpr mattpr mannequin added OS-mac type-bug An unexpected behavior, bug, or error labels Sep 29, 2017
    @mattpr
    Mannequin Author

    mattpr mannequin commented Sep 29, 2017

    It has been pointed out to me that this issue may be related to chrome making multiple requests in parallel.

    A test with wget seems to support this.

    wget -E -H -k -K -p http://domain1.com:8000/domain1.html

    ...does not hang, whereas requests from chrome do hang.

    On some level I guess I wonder about the usefulness of simple web servers if they choke on very basic website requests from modern browsers.

    @vadmium
    Member

    vadmium commented Oct 1, 2017

    The change in handling KeyboardInterrupt was my intention in bpo-23430. I hope it isn’t a problem on its own :)

    Running the module with “python -m http.server” uses the HTTPServer class, based on socketserver.TCPServer. This only accepts one connection at a time, and waits for the SimpleHTTPRequestHandler class to finish handling each TCP connection before shutting it down and waiting for the next one. But SimpleHTTPRequestHandler supports persistent HTTP connections, meaning the TCP connection stays open after one HTTP request, ready for another. This prevents the simple TCPServer from accepting new connections.

    What is probably happening is the browser is trying to make one request (e.g. for domain3.html) on a new connection while it has an existing persistent connection already open. Having multiple host names pointing to the same server is not going to help; perhaps the browser does not realize that the two connections are to the same TCP server or that it could reuse the old connection.

    A simple workaround would be to use the socketserver.ThreadingMixIn or ForkingMixIn class. Each TCP connection will then be handled in a background thread. The mixed-in TCPServer will not wait for the handler, and will accept concurrent connections.
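A minimal sketch of the mix-in workaround described above. The names `ThreadedHTTPServer` and `make_server` are chosen here for illustration, and `daemon_threads = True` is an addition that the comment does not mention; treat it as one possible setup rather than the recommended one.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler
from socketserver import ThreadingMixIn


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """HTTPServer that hands each accepted TCP connection to its own thread."""

    # Assumption, not part of the comment above: daemon handler threads do not
    # keep the interpreter alive when the main thread exits.
    daemon_threads = True


def make_server(host="", port=8000):
    """Return a threaded server for the current directory (hypothetical helper)."""
    return ThreadedHTTPServer((host, port), SimpleHTTPRequestHandler)
```

Calling `make_server().serve_forever()` would then play the role of `python3 -m http.server 8000`, with the accept loop no longer waiting for each handler to finish.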

    If you want to avoid multiple threads, I think it is also possible to augment the server and handler classes to use “select” or similar so that the server will still handle each HTTP request one at a time, but can wait for requests on multiple TCP connections. But it requires subclassing and overriding some methods with custom code, and probably depends on deeper knowledge of how the classes work than is specified in the documentation.
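One possible shape of the single-threaded alternative, as a rough sketch only. `SelectHTTPServer`, `serve_until`, `serve_connection`, `HelloHandler` and `status_line_with_idle_connection` are hypothetical names; the sketch uses the `selectors` module rather than `select` directly, and it assumes that once a connection becomes readable the whole request arrives promptly, since the handler still reads with blocking calls.

```python
import selectors
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny handler used only for this demonstration."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


class SelectHTTPServer(HTTPServer):
    """Hypothetical single-threaded server that waits on many connections."""

    def serve_until(self, stop_event, poll_interval=0.1):
        waiting = selectors.DefaultSelector()
        waiting.register(self.socket, selectors.EVENT_READ)
        try:
            while not stop_event.is_set():
                for key, _events in waiting.select(poll_interval):
                    if key.fileobj is self.socket:
                        # New connection: remember it, but do not read yet.
                        request, client_address = self.get_request()
                        waiting.register(
                            request, selectors.EVENT_READ, client_address)
                    else:
                        # Bytes (or EOF) have arrived, so the handler's first
                        # read will not block on an idle connection.
                        waiting.unregister(key.fileobj)
                        self.serve_connection(key.fileobj, key.data)
        finally:
            # Close connections that never sent anything.
            for key in list(waiting.get_map().values()):
                if key.fileobj is not self.socket:
                    self.shutdown_request(key.fileobj)
            waiting.close()

    def serve_connection(self, request, client_address):
        try:
            self.process_request(request, client_address)
        except Exception:
            self.handle_error(request, client_address)
            self.shutdown_request(request)


def status_line_with_idle_connection(timeout=5.0):
    """Open an idle connection first, then request on a second one."""
    server = SelectHTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    stop = threading.Event()
    thread = threading.Thread(target=server.serve_until, args=(stop,))
    thread.start()
    idle = socket.create_connection(("127.0.0.1", port))
    busy = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    try:
        busy.sendall(b"GET / HTTP/1.0\r\n\r\n")
        with busy.makefile("rb") as reader:
            return reader.readline()
    finally:
        idle.close()
        busy.close()
        stop.set()
        thread.join()
        server.server_close()
```

Requests are still handled one at a time; the only change is that a connection is handed to the handler after it has something to read, so a silent connection no longer stalls the others.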

    For existing versions of Python, I don’t think there is much that could be done other than documenting the shortcomings of how a persistent HTTP connection vs multiple connections is handled.

    @vadmium
    Member

    vadmium commented Oct 1, 2017

    Actually take back a lot of what I wrote above. I forgot that SimpleHTTPRequestHandler only supports HTTP 1.0; I don’t think it uses keep-alive or persistent connections, so it should close its TCP connections promptly. There may be something else going on.

    Unfortunately I don’t have Chrome handy to experiment with. Perhaps it is holding a TCP connection open without making any request at all, and then trying to open a second connection. You would have to look at the TCP connections being created and shut down, and the HTTP requests being made, to verify.

    @vpython
    Mannequin

    vpython mannequin commented Dec 18, 2017

    Same behavior on Windows.

    @vpython vpython mannequin added the OS-windows label Dec 18, 2017
    @vpython
    Mannequin

    vpython mannequin commented Dec 18, 2017

    This probably has been around for a while: this 2011 thread in a Chromium wontfix bug is enlightening, but the solution suggested, a ThreadingMixIn for the HTTPServer, didn't help me.

    https://bugs.chromium.org/p/chromium/issues/detail?id=195550

    @JulienPalard
    Member

    I straced both Chromium and Python during the issue and saw this:

    Chromium opens a socket (port 55084) and sends "GET /domain1.html" on it.
    Python accepts it, reads "GET /domain1.html", and replies OK.
    Chromium closes the socket on port 55084.
    Chromium opens three sockets:

    • port 55086
    • port 55088
    • port 55090

    Python accepts the socket on port 55088 and reads from it (blocking).
    Chromium writes "GET /domain2.html" on the socket on port 55090.

    At this point we're stuck: three sockets are open, Python is reading on one of them, and Chromium is writing on another.
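The situation traced above can be imitated without a browser. The sketch below is not the original reproduction: it opens one connection that stays silent, then sends a request on a second connection and checks whether a response arrives within a timeout. `HelloHandler`, `ThreadedServer` and `second_connection_is_served` are hypothetical names, and the one-second timeout is an arbitrary choice.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny handler used only for this demonstration."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


class ThreadedServer(ThreadingMixIn, HTTPServer):
    daemon_threads = True


def second_connection_is_served(server_class, timeout=1.0):
    """Return True if a request gets answered while another connection idles."""
    server = server_class(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # First connection: opened but never written to, like a pre-opened socket.
    idle = socket.create_connection(("127.0.0.1", port))
    # Second connection: the one that actually carries the request.
    busy = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    try:
        busy.sendall(b"GET / HTTP/1.0\r\n\r\n")
        with busy.makefile("rb") as reader:
            try:
                status_line = reader.readline()
            except socket.timeout:
                status_line = b""
        return status_line.startswith(b"HTTP/1.0 200")
    finally:
        idle.close()       # lets a blocked single-threaded handler return
        server.shutdown()
        busy.close()
        server.server_close()
```

With the plain `HTTPServer` the handler for the silent connection blocks in its first read, so the request on the second connection waits until the timeout; with the threaded variant each connection gets its own handler thread and the request is answered.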

    @JulienPalard
    Member

    I wrote a two-line PR in which I propose to use threads to handle connections. This is well supported by socketserver via ThreadingMixIn, and it fixes the issue.

    @bitdancer
    Member

    I don't think the PR as it stands is a good idea. These classes are designed to be composable, so it should be up to the library user whether or not to use threads. However, it would be perfectly reasonable to choose to use threads in the 'test' function and thus the cli. That fact should then be documented, and chromium can even be mentioned as one of the motivations for using threads in the cli server.

    @vpython
    Mannequin

    vpython mannequin commented Dec 27, 2017

    I tried the approach in the PR "externally" (when starting the server using a test program), and I couldn't get it to work. But my test case was probably different: I was using Chrome rather than Chromium, and while both work for me for simple HTTP file access to localhost without threading, I had tried to set up a PWA with a service worker, which may do something different. I got the hang, applied the ThreadingMixIn, and it still hung. I don't know how to tell for sure whether it is the same sort of hang or something different in the Windows environment.

    @JulienPalard
    Member

    David, you're right. I updated my patch and also added some documentation and a news entry.

    Glenn, I can't tell if my PR fixes your issue without an strace or a test. It fixes the one I straced (pre-opened sockets leading to Python reading indefinitely on a socket which may never be used). If you could try my PR I would appreciate it.

    @vpython
    Mannequin

    vpython mannequin commented Dec 27, 2017

    I don't know how to look back at the previous version of the PR, I was barely able to find the "current version" each time. The following line is in the current version:

    daemon_threads = True

    I don't know whether it was in the previous version. I didn't notice it, but I may have overlooked it because of other changes in the same area, which are now gone. This line was not in the old suggestion that I had found and tried. When I added it, my test case started working. I have no idea what the line really does, but the HTTP server is a daemon, and we are adding threading, so it sounds appropriate.

    I do wonder if it should somehow be put in the definition of ThreadedHTTPServer instead of "pass". And the old solution I had found had called the HTTPServer.__init__ which yours does not, which was surprising, but I'll not argue with success.
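As a hedged illustration of what the `daemon_threads` line controls: `socketserver.ThreadingMixIn` copies that attribute onto each handler thread it starts, and daemon threads do not keep the interpreter alive at exit. The sketch below only observes the flag from inside a handler; it does not attempt to explain why the setting changed the behaviour in the test described above. `handler_thread_is_daemon` and the nested class names are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


def handler_thread_is_daemon(daemon_threads):
    """Serve one request and report whether its handler thread was a daemon."""
    seen = []

    class RecordingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Record the daemon flag of the thread running this handler.
            seen.append(threading.current_thread().daemon)
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the demonstration quiet

    class Server(ThreadingMixIn, HTTPServer):
        pass

    Server.daemon_threads = daemon_threads

    server = Server(("127.0.0.1", 0), RecordingHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return seen[0]
```

Setting the attribute in the class body instead of `pass`, as the comment suggests, has the same effect as assigning it afterwards as done here.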

    @JulienPalard
    Member

    Glenn, you're right; I modified my PR.

    @JulienPalard
    Member

    New changeset 8bcfa02 by Julien Palard in branch 'master':
    bpo-31639: Use threads in http.server module. (GH-5018)
    8bcfa02

    @miss-islington
    Contributor

    New changeset f8d2c3c by Miss Islington (bot) in branch '3.7':
    bpo-31639: Use threads in http.server module. (GH-5018)
    f8d2c3c

    @JulienPalard
    Member

    New changeset 1cee216 by Julien Palard (Géry Ogam) in branch 'master':
    bpo-31639: Change ThreadedHTTPServer to ThreadingHTTPServer class name (GH-7195)
    1cee216

    @JulienPalard
    Member

    New changeset 4f53e2a by Julien Palard in branch '3.7':
    [3.7] bpo-31639: Change ThreadedHTTPServer to ThreadingHTTPServer class name (GH-7195) (GH-7219)
    4f53e2a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022