Change in http.server default IP behavior? #83392

Closed
ShaneSmith mannequin opened this issue Jan 4, 2020 · 11 comments
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@ShaneSmith
Mannequin

ShaneSmith mannequin commented Jan 4, 2020

BPO 39211
Nosy @jaraco, @tirkarthi
Superseder
  • bpo-38907: http.server (command) fails to bind dual-stack on Windows
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2020-01-06.02:40:16.371>
    created_at = <Date 2020-01-04.18:26:34.444>
    labels = ['3.8', 'type-bug', 'library']
    title = 'Change in http.server default IP behavior?'
    updated_at = <Date 2020-01-06.02:40:16.370>
    user = 'https://bugs.python.org/ShaneSmith'

    bugs.python.org fields:

    activity = <Date 2020-01-06.02:40:16.370>
    actor = 'Shane Smith'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-01-06.02:40:16.371>
    closer = 'Shane Smith'
    components = ['Library (Lib)']
    creation = <Date 2020-01-04.18:26:34.444>
    creator = 'Shane Smith'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 39211
    keywords = ['3.8regression']
    message_count = 11.0
    messages = ['359299', '359303', '359308', '359349', '359355', '359374', '359376', '359377', '359379', '359393', '359396']
    nosy_count = 3.0
    nosy_names = ['jaraco', 'Shane Smith', 'xtreak']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '38907'
    type = 'behavior'
    url = 'https://bugs.python.org/issue39211'
    versions = ['Python 3.8']

    @ShaneSmith
    Mannequin Author

    ShaneSmith mannequin commented Jan 4, 2020

    It seems to me that the direct invocation behavior for http.server changed, probably with Python 3.8 (I'm currently using 3.8.1 on Windows 10). On 3.7.X I was able to use it as described in the docs (https://docs.python.org/3/library/http.server.html)

    python -m http.server 8000

    and it would default to whatever IP address was available. Now, in order for it to function at all (not return "This site can’t be reached" in Chrome), I have to bind it to a specific IP address (say, 127.0.0.1, sticking with the docs example).

    python -m http.server 8000 --bind 127.0.0.1

    At which point it works fine. So it's still quite usable for this purpose, though I was surprised and, simple as the solution is, it is less simple when you don't know it!

    Was this an intended change? Something something security, perhaps? If so, should it be noted in the "What's new" of the docs? And of course, there's always the slight possibility that some aspect of Windows or Chrome behavior changed, but based on the terminal's response I don't think that's the case.

    Thanks,

    @ShaneSmith ShaneSmith mannequin added 3.8 only security fixes type-bug An unexpected behavior, bug, or error labels Jan 4, 2020
    @tirkarthi
    Member

    Can you please paste the log output that http.server prints at startup (the address and port) for each of the commands? From the history, I can see that the binding process was changed to include IPv6 by default: #11767

    @ShaneSmith
    Mannequin Author

    ShaneSmith mannequin commented Jan 4, 2020

    For the basic invocation:

    python -m http.server 8080
    Serving HTTP on :: port 8080 (http://[::]:8080/) ...

    It just sits there, because I can't access it (http://[::]:8080/ is not a valid address, so far as I know, and inserting my IP address doesn't find it either). If I bind it to an IP address, it works as expected (using 127.0.0.1 from the docs, for the sake of consistency). For the following messages, I'm starting up the server in my user directory, browsing to http://127.0.0.1:8080/ in Chrome, and following the Documents link.

    python -m http.server 8080 --bind 127.0.0.1
    Serving HTTP on 127.0.0.1 port 8080 (http://127.0.0.1:8080/) ...
    127.0.0.1 - - [04/Jan/2020 15:15:18] "GET / HTTP/1.1" 200 -
    127.0.0.1 - - [04/Jan/2020 15:15:18] code 404, message File not found
    127.0.0.1 - - [04/Jan/2020 15:15:18] "GET /favicon.ico HTTP/1.1" 404 -
    127.0.0.1 - - [04/Jan/2020 15:15:28] "GET /Documents/ HTTP/1.1" 200 -

    @ShaneSmith
    Mannequin Author

    ShaneSmith mannequin commented Jan 5, 2020

    A small update:

    Using the direct invocation:

    python -m http.server 8080
    Serving HTTP on :: port 8080 (http://[::]:8080/) ...

    Is NOT accessible at the following addresses:
    http://[::]:8080/ # most surprising, because this is where it tells you to go
    http://<my_ip_address>:8080/ # this was the Python <= 3.7 behavior, as I used it anyhow

    But it IS accessible at the following addresses:
    http://[::1]:8080/
    http://localhost:8080/

    There may be others I don't know about. I recognize that my difficulties likely arise from a lack of familiarity with internet protocols, as this isn't something I use with any kind of regularity. But I do think it's possible (and desirable) for the method to be as casual-friendly as it was in Python 3.7.

    Specifically, the direct invocation tells the user they can go to http://[::]:8080/, which they cannot. They CAN go to http://[::1]:8080/. Should this instead be the message returned on direct invocation?

    So far as I can tell, this is still a behavior change, as the old behavior was accessible from your IP address and therefore visible to other computers on the network (I assume localhost is not). But it would at least do what it says on the tin.
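The reachability check described above can be reproduced without a browser. The following is a minimal sketch using only the standard `socket` module; the helper names `can_connect` and `probe` are hypothetical and chosen for illustration, and the default host list is just the set of addresses mentioned in this thread.

```python
import socket

def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def probe(port, hosts=("127.0.0.1", "::1", "localhost")):
    """Map each candidate host to whether a server on `port` answers there."""
    return {host: can_connect(host, port) for host in hosts}
```

With a server started by `python -m http.server 8080` in another terminal, calling `probe(8080)` shows which of the candidate addresses actually accept connections on that machine.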

    @SilentGhost
    Mannequin

    SilentGhost mannequin commented Jan 5, 2020

    It's the addition of flags=socket.AI_PASSIVE on Lib/http/server.py:1233 that's causing this. I think this also breaks for IPv4 sockets.

    @SilentGhost SilentGhost mannequin added stdlib Python modules in the Lib dir labels Jan 5, 2020
    @jaraco
    Member

    jaraco commented Jan 5, 2020

    First, a quick primer in IP:

    • IPv6 addresses are written as XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX, but any single run of all-zero groups can be written as ::, so `::` is all zeros and `::1` is the same as 0000:0000:0000:0000:0000:0000:0000:0001.
    • ::1 is the local host (the same machine as where the code is running), equivalent to 127.0.0.1 in IPv4.
    • To listen on all interfaces, the socket library expects the system to bind to 0.0.0.0 (IPv4) or :: (IPv6).
    • When specified in a URL, an IPv6 address must be wrapped in [] to distinguish the `:` characters from the port separator. For example, http://[::1]:8000/ specifies connect to localhost over IPv6 on port 8000.
    • If the system supports dual-stack IPv4 over IPv6, all IPv4 addresses are mapped to a specific IPv6 subnet, so binding/listening on IPv6 often allows a client to connect to IPv4.
    • Even if the server is listening on all interfaces (0.0.0.0/::), the client must specify an internet address that will reach that address.

    As a result of this last point, it's not possible for a server like http.server to reliably know what address a client would be able to use to connect to the server. That is, if the server is bound on all interfaces, a local client could connect over localhost/127.0.0.1/::1 (assuming that interface exists, which it sometimes doesn't) or to another address assigned to the host, e.g. 2601:547:501:6ba:d1e6:300d:7e83:6b6f. A client on another host, however, would not be able to use localhost to connect to the server. It _must_ use an address that is assigned to the server's host, bound by the server, and routable to/from the client (i.e. not blocked by a firewall).
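The notation in the primer can be checked with Python's standard `ipaddress` and `urllib.parse` modules. This is a minimal sketch for illustration only; the variable names are arbitrary and nothing here is taken from http.server itself.

```python
import ipaddress
from urllib.parse import urlsplit

# "::" is the all-zeros (unspecified) address; "::1" is the IPv6 loopback.
wildcard = ipaddress.ip_address("::")
loopback = ipaddress.ip_address("::1")
print(wildcard.exploded)  # 0000:0000:0000:0000:0000:0000:0000:0000
print(loopback.exploded)  # 0000:0000:0000:0000:0000:0000:0000:0001
print(wildcard.is_unspecified, loopback.is_loopback)  # True True

# In a URL the IPv6 address is bracketed so its colons are not confused
# with the port separator; urlsplit strips the brackets again.
parts = urlsplit("http://[::1]:8000/")
print(parts.hostname, parts.port)  # ::1 8000

# An IPv4 address embedded in the IPv4-mapped IPv6 range.
mapped = ipaddress.ip_address("::ffff:127.0.0.1")
print(mapped.ipv4_mapped)  # 127.0.0.1
```

The `::ffff:127.0.0.1` form is how an IPv4 client typically appears in the log of a server listening on a dual-stack IPv6 socket.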

    Prior to Python 3.8, the default behavior was to bind to all interfaces on IPv4 only, which was unnecessarily limiting, but was subject to the same unexpected behavior:

    draft $ python3.7 -m http.server
    Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...

    The URL there, http://0.0.0.0:8000/, has the same failure mode as the one described above. One cannot browse to that address, but must replace 0.0.0.0 with localhost or 127.0.0.1 (to connect from localhost) or with a routable address (to connect from another host). The only difference is that with Python 3.8, IPv6 is now honored.

    Note if one passes localhost or 127.0.0.1 or ::1 as the bind parameter, the URL indicated would work:

    draft $ python -m http.server --bind localhost
    Serving HTTP on ::1 port 8000 (http://[::1]:8000/) ...

    Since it's not possible in general to supply the URL a client would need to connect to the server, it's difficult to reliably provide a useful URL.

    Some web servers do apply a heuristic that translates "all addresses" to a "localhost" address, and Python stdlib could implement that heuristic.
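One way such a heuristic could look is sketched below. The function name `browsable_url` is hypothetical and this is not code from the standard library or from any particular web server; it only illustrates mapping a wildcard bind address to the loopback address of the same family, and it assumes the client runs on the same machine as the server.

```python
import ipaddress

def browsable_url(host, port):
    """Hypothetical helper: turn a bound (host, port) into a URL a local browser can open."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a hostname); use it unchanged.
        return f"http://{host}:{port}/"
    if addr.is_unspecified:
        # Wildcard bind: substitute the loopback address of the same family.
        addr = ipaddress.ip_address("::1" if addr.version == 6 else "127.0.0.1")
    # IPv6 literals must be bracketed inside a URL.
    netloc = f"[{addr}]" if addr.version == 6 else str(addr)
    return f"http://{netloc}:{port}/"

print(browsable_url("::", 8000))       # http://[::1]:8000/
print(browsable_url("0.0.0.0", 8000))  # http://127.0.0.1:8000/
```

The result is only meaningful for a local client; a client on another host still needs a routable address, which the server cannot determine on its own.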

    > On 3.7.X I was able to use it as described in the docs and it would default to whatever IP address was available.

    That behavior should be the same, except that it should now bind to both IPv6 and IPv4. If you previously ran without any parameters, it would bind to all interfaces on IPv4. Now it binds on all interfaces on IPv6, which should be backward compatible in dual-stack environments like Windows. You just have to translate [::] to localhost instead of translating 0.0.0.0 to localhost.

    When I tested your findings on macOS, everything worked as I expected. I launched the server with python -m http.server, and the site could be reached on http://localhost:8000/ and http://127.0.0.1:8000 and http://[::1]:8000/. Nevertheless, when I tried the same thing on my Windows machine, I got a different outcome. The server bound to [::0] but was unreachable on http://127.0.0.1:8000.

    That was unexpected, and I'll try to ascertain why the dual-stack behavior isn't working as I'd expect.

    @jaraco
    Member

    jaraco commented Jan 5, 2020

    > It's the addition of flags=socket.AI_PASSIVE on Lib/http/server.py:1233 that's causing this.

    Can you elaborate? What is it causing?

    I can see that flag was added in 62dbe55 for the purpose of:

    > indicate to get the wildcard address (all interfaces).

    I don't recall beyond that why I went that route.
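For readers unfamiliar with the flag, the effect of `AI_PASSIVE` can be observed directly with `socket.getaddrinfo`. The sketch below is illustrative only; the helper name `wildcard_candidates` is hypothetical, and the order and number of results depend on the platform, so no particular output is assumed.

```python
import socket

def wildcard_candidates(port):
    """Return (family, host) pairs that getaddrinfo suggests for a wildcard bind."""
    infos = socket.getaddrinfo(
        None,                     # no host given ...
        port,
        type=socket.SOCK_STREAM,
        flags=socket.AI_PASSIVE,  # ... plus AI_PASSIVE asks for the wildcard address
    )
    return [(family, sockaddr[0]) for family, _type, _proto, _canon, sockaddr in infos]

for family, host in wildcard_candidates(8000):
    print(family, host)
```

Each returned host is a wildcard address (`0.0.0.0` for IPv4, `::` for IPv6). A server that simply takes the first entry will therefore bind to whichever family the platform lists first.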

    I can see in cherrypy/cherrypy#871, CherryPy had to add this code to support dual-stack operation. I suspect that's also what Python needs here (in addition to a test that binding on :: responds on 127.0.0.1).

    @jaraco
    Member

    jaraco commented Jan 5, 2020

    Indeed, if I apply this patch:

    diff --git a/Lib/http/server.py b/Lib/http/server.py
    index 47a4fcf9a6..de995ae4b9 100644
    --- a/Lib/http/server.py
    +++ b/Lib/http/server.py
    @@ -1246,6 +1246,11 @@ def test(HandlerClass=BaseHTTPRequestHandler,
         """
         ServerClass.address_family, addr = _get_best_family(bind, port)
     
    +    def server_bind(self, orig=ServerClass.server_bind):
    +        self.socket.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    +        return orig(self)
    +    ServerClass.server_bind = server_bind
    +
         HandlerClass.protocol_version = protocol
         with ServerClass(addr, HandlerClass) as httpd:
             host, port = httpd.socket.getsockname()[:2]
    

    And then run `python -m http.server`, it binds to `::` but responds on `127.0.0.1` on Windows:

    ~ # python -m http.server
    Serving HTTP on :: port 8000 (http://[::]:8000/) ...
    ::ffff:127.0.0.1 - - [05/Jan/2020 14:48:09] "GET / HTTP/1.1" 200 -

    I think the solution is to add a patch similar to that until Python has a socketserver that supports dual-stack binding. See related issues bpo-25667, bpo-20215, bpo-36208, bpo-17561, and bpo-38907.
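As a standalone illustration of the same idea as the patch above, the socket option can also be set in a subclass rather than by monkeypatching `ServerClass`. This is a sketch under stated assumptions, not necessarily the change that was eventually merged: the class name `DualStackServer` and the `serve` function are hypothetical, and the code assumes the platform supports IPv6.

```python
import contextlib
import socket
from http.server import HTTPServer, SimpleHTTPRequestHandler

class DualStackServer(HTTPServer):
    """Hypothetical subclass: listen on IPv6 and accept IPv4 clients too."""
    address_family = socket.AF_INET6

    def server_bind(self):
        # Clear IPV6_V6ONLY so the IPv6 socket also accepts IPv4 connections.
        # Errors are suppressed for platforms that lack or reject the option.
        with contextlib.suppress(OSError, AttributeError):
            self.socket.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        return super().server_bind()

def serve(port=8000):
    """Serve the current directory on all interfaces until interrupted."""
    with DualStackServer(("::", port), SimpleHTTPRequestHandler) as httpd:
        httpd.serve_forever()
```

Calling `serve(8000)` and then browsing to http://127.0.0.1:8000/ is a quick way to check whether IPv4 clients are accepted on a given system.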

    In fact, since bpo-38907 captures more concretely what I believe is the main issue here, I'm going to use that issue to address the concern. If Windows is able to bind dual-stack to IPv6/IPv4, I believe that would address the compatibility concern raised herein.

    I'm going to mark this as a duplicate, but if you believe there is another issue at play here, please don't hesitate to re-open or comment and I can.

    @ShaneSmith
    Mannequin Author

    ShaneSmith mannequin commented Jan 5, 2020

    Jason, thank you for the primer.

    > Nevertheless, when I tried the same thing on my Windows machine, I got a different outcome. The server bound to [::0] but was unreachable on http://127.0.0.1:8000.

    Perhaps it's an issue with IPv4 addresses in general for Python 3.8 on Windows when they are not directly bound during invocation of the server. I used to be able to reach the server with http://<my_ipv4_address>:8080/ (this was my initial surprise), but now this behavior doesn't work for me. However, on further testing http://<my_ipv6_address>:8080/ DOES work.

    @jaraco
    Member

    jaraco commented Jan 6, 2020

    Other than addressing bpo-38907, is there anything else to be done here? In #62051, I've proposed a surgical fix to address the issue with IPv4 being unbound on Windows.

    @ShaneSmith
    Mannequin Author

    ShaneSmith mannequin commented Jan 6, 2020

    Based on my understanding, your fix should do it.

    @ShaneSmith ShaneSmith mannequin closed this as completed Jan 6, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022