Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Directory traversal with http.server and SimpleHTTPServer on windows #70844

Closed
Thomas mannequin opened this issue Mar 28, 2016 · 16 comments
Closed

Directory traversal with http.server and SimpleHTTPServer on windows #70844

Thomas mannequin opened this issue Mar 28, 2016 · 16 comments
Labels
OS-windows stdlib Python modules in the Lib dir type-security A security issue

Comments

@Thomas
Copy link
Mannequin

Thomas mannequin commented Mar 28, 2016

BPO 26657
Nosy @pfmoore, @larryhastings, @tjguk, @ned-deily, @vadmium, @zware, @zooba, @zhangyangyu
PRs
  • [security][3.4] bpo-26657: Fix Windows directory traversal vulnerability with http.server #782
  • [Security][3.4] bpo-26657: Fix Windows directory traversal vulnerability with http.server #226
  • [3.3][security] bpo-26657: Fix Windows directory traversal vulnerability with http.se… #2860
  • Files
  • fuzz.py: Fuzzing test
  • fix-path-traversal-26657.patch: Patch to fix the problem (includes testcase)
  • fix-path-traversal-26657.patch: Patch to fix the problem (includes testcase, updated)
  • fix-path-traversal-26657.v3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-04-18.09:41:33.329>
    created_at = <Date 2016-03-28.15:30:15.736>
    labels = ['type-security', 'library', 'OS-windows']
    title = 'Directory traversal with http.server and SimpleHTTPServer on windows'
    updated_at = <Date 2019-05-10.18:08:30.658>
    user = 'https://bugs.python.org/Thomas'

    bugs.python.org fields:

    activity = <Date 2019-05-10.18:08:30.658>
    actor = 'ned.deily'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-04-18.09:41:33.329>
    closer = 'martin.panter'
    components = ['Library (Lib)', 'Windows']
    creation = <Date 2016-03-28.15:30:15.736>
    creator = 'Thomas'
    dependencies = []
    files = ['42315', '42318', '42319', '42352']
    hgrepos = []
    issue_num = 26657
    keywords = ['patch']
    message_count = 16.0
    messages = ['262572', '262581', '262583', '262585', '262595', '262596', '262786', '262793', '262798', '262806', '263533', '263653', '263661', '298156', '298212', '299203']
    nosy_count = 11.0
    nosy_names = ['paul.moore', 'larry', 'tim.golden', 'ned.deily', 'phihag', 'python-dev', 'martin.panter', 'zach.ware', 'steve.dower', 'xiang.zhang', 'Thomas']
    pr_nums = ['782', '226', '2860']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue26657'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4', 'Python 3.5', 'Python 3.6']

    @Thomas
    Copy link
    Mannequin Author

    Thomas mannequin commented Mar 28, 2016

    SimpleHTTPServer and http.server allow directory traversal on Windows.
    To exploit this vulnerability, replace all ".." in URLs with "c:c:c:..".

    Example:
    Run
    python -m http.server
    and visit
    127.0.0.1:8000/c:c:c:../secret_file_that_should_be_secret_but_is_not.txt

    There is a warning that those modules are not secure in the module docs,
    but for some reason they do not appear in the online docs:
    https://docs.python.org/3/library/http.server.html
    https://docs.python.org/2/library/simplehttpserver.html

    It would be nice if that warning was as apparent as for example here:
    https://docs.python.org/2/library/xml.etree.elementtree.html

    There are a lot of other URLs that are insecure as well, which can all
    be traced back to here:
    https://hg.python.org/cpython/file/tip/Lib/http/server.py#l766

    The splitdrive and split functions, which should make sure that the
    final output is free of ".." are only called once which leads to this
    control flow:
    ---------------------------------------------------------------

    path = "c:/secret/public"
    word = "c:c:c:.."

    _, word = os.path.splitdrive(word) # word = "c:c:.."
    _, word = os.path.split(word) # word = "c:.."
    path = os.path.join(path, word) # path = "c:/secret/public\\.."
    ---------------------------------------------------------------

    Iterating splitdrive and split seems safer:
    ---------------------------------------------------------------

    for word in words:
        # Call split and splitdrive multiple times until
        # word does not change anymore.
        has_changed = True
        while has_changed:
            previous_word = word
            _, word = os.path.split(word)
            _, word = os.path.splitdrive(word)
            has_changed = word != previous_word

    There is another weird thing which I am not quite sure about here:
    https://hg.python.org/cpython/file/tip/Lib/http/server.py#l761

    ---------------------------------------------------------------

    path = posixpath.normpath(path)
    words = path.split('/')

    posixpath.normpath does not do anything with backslashes and then the
    path is split by forward slashes, so it may still contain backslashes.
    Maybe replacing posixpath.normpath with os.path.normpath and then
    splitting by os.sep would work, but I don't have enough different
    operating systems to test this, so someone else should have a look.

    I have attached some simple fuzzing test that tries a few weird URLs and
    sees if they lead where they shouldn't.
    Disclaimer: Might still contain other bugs.

    @Thomas Thomas mannequin added stdlib Python modules in the Lib dir type-security A security issue labels Mar 28, 2016
    @SilentGhost SilentGhost mannequin added the OS-windows label Mar 28, 2016
    @phihag
    Copy link
    Mannequin

    phihag mannequin commented Mar 28, 2016

    Please find attached a patch which adds a testcase for Windows (on all platforms) as well as code to fix the problem. Since os.path.split returns everything after the final slash/backslash, it only needs to be called once.

    Note that the usage of posixpath is correct and only relates to the URL parsing - it powers foo/bar/../../ .

    The path elements may indeed contain backslashes - that's why we call os.path.split on them.

    @phihag
    Copy link
    Mannequin

    phihag mannequin commented Mar 28, 2016

    Update testcase, and call split before splitdrive

    @vadmium
    Copy link
    Member

    vadmium commented Mar 28, 2016

    Thomas: can you point to the “warning that those modules are not secure in the module docs”? All I can see is a warning in the pydoc output for http.server.__doc__, but that is specifically about the CGI server.

    The specific bug with allowing c:c:c:.. looks like it would have been triggered by fixing bpo-19456. If so, 2.7 would also be affected.

    The whole translate_path() algorithm looks over-complicated to me. One quick idea that comes to mind is to skip (or diagnose) each whole URL path component if there is any drive, directory etc syntax present, rather than making an attempt to strip it off. Perhaps check with os.path.dirname() or pathlib’s is_reserved().

    One other thing I have wondered about, but I don’t know Windows well enough to be sure. Shouldn’t there be some protection against accessing special files like <http://127.0.0.1:8000/con.fusion\>?

    Ideally, I would like to see a common function somewhere to do this kind of path validation and conversion. There are other places even in the standard library where this is needed, which I mentioned at <https://bugs.python.org/issue21109#msg216675\>.

    @Thomas
    Copy link
    Mannequin Author

    Thomas mannequin commented Mar 29, 2016

    Martin Panter: Regarding the warning, you appear to be correct.
    However, reading the source of http.server again made me notice
    _url_collapse_path(path)
    which seems to have some overlap with translate_path. Also it
    crashes with an IndexError if path contains '..'.

    Also, yes, python 2.7's SimpleHTTPServer is affected as well.

    Discarding weird paths instead of trying to repair them would change semantics, but from a user perspective, it would be easier to understand what is going on, so I'd agree with that change.

    Further, I agree that it would be nice if there was some library function to safely handle path operations.
    The function you proposed in https://bugs.python.org/issue21109#msg216675 and https://bitbucket.org/vadmium/pyrescene/src/34264f6/rescene/utility.py#cl-217 leaves handling path separators to the user. Maybe that should be handled as well?
    The function withstood my fuzzing tests on windows, so it might be correct.
    There is probably a good reason for disallowing paths that contain /dev/null but I don't know why. Could you add a word or two of documentation to explain?

    A really high-level solution would be to do away with all the strings and handle paths properly as the structure that they represent instead of trying to fake all kinds of things with strings, but that is probably beyond the scope of this issue.

    @zhangyangyu
    Copy link
    Member

    Url handling in http.server is not perfect and there have already been some issues talking about it, i.e, bpo-5714, bpo-14567.

    @vadmium
    Copy link
    Member

    vadmium commented Apr 2, 2016

    Thomas: My check for os.path.devnull was just a half-hearted attempt to check for special device names like NUL on Windows. It is far from foolproof, and would fail my CON.fusion test that I mentioned above. Anyway, to address this specific bug it would be better to keep the changes to a minimum and not add any new APIs.

    One slight concern I have with Philipp’s patch is the new os_path parameter. I am a bit squeamish about adding parameters that are just to help testing. Perhaps it is enough to just rely on testing this on Windows, or to monkey-patch os.path = ntpath in the test suite? What do others think?

    I am posting a modified version (v3) of Philipp’s patch. This version monkey-patches os.path in the tests and avoids the os_path parameter. It is also stricter, by ignoring any path component that does not appear to be a simple file or directory name.

    This version will change how some questionable URLs are handled, but I expect that all of these URLs won’t have genuine use cases. Let me know if you think it is okay or not.

    @Thomas
    Copy link
    Mannequin Author

    Thomas mannequin commented Apr 2, 2016

    Looks ok to me security-wise. But I just noticed that it the trailing slash is inconsistent on Windows, e.g.:

    translate_path('asdf/')
    ==
    'C:\\Users\\User\\Desktop\\temp\\asdf/' <- this slash

    because path += '/' is used instead of os.path.sep. But apparently nobody complained about this yet, so it probably is not an issue.

    @zooba
    Copy link
    Member

    zooba commented Apr 2, 2016

    Windows-only tests are fine, and certainly better than adding a new parameter just for testing.

    Forward slashes are valid path segment separators on Windows 99% of the time, so that'll be why nobody has complained. Personally I prefer consistency, but not strongly enough to force a change here.

    @vadmium
    Copy link
    Member

    vadmium commented Apr 3, 2016

    Regarding the trailing slash: this is certainly inconsistent, but one call site of translate_path() appears to depend on this being a forward slash. There seems to be confusion about whether the output is an OS path or a URL. I think this is just more thing to look at in a hypothetical future overhaul of path and URL handling (e.g. see Zhang’s links above).

    @vadmium
    Copy link
    Member

    vadmium commented Apr 16, 2016

    I will try to commit my patch in a couple days if there are no objections.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 18, 2016

    New changeset 8054a68dfce2 by Martin Panter in branch '3.5':
    Issue bpo-26657: Fix Windows directory traversal vulnerability with http.server
    https://hg.python.org/cpython/rev/8054a68dfce2

    New changeset 5d8042ab3361 by Martin Panter in branch 'default':
    Issue bpo-26657: Merge http.server fix from 3.5
    https://hg.python.org/cpython/rev/5d8042ab3361

    New changeset 8aa032b26552 by Martin Panter in branch '2.7':
    Issue bpo-26657: Fix SimpleHTTPServer Windows directory traversal vulnerability
    https://hg.python.org/cpython/rev/8aa032b26552

    @vadmium
    Copy link
    Member

    vadmium commented Apr 18, 2016

    Thanks for the report and the patch.

    @vadmium vadmium closed this as completed Apr 18, 2016
    @larryhastings
    Copy link
    Contributor

    Will this be backported to 3.3 or 3.6? I don't see a PR or checkin for either of those versions on this issue, and both those versions are open for security fixes.b

    @larryhastings
    Copy link
    Contributor

    New changeset 6f6bc1d by larryhastings (Victor Stinner) in branch '3.4':
    bpo-26657: Fix Windows directory traversal vulnerability with http.server (#782)
    6f6bc1d

    @ned-deily
    Copy link
    Member

    New changeset 7b92f9f by Ned Deily (Victor Stinner) in branch '3.3':
    bpo-26657: Fix Windows directory traversal vulnerability with http.server (#782) (bpo-2860)
    7b92f9f

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    OS-windows stdlib Python modules in the Lib dir type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants