New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Directory traversal with http.server and SimpleHTTPServer on windows #70844
Comments
SimpleHTTPServer and http.server allow directory traversal on Windows. Example: There is a warning that those modules are not secure in the module docs, It would be nice if that warning was as apparent as for example here: There are a lot of other URLs that are insecure as well, which can all The splitdrive and split functions, which should make sure that the path = "c:/secret/public"
word = "c:c:c:.." _, word = os.path.splitdrive(word) # word = "c:c:.." Iterating splitdrive and split seems safer: for word in words:
# Call split and splitdrive multiple times until
# word does not change anymore.
has_changed = True
while has_changed:
previous_word = word
_, word = os.path.split(word)
_, word = os.path.splitdrive(word)
has_changed = word != previous_word There is another weird thing which I am not quite sure about here: --------------------------------------------------------------- path = posixpath.normpath(path)
words = path.split('/') posixpath.normpath does not do anything with backslashes and then the I have attached some simple fuzzing test that tries a few weird URLs and |
Please find attached a patch which adds a testcase for Windows (on all platforms) as well as code to fix the problem. Since os.path.split returns everything after the final slash/backslash, it only needs to be called once. Note that the usage of posixpath is correct and only relates to the URL parsing - it powers foo/bar/../../ . The path elements may indeed contain backslashes - that's why we call os.path.split on them. |
Update testcase, and call split before splitdrive |
Thomas: can you point to the “warning that those modules are not secure in the module docs”? All I can see is a warning in the pydoc output for http.server.__doc__, but that is specifically about the CGI server. The specific bug with allowing c:c:c:.. looks like it would have been triggered by fixing bpo-19456. If so, 2.7 would also be affected. The whole translate_path() algorithm looks over-complicated to me. One quick idea that comes to mind is to skip (or diagnose) each whole URL path component if there is any drive, directory etc syntax present, rather than making an attempt to strip it off. Perhaps check with os.path.dirname() or pathlib’s is_reserved(). One other thing I have wondered about, but I don’t know Windows well enough to be sure. Shouldn’t there be some protection against accessing special files like <http://127.0.0.1:8000/con.fusion\>? Ideally, I would like to see a common function somewhere to do this kind of path validation and conversion. There are other places even in the standard library where this is needed, which I mentioned at <https://bugs.python.org/issue21109#msg216675\>. |
Martin Panter: Regarding the warning, you appear to be correct. Also, yes, python 2.7's SimpleHTTPServer is affected as well. Discarding weird paths instead of trying to repair them would change semantics, but from a user perspective, it would be easier to understand what is going on, so I'd agree with that change. Further, I agree that it would be nice if there was some library function to safely handle path operations. A really high-level solution would be to do away with all the strings and handle paths properly as the structure that they represent instead of trying to fake all kinds of things with strings, but that is probably beyond the scope of this issue. |
Thomas: My check for os.path.devnull was just a half-hearted attempt to check for special device names like NUL on Windows. It is far from foolproof, and would fail my CON.fusion test that I mentioned above. Anyway, to address this specific bug it would be better to keep the changes to a minimum and not add any new APIs. One slight concern I have with Philipp’s patch is the new os_path parameter. I am a bit squeamish about adding parameters that are just to help testing. Perhaps it is enough to just rely on testing this on Windows, or to monkey-patch os.path = ntpath in the test suite? What do others think? I am posting a modified version (v3) of Philipp’s patch. This version monkey-patches os.path in the tests and avoids the os_path parameter. It is also stricter, by ignoring any path component that does not appear to be a simple file or directory name. This version will change how some questionable URLs are handled, but I expect that all of these URLs won’t have genuine use cases. Let me know if you think it is okay or not. |
Looks ok to me security-wise. But I just noticed that it the trailing slash is inconsistent on Windows, e.g.: translate_path('asdf/')
==
'C:\\Users\\User\\Desktop\\temp\\asdf/' <- this slash because path += '/' is used instead of os.path.sep. But apparently nobody complained about this yet, so it probably is not an issue. |
Windows-only tests are fine, and certainly better than adding a new parameter just for testing. Forward slashes are valid path segment separators on Windows 99% of the time, so that'll be why nobody has complained. Personally I prefer consistency, but not strongly enough to force a change here. |
Regarding the trailing slash: this is certainly inconsistent, but one call site of translate_path() appears to depend on this being a forward slash. There seems to be confusion about whether the output is an OS path or a URL. I think this is just more thing to look at in a hypothetical future overhaul of path and URL handling (e.g. see Zhang’s links above). |
I will try to commit my patch in a couple days if there are no objections. |
New changeset 8054a68dfce2 by Martin Panter in branch '3.5': New changeset 5d8042ab3361 by Martin Panter in branch 'default': New changeset 8aa032b26552 by Martin Panter in branch '2.7': |
Thanks for the report and the patch. |
Will this be backported to 3.3 or 3.6? I don't see a PR or checkin for either of those versions on this issue, and both those versions are open for security fixes.b |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: