classification
Title: urljoin behavior unclear/not following RFC 3986
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Matthew Kenigsberg, orsenthil, xtreak
Priority: normal Keywords:

Created on 2019-06-11 15:45 by Matthew Kenigsberg, last changed 2019-06-11 16:05 by xtreak.

Messages (1)
msg345243 - Author: Matthew Kenigsberg (Matthew Kenigsberg) Date: 2019-06-11 15:45
I was trying to work out the exact behavior of urljoin. As far as I can tell (see https://bugs.python.org/issue22118), it should follow RFC 3986. According to the algorithm in section 5.2.2, I think this result is wrong:
>>> urljoin("ftp://netloc", "http://a/b/../c/d")
'http://a/b/../c/d'

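The .. segment should be removed: in section 5.2.2, a reference that carries its own scheme still has remove_dot_segments applied to its path, so I would expect 'http://a/c/d'.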
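To make the expected result concrete, here is a minimal sketch of the remove_dot_segments routine from RFC 3986 section 5.2.4, applied to a reference that already has a scheme. This is not the urllib.parse implementation. The helper names remove_dot_segments and normalize_absolute_reference are my own, chosen for illustration, and the sketch only covers the "reference has a scheme" branch of 5.2.2.

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path):
    """Illustrative sketch of RFC 3986 section 5.2.4 (not urllib's code)."""
    output = []
    while path:
        if path.startswith("../"):
            # A: drop a leading "../"
            path = path[3:]
        elif path.startswith("./"):
            # A: drop a leading "./"
            path = path[2:]
        elif path.startswith("/./"):
            # B: replace a leading "/./" with "/"
            path = path[2:]
        elif path == "/.":
            # B: replace a trailing "/." with "/"
            path = "/"
        elif path.startswith("/../"):
            # C: replace "/../" with "/" and drop the last output segment
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            # C: same as above for a trailing "/.."
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            # D: a lone "." or ".." disappears
            path = ""
        else:
            # E: move the first segment (with its leading "/", if any) to output
            idx = path.find("/", 1)
            if idx == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:idx])
                path = path[idx:]
    return "".join(output)


def normalize_absolute_reference(ref):
    """Hypothetical helper: apply dot-segment removal to a reference with a scheme."""
    parts = urlsplit(ref)
    return urlunsplit(parts._replace(path=remove_dot_segments(parts.path)))


print(normalize_absolute_reference("http://a/b/../c/d"))  # http://a/c/d
```

Under this reading of the RFC, the base URL plays no role once the reference has its own scheme. Only the reference's path is normalized.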

This might be a separate issue, but at a minimum I think the docs should describe the exact behavior, or state more directly that the behavior defined in RFC 3986 is followed.

I would be happy to write a patch if a change is needed.
History
Date                 User                Action  Args
2019-06-11 16:05:07  xtreak              set     nosy: + orsenthil, xtreak; components: + Library (Lib)
2019-06-11 15:45:18  Matthew Kenigsberg  create