classification
Title: urljoining an empty query string doesn't clear query string
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, orsenthil, pfish
Priority: normal Keywords: patch

Created on 2018-02-06 04:48 by pfish, last changed 2018-02-15 20:28 by pfish.

Pull Requests
URL Status Linked Edit
PR 5645 open python-dev, 2018-02-12 22:01
Messages (4)
msg311704 - Author: Paul Fisher (pfish) * Date: 2018-02-06 04:48
urljoining with '?' will not clear a query string:

ACTUAL:
>>> import urllib.parse
>>> urllib.parse.urljoin('http://a/b/c?d=e', '?')
'http://a/b/c?d=e'

EXPECTED:
'http://a/b/c' (optionally, with a ? at the end)

WhatWG's URL standard expects a relative URL consisting of only a ? to replace a query string:

https://url.spec.whatwg.org/#relative-state

Seen in versions 3.6 and 2.7, but probably also affects later versions.
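
For callers who need the query dropped regardless of how urljoin() treats a bare '?', one possible workaround is to split the URL and rebuild it without the query. This is only a minimal sketch: clear_query is a hypothetical helper name, not part of urllib.parse, and it sidesteps urljoin() entirely rather than changing its behaviour.

```python
from urllib.parse import urlsplit, urlunsplit


def clear_query(url):
    """Return url with its query string removed (hypothetical helper)."""
    parts = urlsplit(url)
    # Blank out only the query; scheme, netloc, path and fragment are kept.
    return urlunsplit(parts._replace(query=''))


print(clear_query('http://a/b/c?d=e'))  # http://a/b/c
```

Unlike the "optionally, with a ? at the end" variant above, this always produces the form without a trailing '?'.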
msg311937 - Author: Paul Fisher (pfish) * Date: 2018-02-10 06:05
I'm working on a patch for this and can have one up in the next week or so, once I get the CLA signed and other boxes ticked. I'm new to the GitHub process, but hopefully it will be a good start for the discussion.
msg312201 - Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2018-02-15 11:04
Python follows the RFC, not WhatWG.
https://tools.ietf.org/html/rfc3986#section-5.2.2 is the proper definition of the URL joining algorithm.
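
For reference, the step of section 5.2.2 that matters here is the branch taken when the reference has no scheme, no authority and an empty path: the target takes the reference's query if it is "defined", otherwise the base's query. The sketch below is one reading of that branch, using the parsing regular expression from RFC 3986 Appendix B, under which a bare '?' yields a defined but empty query. The function names are hypothetical, and every other branch of the algorithm (dot-segment removal, path merging) is deliberately left out.

```python
import re

# Parsing regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
)


def split_rfc3986(ref):
    """Split a URI reference; a component of None means "undefined"."""
    m = URI_RE.match(ref)
    return {
        'scheme': m.group(2),
        'authority': m.group(4),
        'path': m.group(5),
        'query': m.group(7),
        'fragment': m.group(9),
    }


def resolve_empty_path_ref(base, ref):
    """Resolve ref against base for the empty-path branch only (sketch)."""
    b = split_rfc3986(base)
    r = split_rfc3986(ref)
    if r['scheme'] is not None or r['authority'] is not None or r['path']:
        raise ValueError('this sketch only handles empty-path references')
    # Section 5.2.2: use R.query if defined, otherwise Base.query.
    query = r['query'] if r['query'] is not None else b['query']
    # Recompose as in section 5.3; the fragment always comes from the reference.
    result = ''
    if b['scheme'] is not None:
        result += b['scheme'] + ':'
    if b['authority'] is not None:
        result += '//' + b['authority']
    result += b['path']
    if query is not None:
        result += '?' + query
    if r['fragment'] is not None:
        result += '#' + r['fragment']
    return result


print(resolve_empty_path_ref('http://a/b/c?d=e', '?'))  # http://a/b/c?
print(resolve_empty_path_ref('http://a/b/c?d=e', ''))   # http://a/b/c?d=e
```

Whether urljoin() should distinguish an empty query from an absent one in the same way is exactly the question under discussion; the sketch only illustrates how the RFC pseudocode behaves when that distinction is kept.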
msg312223 - Author: Paul Fisher (pfish) * Date: 2018-02-15 20:28
In this case, the RFC does not match the actual behaviour of browsers (as described and codified by WhatWG). It was surprising to me that urljoin() didn't do what I perceived as "the right thing" (and I expect other users would be surprised too).

I would personally expect urljoin to do "the thing that everybody else does".  Is there a sensible way to reduce this mismatch?

For reference, Java's stdlib does what I would expect here:

    URI base = URI.create("https://example.com/?a=b");
    URI rel = base.resolve("?");
    System.out.println(rel);

https://example.com/?
History
Date User Action Args
2018-02-15 20:28:51 pfish set messages: + msg312223
2018-02-15 11:04:20 asvetlov set nosy: + asvetlov
messages: + msg312201
2018-02-12 22:01:41 python-dev set keywords: + patch
stage: patch review
pull_requests: + pull_request5446
2018-02-10 06:05:14 pfish set messages: + msg311937
2018-02-10 03:38:04 terry.reedy set nosy: + orsenthil
2018-02-06 04:48:49 pfish create