
classification
Title: urlparse.urljoin does not add query part
Type: behavior
Stage: resolved
Components:
Versions: Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: albertsmuktupavels, martin.panter
Priority: normal
Keywords:

Created on 2015-04-22 21:17 by albertsmuktupavels, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg241827 - Author: (albertsmuktupavels) Date: 2015-04-22 21:17
From documentation:
"Construct a full (“absolute”) URL by combining a “base URL” (base) with another URL (url). Informally, this uses components of the base URL, in particular the addressing scheme, the network location and (part of) the path, to provide missing components in the relative URL."

base = http://www.example.com/example/foo.php?param=10
url = bar.php

I am expecting this result:
http://www.example.com/example/bar.php?param=10

But the actual result is:
http://www.example.com/example/bar.php
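
A minimal reproduction of the above, as a sketch. The report is against Python 2.7's urlparse module; the import fallback to urllib.parse is an assumption added so the snippet also runs on Python 3.

```python
try:
    from urllib.parse import urljoin  # Python 3 location (assumption)
except ImportError:
    from urlparse import urljoin  # Python 2.7, as in this report

base = "http://www.example.com/example/foo.php?param=10"
url = "bar.php"

# The base query (?param=10) is not carried over to the joined URL.
result = urljoin(base, url)
print(result)  # http://www.example.com/example/bar.php
```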

This should be easily fixable. Right now, query = bquery is done in only one case: when path and params are both empty.

I think that
if not query:
    query = bquery
should be moved to before the first point where query might be used. That means the code above should go right after these lines:
if scheme != bscheme or scheme not in uses_relative:
    return url
msg243047 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-05-13 06:54
This is not how URL joining is meant to work. For example, if the base URL “. . ./foo.php?param=10” produces an HTML file with a relative link to “bar.php”, the parent path should be joined on, but not the query part.

I understand the Python implementation is meant to more or less follow the RFC. See the second example at <https://tools.ietf.org/html/rfc3986.html#section-5.4> which is the same form as your case, and shows the query part being removed:

Base URI: http://a/b/c/d;p?q
Relative reference: "g"
Target URL: "http://a/b/c/g"
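
The RFC example can be checked against urljoin() with a short sketch like the one below. It assumes Python 3's urllib.parse and falls back to the Python 2.7 urlparse module. The two extra references ("?y" and the empty string) are also taken from the normal examples in RFC 3986 section 5.4.1.

```python
try:
    from urllib.parse import urljoin  # Python 3 location (assumption)
except ImportError:
    from urlparse import urljoin  # Python 2.7

base = "http://a/b/c/d;p?q"

# A relative path replaces the last path segment and drops the base query.
print(urljoin(base, "g"))   # http://a/b/c/g

# A query-only reference keeps the base path and replaces the query.
print(urljoin(base, "?y"))  # http://a/b/c/d;p?y

# An empty reference is the one case where the base query survives.
print(urljoin(base, ""))    # http://a/b/c/d;p?q
```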

There are occasionally cases where keeping the base query, or even joining two sets of query parameters together, is desirable. But these cases are rare and urljoin() is not meant to handle them.
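
For those rare cases, a caller can re-attach the base query after joining. The following is only a sketch: urljoin_keep_query is a hypothetical helper name, not part of the standard library, and it assumes the caller really does want the base query whenever the joined URL has none (including when the reference points at a different host).

```python
try:
    from urllib.parse import urljoin, urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urljoin, urlsplit, urlunsplit  # Python 2.7


def urljoin_keep_query(base, url):
    """Join like urljoin(), but fall back to the base query (hypothetical helper)."""
    joined = urljoin(base, url)
    parts = urlsplit(joined)
    if parts.query:
        # The reference supplied its own query, so leave the result alone.
        return joined
    # Otherwise copy the query component over from the base URL.
    return urlunsplit(parts._replace(query=urlsplit(base).query))


base = "http://www.example.com/example/foo.php?param=10"
print(urljoin_keep_query(base, "bar.php"))
# http://www.example.com/example/bar.php?param=10
print(urljoin_keep_query(base, "bar.php?x=1"))
# http://www.example.com/example/bar.php?x=1
```

Keeping this in a separate wrapper leaves urljoin() itself consistent with the RFC.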
History
Date                 User                Action  Args
2022-04-11 14:58:16  admin               set     github: 68220
2015-05-13 12:34:41  r.david.murray      set     status: open -> closed; resolution: not a bug; stage: resolved
2015-05-13 06:54:08  martin.panter       set     nosy: + martin.panter; messages: + msg243047
2015-04-22 21:17:03  albertsmuktupavels  create