Title: urllib.parse should not discard delimiters when associated component is empty
Type:
Stage: resolved
Components:
Versions:
Status: closed
Resolution: duplicate
Dependencies:
Superseder: urllib.parse wrongly strips empty #fragment, ?query, //netloc (Issue 22852)
Assigned To:
Nosy List: gdata gmail, martin.panter, orsenthil
Priority: normal
Keywords:

Created on 2015-05-30 18:05 by gdata gmail, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (2)
msg244477 - (view) Author: gdata gmail (gdata gmail) Date: 2015-05-30 18:05
The documentation for urllib.parse states several times:

"This may result in a slightly different, but equivalent URL, if the URL that was parsed originally had unnecessary delimiters (for example, a ? with an empty query; the RFC states that these are equivalent)."

This is false -- RFC 3986 explicitly states that ? with an empty query is _not_ equivalent to a URL without it.  For example, the following two URLs should be considered different:
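
The reporter's own example URLs are not preserved in this archive. As a stand-in, here is a minimal sketch using the pair that RFC 3986 section 6.2.3 itself discusses ("http://example.com/?" versus "http://example.com/"). It assumes urlsplit() is called with its default arguments; the variable names are illustrative only.

```python
from urllib.parse import urlsplit

# Illustrative pair taken from RFC 3986 section 6.2.3, not from the report.
with_empty_query = "http://example.com/?"
without_query = "http://example.com/"

a = urlsplit(with_empty_query)
b = urlsplit(without_query)

# Both results carry query == "", so the parsed forms cannot be told apart.
same_parse = (a == b)

# Reassembling the first URL; the empty "?" delimiter is not recorded
# anywhere in the parsed result, so it cannot be restored from it.
round_trip = a.geturl()

print(same_parse)
print(round_trip)
```

Because the two inputs parse to equal results, any code working only from the parsed form has no way to tell which URL it was given.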
msg244515 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-31 04:34
This is essentially the same as Issue 22852. The title just refers to stripping an empty #fragment, but the netloc and query components are also affected. I have a patch there which needs reviewing, if you are interested. Or if you have any alternative ideas on how to solve this they would be welcome too.
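
A rough sketch of the fragment case mentioned above, again assuming urlsplit() with default arguments. The URLs are hypothetical, and the netloc case is left out because its behaviour depends on the scheme.

```python
from urllib.parse import urlsplit

# Hypothetical URLs chosen for illustration.
with_empty_fragment = "http://example.com/a#"
without_fragment = "http://example.com/a"

x = urlsplit(with_empty_fragment)
y = urlsplit(without_fragment)

# The fragment attribute is "" in both cases, so the trailing "#" is lost.
fragments_match = (x.fragment == y.fragment == "")
parses_equal = (x == y)

print(fragments_match, parses_equal)
```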

History
Date                 User           Action  Args
2022-04-11 14:58:17  admin          set     github: 68520
2015-05-31 04:34:30  martin.panter  set     status: open -> closed
                                            superseder: urllib.parse wrongly strips empty #fragment, ?query, //netloc
                                            nosy: + martin.panter
                                            messages: + msg244515
                                            resolution: duplicate
                                            stage: resolved
2015-05-30 23:03:38  ned.deily      set     nosy: + orsenthil
2015-05-30 18:05:17  gdata gmail    create