classification
Title: Normalization error in urlunparse
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.2, Python 3.1, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: dstanek, eric.araujo, orsenthil
Priority: normal Keywords:

Created on 2009-04-25 19:12 by eric.araujo, last changed 2010-11-02 19:36 by eric.araujo.

Messages (3)
msg86538 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2009-04-25 19:12
Docstring for urlunparse says:
    """Put a parsed URI back together again.  This may result in a
    slightly different, but equivalent URI, if the URI that was parsed
    originally had redundant delimiters, e.g. a ? with an empty query
    (the draft states that these are equivalent)."""

“Draft” here refers to RFC 1808, superseded by 3986. However, RFC 3986
(section 6.2.3) states:
“Normalization should not remove delimiters when their associated
component is empty unless licensed to do so by the scheme
specification.  For example, the URI "http://example.com/?" cannot be
assumed to be equivalent to any of the examples above.  Likewise, the
presence or absence of delimiters within a userinfo subcomponent is
usually significant to its interpretation.  The fragment component is
not subject to any scheme-based normalization; thus, two URIs that
differ only by the suffix "#" are considered different regardless of
the scheme.”

I guess we need some tests here to check compliance.
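
As a starting point for such tests, here is a minimal sketch that reports which URLs are altered by a parse/unparse round trip. It assumes the Python 3 module name urllib.parse (urlparse in 2.x); the helper name changed_by_roundtrip and the sample URLs are made up for illustration and are not part of the library.

```python
from urllib.parse import urlparse, urlunparse

# Sample URLs: the bare form plus the two variants that RFC 3986
# section 6.2.3 says must not be assumed equivalent to it.
CANDIDATES = [
    "http://example.com/",
    "http://example.com/?",
    "http://example.com/#",
]


def changed_by_roundtrip(urls):
    """Return (original, rebuilt) pairs for URLs altered by parse + unparse."""
    changed = []
    for url in urls:
        rebuilt = urlunparse(urlparse(url))
        if rebuilt != url:
            changed.append((url, rebuilt))
    return changed
```

Calling changed_by_roundtrip(CANDIDATES) lists every URL whose delimiters did not survive; a compliance test could assert that the returned list is empty.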
msg86541 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2009-04-25 19:45
This is indeed a bug. urlunparse should special-case "#" so as not to
discard it.
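
One possible shape for such a special case, sketched as a wrapper rather than as a patch to the library: it looks at the original string to see whether the delimiters were present and re-appends them by hand. The function name roundtrip_keeping_delimiters is hypothetical, and a real fix would more likely record delimiter presence in the parse result itself. This again assumes the Python 3 module name urllib.parse.

```python
from urllib.parse import urlparse, urlunparse


def roundtrip_keeping_delimiters(url):
    """Parse and rebuild url, keeping a '?' or '#' whose component is empty."""
    # The fragment starts at the first '#'; the query starts at the
    # first '?' that precedes it.
    before_fragment, hash_sep, _ = url.partition("#")
    _, query_sep, _ = before_fragment.partition("?")

    parts = urlparse(url)
    # Rebuild without query and fragment, then append both by hand so
    # each delimiter survives even when its component is empty.
    rebuilt = urlunparse(parts._replace(query="", fragment=""))
    if query_sep:
        rebuilt += "?" + parts.query
    if hash_sep:
        rebuilt += "#" + parts.fragment
    return rebuilt
```

With this wrapper, 'http://a/b/c?' and 'http://a/b/c#' come back unchanged, while URLs without empty delimiters are rebuilt as before.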
msg110314 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-07-14 19:09
Currently, a parse/unparse round trip drops both empty delimiters:

>>> obj = urlparse.urlparse('http://a/b/c?')
>>> urlparse.urlunparse(obj)
'http://a/b/c'
>>> obj = urlparse.urlparse('http://a/b/c#')
>>> urlparse.urlunparse(obj)
'http://a/b/c'

If we move away from the current behavior, some of the existing urljoin tests will surely start failing. We will have to consider those cases too while fixing this.
History
Date                 User         Action  Args
2010-11-02 19:36:38  eric.araujo  set     nosy: orsenthil, dstanek, eric.araujo
                                          title: Possible normalization error in urlparse.urlunparse -> Normalization error in urlunparse
                                          components: + Library (Lib)
                                          versions: + Python 3.1, Python 2.7, Python 3.2
2010-08-18 00:15:17  dstanek      set     nosy: + dstanek
2010-07-14 19:09:30  orsenthil    set     messages: + msg110314
2010-07-11 14:28:57  eric.araujo  set     assignee: orsenthil
                                          type: behavior
                                          nosy: + orsenthil
2009-04-25 19:45:10  eric.araujo  set     messages: + msg86541
2009-04-25 19:12:38  eric.araujo  create