
Author yaaboukir
Recipients PaulMcMillan, benjamin.peterson, martin.panter, orsenthil, pitrou, python-dev, soilandreyes, vstinner, yaaboukir
Date 2015-03-04.18:41:29
Message-id <1425494489.7.0.699781300971.issue23505@psf.upfronthosting.co.za>
In-reply-to
Content
"Following the syntax specifications in RFC 1808, urlparse recognizes a netloc
only if it is properly introduced by ‘//’. Otherwise the input is presumed to be
a relative URL and thus to start with a path component."

https://docs.python.org/2/library/urlparse.html
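
To make the quoted rule concrete, here is a minimal sketch of how the split behaves. It assumes Python 3's urllib.parse (the successor to the Python 2 urlparse module quoted above); the helper names netloc_and_path and roundtrips, and the sample inputs, are hypothetical and only for illustration.

```python
from urllib.parse import urlparse, urlunparse


def netloc_and_path(url):
    """Return (netloc, path) exactly as urlparse splits them."""
    parts = urlparse(url)
    return parts.netloc, parts.path


def roundtrips(url):
    """Return True if parsing and then unparsing gives back the same string."""
    return urlunparse(urlparse(url)) == url


if __name__ == "__main__":
    # "//evil.com/path": netloc is introduced by '//', so it is recognized.
    # "evil.com/path":   no leading '//', so everything is treated as path.
    # "////evil.com":    hypothetical input with extra slashes; whether it
    #                    round-trips may depend on the Python version in use.
    for url in ("//evil.com/path", "evil.com/path", "////evil.com"):
        print(repr(url), netloc_and_path(url), roundtrips(url))
```

Running this prints the netloc/path pair and the round-trip result for each input, which makes it easy to compare against what a browser would do with the same string.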

2015-03-03 22:16 GMT+00:00 Paul McMillan <>:

    Yeah. I agree the lack of round trip is surprising, and I agree we
    should fix it.

    I think the underlying issue here is that urlparse has a pretty
    different view of the world when compared with the browsers. I know
    that bit me when I first started using python, and it periodically
    surfaces in cases like this, where the browser thinks that
    "//evil.com" is a url, but we've parsed it as part of a path.
    Backwards compatibility makes it hard to update urlparse to precisely
    match browser behavior, but there's probably room for a new library
    designed with browser compatibility as a primary feature.

    -Paul

    On Tue, Mar 3, 2015 at 10:07 PM, Antoine Pitrou <> wrote:
    >
    > Hi Paul,
    >
    > Le 03/03/2015 23:01, Paul McMillan a écrit :
    >> I understand how this works. You don't need to paste the example again.
    >>
    >> The documentation makes no guarantee that parse/unparse will do what
    >> you want them to do, and does explicitly lay out the specific rules
    >> used for separating the parts.
    >
    > Well, I don't know if it's a security issue, but failure to roundtrip
    > *is* surprising (and IMHO dangerous for that reason) behaviour to say
    > the least.
    >
    > Moreover, the urlunparse() documentation (in 3.x) says:
    > """
    > Construct a URL from a tuple as returned by urlparse(). [...] This may
    > result in a slightly different, but equivalent URL, if the URL that was
    > parsed originally had unnecessary delimiters
    > """
    > (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlunparse)
    >
    > which implies that any divergence when roundtripping should only consist
    > in cosmetic, not essential, differences ("equivalent URL").
    >
    > Regards
    >
    > Antoine.
    > -----------------------------
    > Python Security Response Team
    > Unsubscribe: https://mail.python.org/mailman/options/psrt/paul%40mcmillan.ws
History
Date User Action Args
2015-03-04 18:41:29 yaaboukir set recipients: + yaaboukir, orsenthil, pitrou, vstinner, benjamin.peterson, python-dev, martin.panter, PaulMcMillan, soilandreyes
2015-03-04 18:41:29 yaaboukir set messageid: <1425494489.7.0.699781300971.issue23505@psf.upfronthosting.co.za>
2015-03-04 18:41:29 yaaboukir link issue23505 messages
2015-03-04 18:41:29 yaaboukir create