Message 237200
"Following the syntax specifications in RFC 1808, urlparse recognizes a netloc
only if it is properly introduced by ‘//’. Otherwise the input is presumed to be
a relative URL and thus to start with a path component."
https://docs.python.org/2/library/urlparse.html
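As a rough illustration of the rule quoted above, the sketch below uses Python 3's urllib.parse, which replaced the Python 2 urlparse module. The host name and paths are placeholders chosen for illustration, not taken from the original report. It shows which inputs get a netloc and which are treated as a bare path.

```python
from urllib.parse import urlparse

# A netloc is recognized only when it is introduced by '//'.
with_slashes = urlparse("//evil.com/path")

# Without the leading '//', the whole input is treated as a relative path.
without_slashes = urlparse("evil.com/path")

# With four slashes, the first '//' introduces an empty netloc and the
# remaining '//evil.com' ends up in the path component.
four_slashes = urlparse("////evil.com")

for result in (with_slashes, without_slashes, four_slashes):
    print(repr(result.netloc), repr(result.path))
```

The last case is the one a browser would likely read differently, since the path component on its own looks like a network-path reference.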
2015-03-03 22:16 GMT+00:00 Paul McMillan <>:
Yeah. I agree the lack of round trip is surprising, and I agree we
should fix it.
I think the underlying issue here is that urlparse has a pretty
different view of the world than the browsers do. I know
that bit me when I first started using Python, and it periodically
surfaces in cases like this, where the browser thinks that
"//evil.com" is a URL, but we've parsed it as part of a path.
Backwards compatibility makes it hard to update urlparse to precisely
match browser behavior, but there's probably room for a new library
designed with browser compatibility as a primary feature.
-Paul
On Tue, Mar 3, 2015 at 10:07 PM, Antoine Pitrou <> wrote:
>
> Hi Paul,
>
> On 03/03/2015 23:01, Paul McMillan wrote:
>> I understand how this works. You don't need to paste the example again.
>>
>> The documentation makes no guarantee that parse/unparse will do what
>> you want them to do, and does explicitly lay out the specific rules
>> used for separating the parts.
>
> Well, I don't know if it's a security issue, but failure to roundtrip
> *is* surprising (and IMHO dangerous for that reason) behaviour to say
> the least.
>
> Moreover, the urlunparse() documentation (in 3.x) says:
> """
> Construct a URL from a tuple as returned by urlparse(). [...] This may
> result in a slightly different, but equivalent URL, if the URL that was
> parsed originally had unnecessary delimiters
> """
>
> (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlunparse)
>
> which implies that any divergence when roundtripping should only consist
> in cosmetic, not essential, differences ("equivalent URL").
>
> Regards
>
> Antoine.
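The "slightly different, but equivalent" wording in the urlunparse() documentation can be checked with a small round-trip helper. This is a minimal sketch, assuming Python 3's urllib.parse. The roundtrip name and the example.com URLs are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse, urlunparse


def roundtrip(url):
    """Parse a URL and reassemble it from the resulting components."""
    return urlunparse(urlparse(url))


# A URL with no redundant delimiters comes back unchanged.
plain = roundtrip("http://example.com/a?b=c#d")

# An empty query (a trailing '?') is an unnecessary delimiter and is dropped,
# giving a different but equivalent URL.
trimmed = roundtrip("http://example.com/a?")

print(plain)
print(trimmed)
```

Whether an input such as "////evil.com" survives this helper unchanged depends on the Python version, so it is worth checking on the interpreter actually in use.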
Date | User | Action | Args
2015-03-04 18:41:29 | yaaboukir | set | recipients: + yaaboukir, orsenthil, pitrou, vstinner, benjamin.peterson, python-dev, martin.panter, PaulMcMillan, soilandreyes
2015-03-04 18:41:29 | yaaboukir | set | messageid: <1425494489.7.0.699781300971.issue23505@psf.upfronthosting.co.za>
2015-03-04 18:41:29 | yaaboukir | link | issue23505 messages
2015-03-04 18:41:29 | yaaboukir | create |