msg58990 - (view) |
Author: Olivier Croquette (ocroquette) |
Date: 2007-12-25 14:22 |
Some servers allow the @ character is usernames. It gives URLs like:
ftp://user@xyz@host/dir
user@xyz could for example by an email address.
I am not sure if this is RFC compliant. What's sure is that is makes
trouble with urlparse:
>>> from urlparse import urlparse
>>> p = urlparse("ftp://user@host1@host2/dir")
>>> print p.username
user
>>> print p.hostname
host1@host2
By using rsplit instead of split in lib/python2.5/urlparse.py, the
problem can be solved.
|
msg58995 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2007-12-26 14:43 |
Hi ocroquette,
Even though I have seen ftp sites requesting email addresses as user
names, I would rather put this issue for discussion as this not per any RFC.
urlparse is based upon RFC1808 and it points to RFC1738 for the URL
Syntax. RFC1738 specifically says (line 225) that:
The user name (and password), if present, are followed by a commercial
at-sign "@". Within the user and password field, any ":", "@" or "/"
must be encoded.
>>> url = "ftp://user@xyz@host/dir"
>>> from urlparse import urlparse
>>> p = urlparse(url)
>>> print p.username
user
>>> print p.hostname
xyz@host
>>> corr_url = "ftp://user%40xyz@host/dir"
>>> q = urlparse(corr_url)
>>> print q.username
user%40xyz
>>> print q.hostname
host
>>>
So, it is upon the user/client to encode the @sign in the user name, if
he has to be compliant with RFC.
urlparse is not at fault here.
What shall we do with this issue?
|
msg58996 - (view) |
Author: Olivier Croquette (ocroquette) |
Date: 2007-12-26 16:38 |
Hi!
Thanks for the reply!
The problem right now is that urlparse parses silently an URL which is
not compliant, but does the wrong thing with it (since usernames can
contain @, and hostname can not, it's a more logical thing to parse from
the right using rsplit instead of split).
I see two possibilities to address that:
1. return a parse error if the URL contains two raw @
This way users and app developers will notice the problem rapidly
2. still accept this malformed URLs, but do what the user expects
Both solutions seem to me better than the current behaviour, so I would
say a change is necessary anyways.
PS: will urlparse transform an encoded "@" in the username when
.username is called, or does the application have to decode explicitely?
Olivier
|
msg58997 - (view) |
Author: Olivier Croquette (ocroquette) |
Date: 2007-12-26 16:40 |
See also the related bug on duplicity:
http://savannah.nongnu.org/bugs/?21475
|
msg58998 - (view) |
Author: Olivier Croquette (ocroquette) |
Date: 2007-12-26 16:45 |
And about the decoding, sorry, it's clear from your snippets that
urlparse doesn't do it:
>>> print q.username
user%40xyz
Maybe it should do it, I am not sure. What do you think? It would save
work for the module user.
|
msg59001 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2007-12-26 17:24 |
> Olivier Croquette added the comment:
>
>
> The problem right now is that urlparse parses silently an URL which is
> not compliant, but does the wrong thing with it (since usernames can
> contain @, and hostname can not, it's a more logical thing to parse from
> the right using rsplit instead of split).
>
> I see two possibilities to address that:
>
> 1. return a parse error if the URL contains two raw @
> This way users and app developers will notice the problem rapidly
>
> 2. still accept this malformed URLs, but do what the user expects
>
> Both solutions seem to me better than the current behaviour, so I would
> say a change is necessary anyways.
I am inclined towards the option 2, wherein even if the non-conformant URL is
given, which has "@" in the user name, the urlparse should be able atleast
recognize the hostname correctly. In this case, using rsplit instead of split
does the trick.
Attached is the patch against the trunk, which adds this. I notice duplicity
project also retorting to the same.
Unless anyone has an objection, we should commit this change.
|
msg59002 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2007-12-26 17:29 |
>
> Olivier Croquette added the comment:
>
> And about the decoding, sorry, it's clear from your snippets that
> urlparse doesn't do it:
> >>> print q.username
> user%40xyz
>
> Maybe it should do it, I am not sure. What do you think? It would save
> work for the module user.
No, urlparse does not have any method for encoding/decoding.
It has been left to urllib2 in urlencode, urldecode, quote, unquote etc.
Strangely though, user should use urllib2 for that purpose.
And yeah, there is been an effort to merge the urlparse and urllib2
functionality and I am slowly upto it. :)
Before more requests come this way, I guess I have to show some progress.
|
msg59109 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-02 23:14 |
Could you please add a unit test for this? Then I can check it in.
I'm thinking this could safely be fixed in 2.5.2.
|
msg59213 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2008-01-04 04:17 |
Hi Guido,
I have added the unit tests, tested it on my system and created a new patch
combining the previous one against the trunk.
You can verify and check in this.
Thank you,
Senthil
|
msg59275 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-01-05 01:23 |
Committed revision 59726.
This is in the trunk (2.6). Do you need this backported to 2.5.2 as well?
|
msg59518 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2008-01-08 02:42 |
Thank you, Yes having it in the 2.5.2 would be a good idea. The bug
report is also on 2.5 only.
I have attached the patch for the release25-maint branch.
Thanks,
Senthil
|
msg59519 - (view) |
Author: Senthil Kumaran (orsenthil) * |
Date: 2008-01-08 02:45 |
The patch against the trunk could also have been applied to 2.5. There
is NO change in the trunk and 2.4-maint patch.
Thanks,
Senthil
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:29 | admin | set | github: 46039 |
2008-01-08 02:45:02 | orsenthil | set | messages:
+ msg59519 |
2008-01-08 02:42:31 | orsenthil | set | files:
+ issue6898-r25-maint.patch messages:
+ msg59518 |
2008-01-05 01:23:01 | gvanrossum | set | status: open -> closed keywords:
+ patch messages:
+ msg59275 |
2008-01-04 04:17:57 | orsenthil | set | files:
+ issue1698-withtests.patch messages:
+ msg59213 |
2008-01-02 23:14:19 | gvanrossum | set | assignee: gvanrossum resolution: accepted messages:
+ msg59109 nosy:
+ gvanrossum |
2007-12-26 17:29:46 | orsenthil | set | messages:
+ msg59002 |
2007-12-26 17:24:55 | orsenthil | set | files:
+ urlparse_issue1698.patch messages:
+ msg59001 |
2007-12-26 16:45:00 | ocroquette | set | messages:
+ msg58998 |
2007-12-26 16:40:19 | ocroquette | set | messages:
+ msg58997 |
2007-12-26 16:38:23 | ocroquette | set | messages:
+ msg58996 |
2007-12-26 14:43:33 | orsenthil | set | nosy:
+ orsenthil messages:
+ msg58995 |
2007-12-25 14:22:11 | ocroquette | create | |