Title: default_scheme in urlparse.urlparse() useless
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.5
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: elachuni, facundobatista, pk
Priority: normal Keywords:

Created on 2008-04-07 08:37 by pk, last changed 2008-06-21 17:38 by facundobatista. This issue is now closed.

Messages (4)
msg65071 - Author: (pk) Date: 2008-04-07 08:37

The urlparse() function accepts a default_scheme parameter, to be used
if the given address does not contain a scheme, but I cannot make
use of it. I would expect these two calls to return
identical values:

>>> from urlparse import urlparse
>>> urlparse("www","http")
('http', '', 'www', '', '', '')
>>> urlparse("http://www","http")
('http', 'www', '', '', '', '')

This was reported about six years ago, but apparently
the behaviour hasn't changed. I cannot imagine that this
really is the intended behaviour.


msg65072 - Author: (pk) Date: 2008-04-07 08:45
And this is the URL of the old report:
msg68520 - Author: Anthony Lenton (elachuni) * Date: 2008-06-21 17:31
In there's already a discussion about
The RFC that urlparse follows (RFC 1808) requires the net_loc
component to start with // even if the scheme component is missing,
which is why urlparse("www","http") puts the 'www' into the path
component instead of net_loc.

It seems that this is indeed the intended behavior, and the patch for
issue 754016 adds a docfix clarifying this.
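
For anyone who wants both calls from the original report to give the same result, here is a minimal sketch of a workaround that follows from the RFC rule above: prefix the address with // when it has no scheme, so that urlparse treats the host as net_loc. The helper name parse_with_default_scheme is hypothetical and not part of urlparse, and the "://" check is a simplifying assumption that will misclassify scheme-only forms such as mailto: addresses.

```python
try:
    from urllib.parse import urlparse  # Python 3 location of the module
except ImportError:
    from urlparse import urlparse      # Python 2, as used in this report


def parse_with_default_scheme(address, default_scheme="http"):
    """Hypothetical helper: parse a bare host like 'www' as a net_loc.

    Assumption: an address without '://' and without a leading '//'
    is meant to be a host (plus optional path), not a relative path.
    """
    if "://" not in address and not address.startswith("//"):
        # RFC 1808 only recognises a net_loc when it is introduced by '//'.
        address = "//" + address
    return urlparse(address, default_scheme)


# Both spellings now yield the same scheme and network location.
bare = parse_with_default_scheme("www")
full = parse_with_default_scheme("http://www")
print(tuple(bare))
print(tuple(full))
```

Calling urlparse("www", "http") directly still puts 'www' in the path component, which is the documented behaviour discussed above.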
msg68522 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-06-21 17:38
Thanks (pk) and Anthony!
Date                 User            Action  Args
2008-06-21 17:38:08  facundobatista  set     status: open -> closed; resolution: duplicate; messages: + msg68522
2008-06-21 17:35:54  elachuni        set     nosy: + facundobatista
2008-06-21 17:31:17  elachuni        set     nosy: + elachuni; messages: + msg68520
2008-04-07 08:45:10  pk              set     messages: + msg65072
2008-04-07 08:37:55  pk              create