classification
Title: urllib2 doesn't handle urls without scheme
Type:
Stage:
Components: Library (Lib)
Versions:
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: facundobatista, jackjansen, jjlee
Priority: normal
Keywords:

Created on 2003-05-28 18:54 by jackjansen, last changed 2022-04-10 16:08 by admin. This issue is now closed.

Messages (6)
msg60337 - Author: Jack Jansen (jackjansen) * (Python committer) Date: 2003-05-28 18:54
urllib2.urlopen does not handle URLs without a scheme, so the 
following code fails:
    import urllib, urllib2
    url = urllib.pathname2url('/etc/passwd')  # the result carries no scheme
    urllib2.urlopen(url)  # fails here
The same code works with urllib.urlopen.
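
For readers on Python 3, where urllib and urllib2 were merged into urllib.request, the rejection of scheme-less URLs can be reproduced with a sketch like the one below. It is not part of the original report. The helper name `scheme_error` is made up for illustration, and it only constructs a Request, so nothing is actually opened.

```python
import urllib.request

def scheme_error(url):
    """Return the ValueError message raised for `url`, or None if it is accepted.

    Hypothetical helper: it only builds a Request object, so no file or
    network access takes place.
    """
    try:
        urllib.request.Request(url)
    except ValueError as exc:
        return str(exc)
    return None

# pathname2url quotes the path but does not add a scheme by default.
url = urllib.request.pathname2url('/etc/passwd')

print(scheme_error(url))                   # an "unknown url type" message
print(scheme_error('file:///etc/passwd'))  # None: the scheme is explicit
```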
msg60338 - Author: John J Lee (jjlee) Date: 2003-11-30 23:24
Is it wise to allow this?  Maybe it's unlikely to cause bugs, but 
"/etc/passwd" could refer to any URI scheme, not only file:. 
 
Since it seems reasonable to only allow absolute URLs, I think 
it's a bad idea to guess the scheme is file: when given a 
relative URL. 
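
The ambiguity described above can be illustrated with urljoin (shown here with Python 3's urllib.parse; the base URLs are made-up examples). The same reference "/etc/passwd" resolves to a different scheme and host depending on the base it is interpreted against:

```python
from urllib.parse import urljoin

reference = '/etc/passwd'

# Example base URLs, chosen only for illustration.
bases = ('http://example.com/docs/index.html', 'ftp://example.com/pub/')

# Resolve the same scheme-less reference against each base.
resolved = [urljoin(base, reference) for base in bases]
for url in resolved:
    print(url)
# http://example.com/etc/passwd
# ftp://example.com/etc/passwd
```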
msg60339 - Author: John J Lee (jjlee) Date: 2005-05-19 20:24
Could somebody close this?
msg60340 - Author: Jack Jansen (jackjansen) * (Python committer) Date: 2005-05-19 23:53
I'm not convinced it isn't a bug. I agree that the URL '/etc/passwd' isn't 
always a file: url, but I think that in that case urllib2 should get its own 
pathname2url() method that returns urls with the file: prefix.
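
A minimal sketch of what such a helper might look like, written against Python 3's urllib.request. The name `file_url_from_pathname` is hypothetical and does not exist in the standard library; the sketch assumes that making the path absolute and prepending "file:" is the behaviour wanted.

```python
import os
import urllib.request

def file_url_from_pathname(pathname):
    """Hypothetical helper: turn a local path into a URL with an explicit file: scheme."""
    # Make the path absolute so the URL does not depend on a base URL.
    absolute = os.path.abspath(pathname)
    # pathname2url handles quoting; the scheme is added by hand here.
    return 'file:' + urllib.request.pathname2url(absolute)

url = file_url_from_pathname('/etc/passwd')
request = urllib.request.Request(url)  # accepted, because a scheme is present
print(request.type)
```

The exact number of slashes after "file:" depends on what pathname2url returns on the running Python version and platform, so the sketch only relies on the scheme being present.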
msg60341 - Author: John J Lee (jjlee) Date: 2005-05-22 12:25
That sounds like a feature request to me, not a bug. 
 
I agree it's desirable to have a better pathname2url (I haven't 
submitted one partly because I'm scared of getting it wrong!). 
 
I disagree that it should be a method, since OpenerDirector has 
no knowledge of base URL (and urllib2.Request or the response 
class also seem like the wrong places for that method: the URLs 
they have aren't always the URL you want to use as the base 
URL).  It would be nice to have a couple of functions 
urlparse.urlfrompathname(pathname) and 
urlparse.absurlfrompathname(pathname, baseurl) (better 
names / places for those, anyone?). 
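
As a rough sketch only, the two proposed functions could be built from existing pieces (Python 3 module names used here). Neither function exists in the standard library; the names come from the suggestion above, and the behaviour shown is one assumed reading of it.

```python
from urllib.parse import urljoin
from urllib.request import pathname2url

def urlfrompathname(pathname):
    """Hypothetical: quote a local pathname into a URL reference (no scheme added)."""
    return pathname2url(pathname)

def absurlfrompathname(pathname, baseurl):
    """Hypothetical: resolve the quoted pathname against an explicit base URL."""
    return urljoin(baseurl, urlfrompathname(pathname))

# The caller supplies the base, so nothing has to guess the scheme.
# The base URL is a made-up example.
absolute = absurlfrompathname('docs/read me.txt', 'http://example.com/base/')
print(absolute)
```

Passing the base URL as an argument keeps the functions free of any hidden state, which matches the objection that OpenerDirector has no knowledge of a base URL.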
 
Or you could resubmit this as a bug in urllib for allowing relative 
URLs without knowing the base URL ;-) 
 
msg63056 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-02-26 23:26
Closing because it isn't clear whether this is a bug (describe why you
think the current behaviour is wrong) or a feature request (describe
exactly what you want or propose for the future).

Note that most probably urllib and urllib2 will be merged for py3k.
History
Date | User | Action | Args
2022-04-10 16:08:58 | admin | set | github: 38563
2008-02-26 23:26:35 | facundobatista | set | status: open -> closed; nosy: + facundobatista; resolution: not a bug; messages: + msg63056
2003-05-28 18:54:09 | jackjansen | create |