classification
Title: Confusing TypeError in urllib.urlopen
Type: behavior
Stage: resolved
Components: Documentation, Library (Lib)
Versions: Python 2.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: Stefan.Bucur, docs@python, r.david.murray
Priority: normal
Keywords:

Created on 2013-04-03 10:45 by Stefan.Bucur, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (2)
msg185913 - Author: Stefan Bucur (Stefan.Bucur) Date: 2013-04-03 10:45
When urllib.urlopen is called with a string containing the NUL ('\x00') character, a TypeError exception is raised, as in the following example:

import urllib  # Python 2.7
urllib.urlopen('\x00\x00\x00')

[...]
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 462, in open_file
    return self.open_local_file(url)
  File "/home/bucur/onion/python-bin/lib/python2.7/urllib.py", line 474, in open_local_file
    stats = os.stat(localname)
TypeError: must be encoded string without NULL bytes, not str


This exception is confusing, since the right type (a string) is apparently being passed to the function.  Since this behavior cannot change, it would be good to mention this exception in the function documentation.

I can imagine code that composes a URL based on user-supplied input and passes it to urlopen crashing if it doesn't properly sanitize the URL and/or doesn't catch TypeError.
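
To illustrate the concern, here is a minimal sketch of a wrapper that catches these failures in one place. It is written for Python 3 (urllib.request) rather than the Python 2.7 module in the traceback; the names open_user_url and InvalidURLError are hypothetical and not part of urllib.

```python
import urllib.request


class InvalidURLError(Exception):
    """Hypothetical application-level error for URLs that cannot be opened."""


def open_user_url(url):
    """Open a user-supplied URL, converting 'bad input' failures to one error.

    Python 2.7's urllib.urlopen raised TypeError for NUL bytes (see the
    traceback above); Python 3's urllib.request.urlopen raises ValueError
    for a string with no recognisable scheme. Both are caught here.
    Network failures (URLError and other OSError subclasses) are not caught.
    """
    try:
        return urllib.request.urlopen(url)
    except (TypeError, ValueError) as exc:
        raise InvalidURLError("cannot open %r: %s" % (url, exc)) from exc
```

A caller would then handle InvalidURLError instead of having to know which built-in exception a given Python version raises.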
msg185915 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-04-03 11:09
In Python 3 the equivalent urllib.request.urlopen call produces:

  ValueError: unknown url type:

So this is effectively already fixed (although that error message should be doing a repr on the value, so I fixed that).  

We don't in general document every exception that might be raised by a function.  Here the TypeError is coming from treating the url as a local filename.  I don't think it is appropriate to document all the errors that can arise from treating the URL as a filename in the urllib docs, so I don't believe any changes should be made here.  I've added the 'doc' component, so if someone from the doc team disagrees with me they can reopen the issue.

As for your specific concern, the application has more problems (as in, security problems) than crashing because of a TypeError if it is composing the URL from user input such that the URL gets treated as a local filename.  (This is arguably a bug in urllib, one that appears to have been fixed in Python 3.)
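
One way an application could guard against a user-supplied URL being treated as a local filename is to check the scheme before opening it. The following is only a sketch under assumptions: is_allowed_url and ALLOWED_SCHEMES are hypothetical names, the allowlist of http and https is an illustrative choice, and the code targets Python 3's urllib.parse.

```python
from urllib.parse import urlsplit

# Assumption: the application only ever needs to fetch web URLs.
ALLOWED_SCHEMES = ("http", "https")


def is_allowed_url(url):
    """Return True only for URLs with an allowed scheme and a network location."""
    if "\x00" in url:
        # Reject NUL bytes outright, whatever the parser would do with them.
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError for some malformed inputs.
        return False
    # Bare paths and file: URLs have no allowed scheme, so they are rejected
    # before they can be interpreted as local filenames.
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)
```

Only a string that passes this check would then be handed to urlopen; anything else is refused up front.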
History
Date                 User            Action  Args
2022-04-11 14:57:43  admin           set     github: 61824
2013-04-03 11:09:50  r.david.murray  set     status: open -> closed; assignee: docs@python; components: + Documentation; nosy: + r.david.murray, docs@python; messages: + msg185915; resolution: not a bug; stage: resolved
2013-04-03 10:45:05  Stefan.Bucur    create