classification
Title: urllib.request.urlretrieve calls URLError with 3 args
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.3, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: Alexandru.Moșoi, chris.jerdonek, ezio.melotti, haypo, juho, ncoghlan, orsenthil, python-dev, techtonik, terry.reedy
Priority: high Keywords: easy, patch

Created on 2011-01-05 18:27 by Alexandru.Moșoi, last changed 2012-10-27 10:55 by orsenthil. This issue is now closed.

Files
File name Uploaded Description Edit
issue10836.diff ezio.melotti, 2012-10-15 17:01
issue10836-2.diff ezio.melotti, 2012-10-15 17:39 Patch with test against 3.2.
issue10836.patch orsenthil, 2012-10-21 07:19 review
Messages (16)
msg125449 - (view) Author: Alexandru Moșoi (Alexandru.Moșoi) Date: 2011-01-05 18:27
If I try to download a inexistent file I get a TypeError which is thrown during exception handling.

>>> import urllib.request
>>> urllib.request.urlretrieve('missing')
Traceback (most recent call last):
  File "/usr/lib/python3.1/urllib/request.py", line 1705, in open_local_file
    stats = os.stat(localname)
OSError: [Errno 2] No such file or directory: 'missing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/urllib/request.py", line 133, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python3.1/urllib/request.py", line 1507, in retrieve
    fp = self.open_local_file(url1)
  File "/usr/lib/python3.1/urllib/request.py", line 1707, in open_local_file
    raise URLError(e.errno, e.strerror, e.filename)
TypeError: __init__() takes at most 3 positional arguments (4 given)
msg172954 - (view) Author: anatoly techtonik (techtonik) Date: 2012-10-15 10:11
Looks like a regression - I can't remember such messages in Python 2.
msg172959 - (view) Author: anatoly techtonik (techtonik) Date: 2012-10-15 11:02
Clearly regressing. In Python 2 it was IOError exception:

>>> import urllib
>>> urllib.urlretrieve('missing')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\pp\lib\urllib.py", line 93, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\pp\lib\urllib.py", line 239, in retrieve
    fp = self.open(url, data)
  File "C:\pp\lib\urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "C:\pp\lib\urllib.py", line 462, in open_file
    return self.open_local_file(url)
  File "C:\pp\lib\urllib.py", line 476, in open_local_file
    raise IOError(e.errno, e.strerror, e.filename)
IOError: [Errno 2] The system cannot find the file specified: 'missing'
>>>
msg172978 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-15 14:49
The problem is in open_local_file (Lib/urllib/request.py:1885).  It passes 3 arguments to URLError, but the constructor only accepts 2: reason and filename (Lib/urllib/error.py:29).
This used to be an IOError when urllib was a single module rather than a package.
msg172982 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-10-15 15:13
The presence of "During handling of the above exception, another exception occurred:" is part of an intentional change between 2.x and 3.x. Exception handling has been tweaked a bit more in the past year. 3.2.3 only gives the second half of the traceback, which I suppose is worse.

In 3.3.0, the exception raising an exception behavior is fixed for this particular input on my Win 7 system.

>>> import urllib.request
>>> urllib.request.urlretrieve('missing')
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    urllib.request.urlretrieve('missing')
  File "C:\Programs\Python33\lib\urllib\request.py", line 185, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Programs\Python33\lib\urllib\request.py", line 160, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Programs\Python33\lib\urllib\request.py", line 458, in open
    req = Request(fullurl, data)
  File "C:\Programs\Python33\lib\urllib\request.py", line 279, in __init__
    self._parse()
  File "C:\Programs\Python33\lib\urllib\request.py", line 284, in _parse
    raise ValueError("unknown url type: %s" % self.full_url)
ValueError: unknown url type: missing

urlretrieve no longer assumes the file: scheme. (It was never documented to do so, and there is no good reason to do so.) If given explicitly with an invalid name, it gives the current equivalent of the 2.x error and message.

>>> urllib.request.urlretrieve('file:missing')
...
urllib.error.URLError: <urlopen error [WinError 2] The system cannot find the file specified: 'missing'>

Nonetheless, this 3 arg call should be fixed
1887 except OSError as e:
1888   raise URLError(e.errno, e.strerror, e.filename)
msg172988 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-15 17:01
While fixing this issue is easy, there is a wider problem with the use of URLError.
The constructor accepts two args, reason and filename. About half of the errors in Lib/urllib/request.py use only one argument, and in the other half not a single one uses the second arg for the filename:

raise URLError('file error', 'proxy support for file protocol currently not implemented')
raise URLError('local file error', 'not on local host')
raise URLError('ftp error', 'proxy support for ftp protocol currently not implemented')
if not host: raise URLError('ftp error', 'no host given')
raise URLError('ftp error', msg).with_traceback(sys.exc_info()[2])
raise URLError('data error', 'proxy support for data protocol currently not implemented')
raise URLError('ftp error', reason).with_traceback(
raise URLError('ftp error', reason) from reason

Note that the only the first arg ends up in the error message.

This should be reported in another issue though.


In the attached patch I passed only the first argoment and did

-            raise URLError(e.errno, e.strerror, e.filename)
+            raise URLError('local file error')

The result is

Traceback (most recent call last):
  File "/home/wolf/dev/py/3.2/Lib/urllib/request.py", line 1769, in open_local_file
    stats = os.stat(localname)
OSError: [Errno 2] No such file or directory: 'foo'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wolf/dev/py/3.2/Lib/urllib/request.py", line 1771, in open_local_file
    raise URLError('local file error')
urllib.error.URLError: <urlopen error local file error>

If that's OK I'll add a couple of tests and commit.
msg173053 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-10-16 15:50
> This should be reported in another issue though.

Has this been done?
msg173054 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-16 15:54
Nick reported #16247 today which is very similar.
msg173059 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-10-16 16:21
For tracking purposes, I created issue 16250 for what you observed above, which as you said is related to but not the same as Nick's issue 16247.
msg173135 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-17 06:55
@Ezio, regarding msg172988. That's embarrassing. It should be fixed soon.

Review on the patch, how about having the strerror and filename from the exception?

-            raise URLError(e.errno, e.strerror, e.filename)
+            raise URLError(e.strerror, e.filename)

I believe, that could have been the original intention which got overlooked.

And yes to fixing the other URLError msg related issues too, with just changing to single msg first, instead of wrongly sending the filename.
msg173434 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-21 07:19
Here is the patch which captures both HTTPError and URLError at the open_file and thus preventing multiple exceptions to be raised ( URLError and next IOError). This can go in 3.4 and since this is bug, where correct exception is not being caught and wrong args are sent, I think the catching of correct exception can be backported to old versions
msg173480 - (view) Author: Roundup Robot (python-dev) Date: 2012-10-21 20:30
New changeset daef5526d2ac by Senthil Kumaran in branch 'default':
Issue #10836: Fix exception raised when file not found in urlretrieve
http://hg.python.org/cpython/rev/daef5526d2ac
msg173482 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-21 20:31
This one is fixed in 3.4. I can see the previous versions raised IOError exception and it is not a good idea to change by backporting. But the wrong arguments on URLError should be fixed tough.
msg173586 - (view) Author: Juho Vuori (juho) * Date: 2012-10-23 09:58
Should the bug be closed as it seems to be fixed now?
msg173620 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-23 15:20
Nope,  I have backport it to other versions. I shall close it then.
msg173921 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-10-27 10:55
This has been backported.

3.2 5e71f2712076
3.3 30547e2cd04d
History
Date User Action Args
2012-10-27 10:55:40orsenthilsetstatus: open -> closed
messages: + msg173921

assignee: orsenthil
resolution: fixed
stage: patch review -> resolved
2012-10-23 15:20:13orsenthilsetmessages: + msg173620
2012-10-23 09:58:12juhosetnosy: + juho
messages: + msg173586
2012-10-21 20:31:23orsenthilsetmessages: + msg173482
2012-10-21 20:30:10python-devsetnosy: + python-dev
messages: + msg173480
2012-10-21 07:19:10orsenthilsetfiles: + issue10836.patch

messages: + msg173434
2012-10-17 06:55:32orsenthilsetassignee: orsenthil -> (no value)
messages: + msg173135
2012-10-16 16:21:06chris.jerdoneksetmessages: + msg173059
2012-10-16 15:54:46ezio.melottisetnosy: + ncoghlan
messages: + msg173054
2012-10-16 15:50:49chris.jerdoneksetnosy: + chris.jerdonek
messages: + msg173053
2012-10-15 17:39:22ezio.melottisetfiles: + issue10836-2.diff
stage: needs patch -> patch review
2012-10-15 17:01:58ezio.melottisetfiles: + issue10836.diff
keywords: + patch
messages: + msg172988
2012-10-15 15:13:46terry.reedysetnosy: + terry.reedy

messages: + msg172982
title: regression: TypeError during exception handling in urllib.request.urlretrieve -> urllib.request.urlretrieve calls URLError with 3 args
2012-10-15 14:49:07ezio.melottisetnosy: + ezio.melotti

messages: + msg172978
versions: + Python 3.3, Python 3.4, - Python 3.1
2012-10-15 14:35:19gvanrossumsetpriority: normal -> high
2012-10-15 12:06:03hayposetnosy: + haypo
2012-10-15 11:57:57amaury.forgeotdarcsetkeywords: + easy
2012-10-15 11:02:50techtoniksetmessages: + msg172959
title: TypeError during exception handling in urllib.request.urlretrieve -> regression: TypeError during exception handling in urllib.request.urlretrieve
2012-10-15 10:11:35techtoniksetnosy: + techtonik
messages: + msg172954
2011-01-05 18:29:12pitrousetassignee: orsenthil
stage: needs patch

nosy: + orsenthil
versions: + Python 3.2
2011-01-05 18:27:04Alexandru.Moșoicreate