This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: urllib.request.urlopen leaks exceptions from socket and httplib.client
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: eric.araujo, ezio.melotti, jmoy, martin.panter, orsenthil, pitrou
Priority: normal Keywords: patch

Created on 2012-01-08 06:22 by jmoy, last changed 2022-04-11 14:57 by admin.

Files
File name Uploaded Description Edit
test_urllib_except.py jmoy, 2012-01-08 06:22 Test to show bug.
urllib_exception_leak.patch jmoy, 2012-01-08 06:25 Patch to fix issue #13736 review
Messages (7)
msg150849 - (view) Author: Jyotirmoy Bhattacharya (jmoy) * Date: 2012-01-08 06:22
The documentation for urlopen says that it raises URLError on error. 
But there exist error conditions such as a socket timeout or a bad HTTP status line under which the exception from the underlying library leaks through urllib, thus breaking the promise in the documentation.

I am attaching a test program that demonstrates this bug.
msg150850 - (view) Author: Jyotirmoy Bhattacharya (jmoy) * Date: 2012-01-08 06:25
A patch to fix this issue. Catches exceptions from underlying libraries and reraises them as URLError.

I put the class name of the underlying exception in the reason to make it more descriptive.
msg151690 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-20 14:04
I think the general pattern of wrapping exceptions in other exceptions is rather unfortunate. It makes it harder to examine the original problem (for example the ``errno`` attribute) for no actual gain.
msg151697 - (view) Author: Jyotirmoy Bhattacharya (jmoy) * Date: 2012-01-20 17:00
Antoine, could "raise ... from" be used here? Perhaps also using new subclasses of URLError to allow the exceptions to be caught with finer granularity. That way no information would be lost while at the same time not surprising clients who only catch URLError based on the documentation.

At least one exception which I feel must be wrapped is socket.timeout. Since we allow a timeout argument to urlopen, it doesn't make sense for the fact of the timeout to be announced by an exception from another library.
msg151698 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-20 17:13
> Antoine, could "raise ... from" be used here?

That's a possible compromise indeed. It's up to Senthil to decide
anyway.

>  Perhaps also using new subclasses of URLError to allow the exceptions
> to be caught with finer granularity. That way no information would be
> lost while at the same time not surprising clients who only catch
> URLError based on the documentation.

I agree there is a problem with the documentation announcing that all
exceptions will be wrapped. Honestly I would like it better if that
guarantee were dropped. In the meantime I'm not sure what the best
course of action is.

> At least one exception which I feel must be wrapped is socket.timeout.
> Since we allow a timeout argument to urlopen, it doesn't make sense
> for the fact of the timeout to be announced by an exception from
> another library.

You still may be interested to know that the request failed because of a
timeout rather than (say) an authentication failure, no?
msg152881 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-02-08 16:26
I have stumbled upon a wrong impression as well by reading the docs, but usually in the code, I tend to catch the specific Exceptions, like timeout instead or URLError when it is known. I saw some libraries following similar pattern too. But that could be changed, if promise in the docs that URLError exception is raised is corrected. 

I think, the course of action for this bug could be.

1. raise ... from .. for the appropriate Exception in 3.3 and appropriate doc changes.
2. Doc changes in 2.7,3.2 which say indicate the possibility of other exceptions besides URLError.

Having any important information in the msg part of the Exception is a bad idea, because it is seldom relied upon and can be changed anytime.

Shall come out with a patch.
msg254761 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-16 22:14
There is another patch in duplicate Issue 8823, which has simpler exception wrapping code, but does not add wrapping of reply exceptions.

Personally, I wouldn’t endorse any new wrapping HTTP client exceptions, and even if wrapping more socket OS errors would be consistent, it could be a backwards compatibility problem. I think the best way forward is documentation: Issue 25633, Issue 22797.
History
Date User Action Args
2022-04-11 14:57:25adminsetgithub: 57945
2015-11-16 22:14:15martin.pantersetmessages: + msg254761
2015-11-16 22:09:16martin.panterlinkissue8823 superseder
2013-12-07 22:01:38martin.pantersetnosy: + martin.panter
2012-02-08 16:26:52orsenthilsetmessages: + msg152881
2012-01-20 17:13:24pitrousetmessages: + msg151698
2012-01-20 17:00:58jmoysetmessages: + msg151697
2012-01-20 14:04:41pitrousetnosy: + pitrou
messages: + msg151690
2012-01-20 10:17:35orsenthilsetassignee: orsenthil
2012-01-14 04:01:31eric.araujosetnosy: + ezio.melotti, eric.araujo
2012-01-08 06:25:56jmoysetfiles: + urllib_exception_leak.patch
type: behavior
messages: + msg150850

keywords: + patch
2012-01-08 06:22:19jmoycreate