classification
Title: urllib2.build_opener() skips ProxyHandler
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, eric.araujo, jesstess, python-dev, r.david.murray
Priority: high
Keywords: patch

Created on 2009-10-16 19:06 by barry, last changed 2013-04-28 21:08 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
issue7152.patch jesstess, 2013-04-13 18:46 documentation patch review
Messages (7)
msg94144 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2009-10-16 19:06
Try this:

>>> from urllib2 import build_opener
>>> build_opener().handlers

In Python 2.4, you will see ProxyHandler as the first handler, but this
handler is missing from the list in Python 2.5, 2.6, and 2.7, despite this
text in the documentation:

    urllib2.build_opener([handler, ...])

    Return an OpenerDirector instance, which chains the handlers in the order
    given. handlers can be either instances of BaseHandler, or subclasses of
    BaseHandler (in which case it must be possible to call the constructor
    without any parameters). Instances of the following classes will be in
    front of the handlers, unless the handlers contain them, instances of them
    or subclasses of them: ProxyHandler, UnknownHandler, HTTPHandler,
    HTTPDefaultErrorHandler, HTTPRedirectHandler, FTPHandler, FileHandler,
    HTTPErrorProcessor.

In fact, there is no way to add a ProxyHandler at all using the public API.
This is because the following code was added to Python 2.5, purportedly as a
fix for bug 972322:

    http://bugs.python.org/issue972322

# urllib2.py:307

            if meth in ["redirect_request", "do_open", "proxy_open"]:
                # oops, coincidental match
                continue

Because of this, the following are not workarounds:

>>> opener.add_handler(ProxyHandler)
>>> build_opener(ProxyHandler())

In fact, as near as I can tell, the only way to get a ProxyHandler in there is
to do an end-run around .add_handler():

>>> proxy_handler = ProxyHandler()
>>> opener.handlers.insert(0, proxy_handler)
>>> proxy_handler.add_parent(opener)

I'm actually quite shocked this has never been reported before.

ISTM that the right fix is what was originally suggested in bug 972322:

    http://bugs.python.org/msg46172

"The alternative would be to rename do_open and proxy_open, and leave the
redirect_request case unchanged (see below for why)."

The intent of this patch could not have been to completely prevent
ProxyHandler from being included in the list of handlers, otherwise why keep
ProxyHandler at all?  If that was the case, then the documentation for urllib2
is broken, and it should have described this change as occurring in Python
2.5.
msg94150 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2009-10-16 21:06
This may end up being just a documentation issue.  If the environment
has http_proxy set, you do get a ProxyHandler automatically.

>>> import os
>>> os.environ['http_proxy'] = 'localhost'
>>> from urllib2 import build_opener
>>> build_opener().handlers
[<urllib2.ProxyHandler instance at 0x7fb664ec6e18>,
<urllib2.UnknownHandler instance at 0x7fb664eca050>,
<urllib2.HTTPHandler instance at 0x7fb664eca710>,
<urllib2.HTTPDefaultErrorHandler instance at 0x7fb664ecaa70>,
<urllib2.HTTPRedirectHandler instance at 0x7fb664ecad88>,
<urllib2.FTPHandler instance at 0x7fb664ecae60>,
<urllib2.FileHandler instance at 0x7fb664ecaf38>,
<urllib2.HTTPSHandler instance at 0x7fb664ece3b0>,
<urllib2.HTTPErrorProcessor instance at 0x7fb664ece128>]
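
For readers on Python 3, the same observation can be sketched with urllib.request, the successor to urllib2. This is only an illustration: the helper names and the proxy URL http://localhost:3128 are assumptions, not part of the original report, and no request is sent.

```python
import os
import urllib.request


def default_handler_names():
    """Return the class names of the handlers build_opener() installs."""
    opener = urllib.request.build_opener()
    return [type(h).__name__ for h in opener.handlers]


def handler_names_with_http_proxy(proxy_url):
    """Same as above, but with http_proxy temporarily set in the environment."""
    saved = os.environ.get("http_proxy")
    os.environ["http_proxy"] = proxy_url
    try:
        return default_handler_names()
    finally:
        # Restore the previous environment so the demo has no side effects.
        if saved is None:
            del os.environ["http_proxy"]
        else:
            os.environ["http_proxy"] = saved


# Hypothetical proxy address, used only to trigger proxy detection.
print(handler_names_with_http_proxy("http://localhost:3128"))
```

The printed list should include ProxyHandler. Whether it appears without the variable depends on the proxy settings of the machine running the code.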
msg186793 - (view) Author: Jessica McKellar (jesstess) * (Python triager) Date: 2013-04-13 18:46
I confirm Barry's observation in msg94150 that if you set http_proxy (or any `*_proxy` environment variable), ProxyHandler does show up in build_opener().handlers. You can also add a ProxyHandler to build_opener through the public API as described in http://docs.python.org/dev/library/urllib.request.html#examples.
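
A minimal sketch of that public-API route, assuming Python 3's urllib.request and a hypothetical proxy at http://localhost:3128. The helper names opener_with_proxy and has_proxy_handler are made up for illustration. The proxy mapping is passed explicitly, so the result does not depend on the environment.

```python
import urllib.request


def opener_with_proxy(proxies):
    """Build an opener around a ProxyHandler configured with `proxies`."""
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def has_proxy_handler(opener):
    """Report whether a ProxyHandler was actually registered on the opener."""
    return any(isinstance(h, urllib.request.ProxyHandler)
               for h in opener.handlers)


# Hypothetical proxy address; no request is made here.
opener = opener_with_proxy({"http": "http://localhost:3128"})
print(has_proxy_handler(opener))
```

Passing an empty mapping instead should leave the handler unregistered, which matches the behavior described in this issue.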

I've attached a patch that updates the urllib2 and urllib.request documentation to clarify the situation. The patch also adds a missing DataHandler to the enumerated default handlers in the urllib2 note on basic authentication.

I built the documentation and inspected the generated HTML to confirm proper formatting.

--

As a side note on what's going on with ProxyHandler's interaction with `if meth in ["redirect_request", "do_open", "proxy_open"]:`, since this was confusing to me until I dug into the source a bit:

When you don't have any proxy environment variables set, ProxyHandler only has one <protocol>_<condition> method that might get the handler registered -- proxy_open -- which gets skipped because of the above blacklisted methods check. When a proxy environment variable is set, ProxyHandler grows a *_open method which does get it registered as a handler.
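
The registration rule described above can be approximated as follows. This is a simplified model of the method-name scan, not the actual OpenerDirector.add_handler code, and registering_methods is a hypothetical helper. It assumes Python 3's urllib.request and a made-up proxy URL.

```python
import urllib.request

# Method names that the scan skips explicitly, as quoted in this issue.
SKIPPED = ("redirect_request", "do_open", "proxy_open")


def registering_methods(handler):
    """List the method names that would get `handler` registered.

    Simplified model: a name counts if it looks like <protocol>_<condition>
    with condition "open", "request", "response", or starting with "error".
    """
    names = []
    for meth in dir(handler):
        if meth in SKIPPED:
            continue
        protocol, sep, condition = meth.partition("_")
        if not sep:
            continue
        if (condition in ("open", "request", "response")
                or condition.startswith("error")):
            names.append(meth)
    return names


# No proxies configured: proxy_open is skipped and nothing else matches.
print(registering_methods(urllib.request.ProxyHandler({})))
# One proxy configured: the handler gains an http_open attribute.
print(registering_methods(
    urllib.request.ProxyHandler({"http": "http://localhost:3128"})))
```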
msg187987 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-28 15:31
New changeset f2472fb98457 by R David Murray in branch '3.3':
#7152: Clarify that ProxyHandler is added only if proxy settings are detected.
http://hg.python.org/cpython/rev/f2472fb98457

New changeset aca80409ecdd by R David Murray in branch 'default':
Merge #7152: Clarify that ProxyHandler is added only if proxy settings are detected.
http://hg.python.org/cpython/rev/aca80409ecdd

New changeset 27999a389742 by R David Murray in branch '2.7':
#7152: Clarify that ProxyHandler is added only if proxy settings are detected.
http://hg.python.org/cpython/rev/27999a389742
msg187993 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2013-04-28 16:08
The patch adds a mention of DataHandler, which the code doesn’t have yet.
msg188022 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-04-28 21:07
New changeset 5da7bb478dd9 by R David Murray in branch '2.7':
#7152: Remove incorrectly added reference to DataHandler.
http://hg.python.org/cpython/rev/5da7bb478dd9

New changeset 122d42d5268e by R David Murray in branch '3.3':
#7152: Remove incorrectly added reference to DataHandler.
http://hg.python.org/cpython/rev/122d42d5268e
msg188023 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-04-28 21:08
Thanks, Jessica.  I reworded it slightly, since the proxy setting can come from things other than environment variables on Windows and OSX.  Also found one other place it needed to be mentioned, and fixed up the punctuation on one of the pre-existing sentences.

And thanks for the catch on DataHandler, Éric.
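
To see which proxy settings would be detected on a given machine, whatever their source, one option in Python 3 is urllib.request.getproxies(). The wrapper name below is an assumption for illustration, and the output depends entirely on the local configuration.

```python
import urllib.request


def detected_proxies():
    """Return the scheme -> proxy URL mapping that urllib detects by default."""
    return urllib.request.getproxies()


# Print whatever was detected; the mapping may well be empty.
for scheme, url in sorted(detected_proxies().items()):
    print(scheme, url)
```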
History
Date                 User            Action  Args
2013-04-28 21:08:51  r.david.murray  set     status: open -> closed; nosy: + r.david.murray; messages: + msg188023; resolution: fixed; stage: needs patch -> resolved
2013-04-28 21:07:34  python-dev      set     messages: + msg188022
2013-04-28 16:08:51  eric.araujo     set     nosy: + eric.araujo; messages: + msg187993
2013-04-28 15:31:26  python-dev      set     nosy: + python-dev; messages: + msg187987
2013-04-13 18:46:01  jesstess        set     files: + issue7152.patch; versions: + Python 3.3, Python 3.4, - Python 2.6; nosy: + jesstess; messages: + msg186793; keywords: + patch
2010-05-11 21:01:41  terry.reedy     set     versions: - Python 2.5
2009-10-16 21:06:23  barry           set     messages: + msg94150
2009-10-16 19:06:42  barry           create