This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: urllib.request.urlopen with cafile or capath set overrides any previous Handlers
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: urllib.parse.urlopen shouldn't ignore installed opener when called with any SSL argument
View: 18543
Assigned To: Nosy List: caligatio, christian.heimes, martin.panter, orsenthil, pitrou, r.david.murray
Priority: normal Keywords: patch

Created on 2012-08-23 01:35 by caligatio, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
request.patch caligatio, 2012-08-23 23:08
Messages (9)
msg168915 - (view) Author: Brian Turek (caligatio) * Date: 2012-08-23 01:35
Using the code snippet below:

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor())
urllib.request.install_opener(opener)
request = urllib.request.Request(url, data, headers) # url is https://something
sock = urllib.request.urlopen(request, cafile = 'cacert.pem')

One would expect to establish a verified HTTPS connection to the host and the cookies stored in the cookie jar.  Unfortunately urllib.request.urlopen, when called with cafile or capath set, calls urllib.request.build_opener without HTTPCookieProcessor included in the chain.  This results in never being able to support cookies in a "verified" HTTPS connection.
msg168917 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-23 01:41
I thought that was an issue when I was looking at the code the other day, but I didn't get around to testing it.  Thanks for the report.
msg168932 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-08-23 09:52
I could verify this bug and also looks like a tricky one. Because when we are sending the cacert (the second time), we create a new HTTPSHandler and then build the opener again using that handler. 

I thought, we can use the existing opener object itself like this - 

-        opener = build_opener(https_handler)
+        if _opener is None:
+            opener = build_opener(https_handler)
+        else:
+            opener = _opener
+            opener.add_handler(https_handler)


But even then the problem is, the existing default HTTPSHandler will be invoked before the newly added one, as add_handler does a bisect.insert to add new handlers.

Thinking what's the best way to insert this new updated HTTPS handler in front of the existing ones (without resorting to any hacks).
msg168934 - (view) Author: Brian Turek (caligatio) * Date: 2012-08-23 10:55
I was actually going to come up with a patch that does the same thing orsenthil mentioned until I too was stymied by the use of bisect.insort

Does anyone know why bisect.insort was used instead of just list.append?  I don't see an obvious reason so I think just having add_handler insert new handlers at the beginning of all the respective lists would be an easy fix.
msg168937 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-08-23 12:47
It has been a while since I looked at the code, but if I remember correctly there is a (somewhat non-obvious) mechanism for assigning priority to handlers, so that you can control the order of application.  If I'm right the insort is about that priority, and it also ought to be possible to use that priority to fix the problem.  But I remember whatever it was I was looking at as being a bit hard to understand, so I could be way off base (or thinking of some other part of the code base).
msg168940 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-23 12:57
-        opener = build_opener(https_handler)
+        if _opener is None:
+            opener = build_opener(https_handler)
+        else:
+            opener = _opener
+            opener.add_handler(https_handler)

Well, isn't it a bad idea to mutate the global opener?
msg168965 - (view) Author: Brian Turek (caligatio) * Date: 2012-08-23 23:08
So I'm not saying the attached patch is the *best* solution but it doesn't mangle the existing urllib.request._opener too much.
msg275018 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2016-09-08 14:32
The ticket hasn't moved in about four years. Does the issue still persist?
msg286338 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-01-27 00:14
Nothing has been fixed; I don’t see any evidence that this is “out of date”. Here is a more complete test:

import urllib.request
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor())
urllib.request.install_opener(opener)
request = 'https://httpbin.org/cookies/set?name=value'
# Download certifiate chain for https://httpbin.org/ into cacert.pem (I used Firefox)
sock = urllib.request.urlopen(request, cafile = 'cacert.pem')
sock.read()  # b'{\n  "cookies": {}\n}\n' (No cookies!)
sock.close()
sock = urllib.request.urlopen(request)  # Default SSL settings
sock.read()  # b'{\n  "cookies": {\n    "name": "value"\n  }\n}\n'

I’m not comfortable with applying a patch that “doesn’t mangle [the global state] too much”. Anyway, this is the same as Issue 18543, which has slightly more recent discussion.
History
Date User Action Args
2022-04-11 14:57:35adminsetgithub: 59973
2017-01-27 00:14:31martin.pantersetstatus: pending -> closed

nosy: + martin.panter
messages: + msg286338

superseder: urllib.parse.urlopen shouldn't ignore installed opener when called with any SSL argument
resolution: out of date -> duplicate
2016-09-08 14:32:05christian.heimessetstatus: open -> pending

versions: + Python 3.6, Python 3.7, - Python 3.2, Python 3.3, Python 3.4
nosy: + christian.heimes

messages: + msg275018
resolution: out of date
stage: needs patch -> patch review
2012-08-23 23:08:25caligatiosetfiles: + request.patch
keywords: + patch
messages: + msg168965
2012-08-23 12:57:52pitrousetmessages: + msg168940
2012-08-23 12:47:25r.david.murraysetmessages: + msg168937
2012-08-23 10:55:08caligatiosetmessages: + msg168934
2012-08-23 09:52:59orsenthilsetmessages: + msg168932
2012-08-23 01:41:03r.david.murraysetnosy: + r.david.murray, orsenthil, pitrou

messages: + msg168917
stage: needs patch
2012-08-23 01:35:08caligatiocreate