Issue34357
Created on 2018-08-08 12:17 by deivid, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Messages (4) | |||
---|---|---|---|
msg323275 - (view) | Author: David (deivid) | Date: 2018-08-08 12:17 | |
Hello! Newbie to Python here.

I ran into an issue with one desktop library, Cinnamon. Specifically this one: https://github.com/linuxmint/Cinnamon/issues/5926#issuecomment-411232144. This library uses urllib from the standard library to download some JSON, but for some reason it does not work for me. If, however, I use [urllib3](https://github.com/urllib3/urllib3), it just works. It sounds like something the standard library could do better, so I'm reporting it here in case it's helpful.

A minimal example would be:

```python
from urllib.request import urlopen

data = urlopen("https://cinnamon-spices.linuxmint.com/json/applets.json").read()
print(data)
```

which just hangs for me. If I pass a specific number of bytes (less than ~65000) to `read()`, it works, but only downloads part of the file.

Using the equivalent code in urllib3 works just fine:

```python
import urllib3

http = urllib3.PoolManager()
response = http.request('GET', 'https://cinnamon-spices.linuxmint.com/json/applets.json')
print(response.data)
```

This is on

```
Python 3.7.0 (default, Aug 7 2018, 23:24:26)
[GCC 5.5.0 20171010] on linux
```

Any help troubleshooting this would be appreciated! |
|||
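The partial-read observation above (small reads succeed, a full read stalls) can be probed by reading in bounded chunks and counting what arrives. The following is only a minimal sketch: `read_in_chunks` is a hypothetical helper, not part of urllib, and the in-memory stream and its size are stand-ins chosen for illustration.

```python
import io


def read_in_chunks(stream, chunk_size=16 * 1024, expected_length=None):
    """Read `stream` to EOF in fixed-size chunks.

    Returns (data, complete). `complete` is False when EOF arrived before
    `expected_length` bytes were received.
    """
    chunks = []
    received = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals EOF
            break
        chunks.append(chunk)
        received += len(chunk)
    data = b"".join(chunks)
    complete = expected_length is None or received == expected_length
    return data, complete


# Offline demonstration: an in-memory stream stands in for an HTTP response.
# The size is arbitrary and not taken from the real server.
fake_body = b"x" * 100_000
data, complete = read_in_chunks(io.BytesIO(fake_body), expected_length=len(fake_body))
print(len(data), complete)
```

Against a real response, opening the URL with `urlopen(url, timeout=10)` should make a stalled read raise a timeout error instead of blocking indefinitely, which at least shows how many bytes arrived before the stall.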
msg323467 - (view) | Author: Martin Panter (martin.panter) * | Date: 2018-08-13 08:10 | |
I can’t get it to hang. Does your computer or Internet provider have a proxy or firewall that may be interfering?

Perhaps it is worth comparing the HTTP header fields being sent and received. You can enable debug messages to see the request sent, and print the response fields directly. The most important things to look for are the Content-Length and Transfer-Encoding (if any) fields in the response.

    >>> import urllib.request
    >>> url = "https://cinnamon-spices.linuxmint.com/json/applets.json"
    >>> handler = urllib.request.HTTPSHandler(debuglevel=1)
    >>> opener = urllib.request.build_opener(handler)
    >>> resp = opener.open(url)
    send: b'GET /json/applets.json HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: cinnamon-spices.linuxmint.com\r\nUser-Agent: Python-urllib/3.6\r\nConnection: close\r\n\r\n'
    reply: 'HTTP/1.1 200 OK\r\n'
    header: Server header: Date header: Content-Type header: Content-Length header: Connection header: Last-Modified header: ETag header: X-Sucuri-Cache header: X-XSS-Protection header: X-Frame-Options header: X-Content-Type-Options header: X-Sucuri-ID header: Accept-Ranges
    >>> print(resp.info())
    Server: Sucuri/Cloudproxy
    Date: Mon, 13 Aug 2018 07:18:11 GMT
    Content-Type: application/json
    Content-Length: 70576
    Connection: close
    Last-Modified: Mon, 13 Aug 2018 07:25:14 GMT
    ETag: "113b0-5734bfe97145e"
    X-Sucuri-Cache: HIT
    X-XSS-Protection: 1; mode=block
    X-Frame-Options: SAMEORIGIN
    X-Content-Type-Options: nosniff
    X-Sucuri-ID: 11014
    Accept-Ranges: bytes

    >>> data = resp.read()
    >>> len(data)
    70576

Another experiment would be to try “http.client” directly, which I understand is used by both the built-in “urllib.request” module and “urllib3”:

    from http.client import HTTPSConnection

    conn = HTTPSConnection("cinnamon-spices.linuxmint.com")
    headers = {  # Same header fields sent by “urllib.request”
        "Accept-Encoding": "identity",
        "Host": "cinnamon-spices.linuxmint.com",
        "User-Agent": "Python-urllib/3.6",
        "Connection": "close",
    }
    conn.request("GET", "/json/applets.json", headers=headers)
    resp = conn.getresponse()
    print(resp.msg)
    data = resp.read()

Try removing the “Connection: close” field from the request. Occasionally this triggers bad server behaviour (see Issue 12849); maybe your server or proxy is affected. |
|||
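The experiment suggested above (send the request with and without “Connection: close”, then compare the declared Content-Length with the bytes actually received) can be rehearsed offline. This is a rough sketch under stated assumptions: `DemoHandler`, `run_demo`, and the `BODY` payload are hypothetical names invented here, and a throwaway local HTTP server stands in for the real HTTPS host, so it only shows the mechanics and cannot reproduce the proxy-specific hang.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b'{"demo": true}'  # hypothetical payload, not the real applets.json


class DemoHandler(BaseHTTPRequestHandler):
    """Local stand-in for the remote server (an assumption for this sketch)."""

    seen_connection_header = None

    def do_GET(self):
        # Record the Connection field the client sent, if any.
        DemoHandler.seen_connection_header = self.headers.get("Connection")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # suppress per-request logging


def run_demo(send_connection_close):
    """Return (declared_length, received_length, Connection header seen by server)."""
    DemoHandler.seen_connection_header = None
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        headers = {"Accept-Encoding": "identity"}
        if send_connection_close:
            headers["Connection"] = "close"
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        try:
            conn.request("GET", "/json/applets.json", headers=headers)
            resp = conn.getresponse()
            declared = int(resp.getheader("Content-Length"))
            received = len(resp.read())
        finally:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return declared, received, DemoHandler.seen_connection_header


print(run_demo(send_connection_close=True))
print(run_demo(send_connection_close=False))
```

With a well-behaved server both calls should report matching lengths; against the affected connection, the interesting case is whether the two request styles differ. The `timeout=5` argument is there so that a stalled read fails instead of hanging.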
msg323478 - (view) | Author: David (deivid) | Date: 2018-08-13 12:26 | |
Hi Martin. It's definitely something with my internet connection. Yesterday I temporarily switched from my WiFi connection to the mobile connection from my cell phone, and things started working. I also debugged the headers being sent and received, and I did notice that the "Connection: Close" header was the only relevant difference between this request and the one sent by my browser when accessing that page directly. My next task was to investigate how to do what you just suggested... With my current knowledge of Python it would've taken me ages to figure out, so thanks so much! Let me try your suggestions and report back! Thanks so much for your help! :) |
|||
msg323480 - (view) | Author: David (deivid) | Date: 2018-08-13 12:39 | |
martin.panter, it worked! Thanks so much, I was going nuts!!!!

I also read the issue you pointed to, very interesting. Even if all servers should just work here, that does not seem to be the case in real life (I guess it's something easy to misconfigure), so I agree that not setting "Connection: close" by default would make the standard library more user friendly.

I guess I'll now talk to the maintainers of the upstream library and suggest the following:

* Reading this issue and the one you pointed to.
* Reviewing their server configuration.
* Migrating to http.client, especially if they don't manage to fix the server configuration.

This can now be closed, thanks so much again <3 |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:04 | admin | set | github: 78538 |
2018-08-14 05:42:56 | martin.panter | set | superseder: Cannot override 'connection: close' in urllib2 headers |
2018-08-13 12:39:51 | deivid | set | status: open -> closed messages: + msg323480 stage: test needed -> resolved |
2018-08-13 12:26:39 | deivid | set | messages: + msg323478 |
2018-08-13 08:10:58 | martin.panter | set | versions: + Python 3.7 nosy: + martin.panter messages: + msg323467 type: behavior stage: test needed |
2018-08-08 12:17:07 | deivid | create |