Issue408085
Created on 2001-03-13 01:05 by boswell, last changed 2022-04-10 16:03 by admin. This issue is now closed.
Messages (6) | |||
---|---|---|---|
msg3843 - (view) | Author: Dustin Boswell (boswell) | Date: 2001-03-13 01:05 | |
Using urllib.urlopen("https://...") seems to hang because of a redirect problem. It looks like it's trying to follow the redirect with http, not https.

>>> import urllib
>>> params = ...
>>> f = urllib.urlopen("https://...", params)
connect: (securesite.com, 80)    #a printout from httplib, line 354
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.0/urllib.py", line 63, in urlopen
    return _urlopener.open(url, data)
  File "/usr/local/lib/python2.0/urllib.py", line 168, in open
    return getattr(self, name)(url, data)
  File "/usr/local/lib/python2.0/urllib.py", line 367, in open_https
    data)
  File "/usr/local/lib/python2.0/urllib.py", line 301, in http_error
    result = method(url, fp, errcode, errmsg, headers, data)
  File "/usr/local/lib/python2.0/urllib.py", line 537, in http_error_302
    return self.open(newurl, data)
  File "/usr/local/lib/python2.0/urllib.py", line 168, in open
    return getattr(self, name)(url, data)
  File "/usr/local/lib/python2.0/urllib.py", line 269, in open_http
    h.putrequest('POST', selector)
  File "/usr/local/lib/python2.0/httplib.py", line 428, in putrequest
    self.send(str)
  File "/usr/local/lib/python2.0/httplib.py", line 370, in send
    self.connect()
  File "/usr/local/lib/python2.0/httplib.py", line 354, in connect
    self.sock.connect((self.host, self.port))
KeyboardInterrupt
>>> |
|||
msg3844 - (view) | Author: Moshe Zadka (moshez) | Date: 2001-03-18 09:13 | |
Errr... I'm not sure I see the bug. Perhaps the "Location" header actually contained an "http://" URL? If you can give me the site or more information (like a printout of newurl), I can probably be of more help. In testing (sadly, against a server inside a firewall, so I cannot give the URL), it seems to work.

One thing that may or may not have to do with your problem: when POSTing, a 302 means "POST to that other URL", not "GET that other URL". Many web server writers seem to ignore this, and many browsers compensate for that server bug. urllib2 does *not* compensate for that bug -- I haven't thought through whether *that* may be the explanation. |
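The POST-versus-GET point above can be sketched as a small decision function. This is only an illustration of the two behaviours described in the message; the function name and the `browser_compat` flag are hypothetical and are not part of urllib or urllib2.

```python
def method_after_302(method, browser_compat=False):
    """Pick the HTTP method for the request that follows a 302 response.

    With browser_compat=False the original method is kept, which is the
    reading described in the message above. With browser_compat=True a POST
    is retried as GET, imitating the workaround many browsers apply.
    """
    if method == "POST" and browser_compat:
        return "GET"
    return method


strict = method_after_302("POST")                        # keeps POST
lenient = method_after_302("POST", browser_compat=True)  # falls back to GET
print(strict, lenient)
```

A client that follows the strict reading re-sends the form data to the new URL, so a server that expects the browser behaviour may respond differently to it than it does to a browser.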
|||
msg3845 - (view) | Author: Dustin Boswell (boswell) | Date: 2001-03-19 13:12 | |
The server is https://trading.etrade.com

Unless you have an account there to try it yourself, there's not much else specific information I can give you. I know for sure that the redirection is to another https URL. The "Location" header is actually a relative one, which is where the bug in urllib.py is.

The problem is that when open_https is called and an error is encountered, it calls http_error, which assumes the URL was an http one, so when a relative URL is encountered, it just prepends "http:" to the front. I can't think of an elegant fix for this. Maybe when http_error realizes it's a relative location, it should take "proto" (some argument to the function that doesn't exist yet) and prepend THAT one to it...

def open_https(self, url, data=None):
    if errcode == 200:
        return addinfourl(fp, headers, url)
    else:
        if data is None:
            return self.http_error(url, fp, errcode, errmsg, headers)
        else:
            return self.http_error(url, fp, errcode, errmsg, headers, data)

... and here's the function called after the error is realized...

def http_error_302(self, url, fp, errcode, errmsg, headers, data=None):
    """Error 302 -- relocated (temporarily)."""
    ######Here's the problem#############
    # In case the server sent a relative URL, join with original:
    newurl = basejoin("http:" + url, newurl)  #uh, what if it isn't http? we seem to have lost that information...
    if data is None:
        return self.open(newurl)
    else:
        return self.open(newurl, data)

I originally was developing my project in Java and had it working, but was realizing that I was re-inventing the wheel (i.e. redirection handling). So I switched to Python (for other reasons too). But I went back and placed a POST instead of a GET in the redirection handling and everything still worked, so as for the possible GET vs. POST redirect server bug, it wasn't that (although that's very interesting to know...).

Am I making any sense? |
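A minimal sketch of the "proto" idea suggested above, written against modern Python's `urllib.parse.urljoin` rather than the Python 2.0 `basejoin` quoted in the report. The function name `join_redirect`, the host `securesite.example`, and the relative path are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin


def join_redirect(proto, url, location):
    """Resolve a possibly relative Location value against the original URL.

    `proto` is the scheme of the original request without the colon, and
    `url` is the scheme-less remainder such as "//host/path".
    """
    return urljoin(proto + ":" + url, location)


url = "//securesite.example/login"  # hypothetical scheme-less URL
location = "/account/home"          # hypothetical relative Location header

downgraded = join_redirect("http", url, location)  # hard-coded scheme, as reported
preserved = join_redirect("https", url, location)  # scheme of the original request
print(downgraded)
print(preserved)
```

With the hard-coded scheme the redirect target becomes an http URL, which matches the `connect: (securesite.com, 80)` printout in the first message. Passing the original scheme keeps the follow-up request on https.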
|||
msg3846 - (view) | Author: Nobody/Anonymous (nobody) | Date: 2001-03-26 02:55 | |
The Location header must be an absolute URI (RFC 2616 section 14.30 and RFC 1945 section 10.11). |
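One rough way to check whether a Location value is absolute in the sense used above is to require both a scheme and a host. This is a heuristic sketch for HTTP redirects only, using modern Python's `urllib.parse.urlsplit`; the helper name `is_absolute_location` is hypothetical.

```python
from urllib.parse import urlsplit


def is_absolute_location(location):
    """Return True if `location` has both a scheme and a network location."""
    parts = urlsplit(location)
    return bool(parts.scheme and parts.netloc)


print(is_absolute_location("https://sourceforge.net/account/"))
print(is_absolute_location("/account/"))
```

A client that receives a value failing this check still has to decide which scheme and host to resolve it against, which is the step that went wrong in this report.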
|||
msg3847 - (view) | Author: Moshe Zadka (moshez) | Date: 2001-04-09 14:36 | |
Fixed in urllib.py v 1.125.

urllib.py added "http:" to the URL instead of self.type. I haven't checked with the original server or with POSTs since I couldn't find such a server, but I verified it by opening https://sourceforge.net/account, which redirects to https://sourceforge.net/account/. It redirects properly, unfortunately, but I did check that I'm adding the correct thing. |
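The approach described above, where the opener remembers the scheme in `self.type` and reuses it when joining the redirect target, might look roughly like the sketch below. It is not the actual urllib.py change: the class `SchemeAwareOpener` and its method names are hypothetical, it uses modern Python's `urllib.parse.urljoin`, and the relative `/account/` Location value is an assumption made for illustration.

```python
from urllib.parse import urljoin


class SchemeAwareOpener:
    """Hypothetical opener that records the scheme of the URL it opens."""

    def __init__(self):
        self.type = None  # scheme of the most recent request, e.g. "https"

    def open(self, fullurl):
        # Split "https://host/path" into the scheme and the scheme-less rest.
        scheme, _, rest = fullurl.partition(":")
        self.type = scheme
        return rest

    def redirect_target(self, url, location):
        # Rebuild the original URL with the recorded scheme before joining.
        return urljoin(self.type + ":" + url, location)


opener = SchemeAwareOpener()
rest = opener.open("https://sourceforge.net/account")
target = opener.redirect_target(rest, "/account/")  # assumed relative Location
print(target)
```

Storing the scheme on the instance avoids changing the signature of every error handler, at the cost of relying on state set by the most recent `open` call.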
|||
msg3848 - (view) | Author: Moshe Zadka (moshez) | Date: 2001-04-09 14:38 | |
Forgot to actually close the bug report. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:03:51 | admin | set | github: 34141 |
2001-03-13 01:05:37 | boswell | create |