urllib.open sends full URL after GET command instead of local path #49322
Hello ...
The first thing I have to say is that I searched the open issues and I
Yesterday I was testing how to access the wiki pages in a
Initially the behavior was as follows :
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
'Environment not found'
>>> u.close()
}}}
And tracd reported a line like this {{{
Which means that a 'Not found' error code was sent back to urllib
I tried to access the same page from my browser and tracd reported {{{
The problem is obvious ... urllib was sending the full URL after GET
I applied the following patch to urllib (yours will be better, I am
{{{
--- /usr/lib/python2.5/urllib.py 2008-07-31 13:40:40.000000000 -0500
+++ /media/urllib_unix.py 2009-01-26 09:48:54.000000000 -0500
@@ -270,6 +270,7 @@
def open_http(self, url, data=None):
"""Use HTTP protocol."""
import httplib
+ from urlparse import urlparse
user_passwd = None
proxy_passwd= None
if isinstance(url, str):
@@ -312,12 +313,17 @@
else:
auth = None
h = httplib.HTTP(host)
+ target = ''.join(sep + part for sep, part in \
+ zip(['', ';', '?', '#'], \
+ urlparse(selector)[2:]) \
+ if part)
+ print target
if data is not None:
- h.putrequest('POST', selector)
+ h.putrequest('POST', target)
h.putheader('Content-Type', 'application/x-www-form-urlencoded')
h.putheader('Content-Length', '%d' % len(data))
else:
- h.putrequest('GET', selector)
+ h.putrequest('GET', target)
if proxy_auth: h.putheader('Proxy-Authorization', 'Basic %s' % proxy_auth)
if auth: h.putheader('Authorization', 'Basic %s' % auth)
if realhost: h.putheader('Host', realhost)
}}}
And everything was «back» to normal ...
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
... # Lots of beautiful HTML code ;)
>>> u.close()
}}}
... tracd outputted ... {{{
The same picture is shown when using both Python 2.5.1 and 2.5.2 ...
... so further research is needed, but IMO this is a serious bug :(
PS: If this is a bug ... how could it be hidden so far ? Is there any ..
[1] Trac |
Ooops ... sorry, remove the print statement. The patch is as follows :
{{{
--- /usr/lib/python2.5/urllib.py 2008-07-31 13:40:40.000000000 -0500
+++ /media/urllib_unix.py 2009-01-26 09:48:54.000000000 -0500
@@ -270,6 +270,7 @@
def open_http(self, url, data=None):
"""Use HTTP protocol."""
import httplib
+ from urlparse import urlparse
user_passwd = None
proxy_passwd= None
if isinstance(url, str):
@@ -312,12 +313,17 @@
else:
auth = None
h = httplib.HTTP(host)
+ target = ''.join(sep + part for sep, part in \
+ zip(['', ';', '?', '#'], \
+ urlparse(selector)[2:]) \
+ if part)
if data is not None:
- h.putrequest('POST', selector)
+ h.putrequest('POST', target)
h.putheader('Content-Type', 'application/x-www-form-urlencoded')
h.putheader('Content-Length', '%d' % len(data))
else:
- h.putrequest('GET', selector)
+ h.putrequest('GET', target)
if proxy_auth: h.putheader('Proxy-Authorization', 'Basic %s' % proxy_auth)
if auth: h.putheader('Authorization', 'Basic %s' % auth)
if realhost: h.putheader('Host', realhost)
}}}
I apologize once again ... |
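For readers following the patch, here is a minimal sketch of what its `target` expression computes. It is ported to Python 3 (`urllib.parse` instead of the Python 2 `urlparse` module), and the helper name `target_from_selector` is hypothetical, not part of urllib. It only illustrates the expression; whether the change belongs in urllib is settled in the comments that follow.

```python
from urllib.parse import urlparse


def target_from_selector(selector):
    """Hypothetical helper mirroring the patch's `target` expression."""
    # urlparse() yields (scheme, netloc, path, params, query, fragment);
    # [2:] keeps the last four, each paired with the separator that
    # precedes it in a URL. Empty components are skipped.
    pieces = zip(['', ';', '?', '#'], urlparse(selector)[2:])
    return ''.join(sep + part for sep, part in pieces if part)


# The full URL from the report is reduced to its local path.
print(target_from_selector('http://localhost:8000/trac-dev'))  # /trac-dev
```

A selector that is already a local path passes through unchanged, so the expression only has an effect when the selector is an absolute URL.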
I could not reproduce this issue with either Python 2.6 or 2.5.2 |
Actually I am using a proxy hosted in some other machine (i.e. not my
{{{
# urllib.py
    def open_http(self, url, data=None):
        """Use HTTP protocol."""
        import httplib
        user_passwd = None
        proxy_passwd= None
        if isinstance(url, str): # Branching here !!!!!!!!!!
            host, selector = splithost(url)
            if host:
                user_passwd, host = splituser(host)
                host = unquote(host)
            realhost = host
        else:
            host, selector = url
}}}
The url variable is bound to the following 2-tuple
{{{
('172.18.2.7:3128', 'http://localhost:8000/trac-dev')
}}}
My IP is 172.18.2.99 ... so the
If you need further details ... don't hesitate and ask anything you
PS: What d'u mean when you said?
I don't understand this since *I already said* that *I accessed* my Trac
I don't understand how could all this be possible if I were running
Anyway ... CMIIW ... I also checked that immediately before executing the following
{{{
# urllib.py
        h = httplib.HTTP(host)
        if data is not None:
            h.putrequest('POST', selector)
            h.putheader('Content-Type', 'application/x-www-form-urlencoded')
            h.putheader('Content-Length', '%d' % len(data))
        else:
            h.putrequest('GET', selector)
}}}
... |
I suppose 172.18.2.7:3128 is the address:port of your proxy, right? (but I suppose tracd could also be more permissive and allow the "GET |
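A quick way to see which proxy urllib would pick up is to ask it directly. The sketch below uses Python 3's `urllib.request.getproxies_environment()` and assumes the proxy is configured through the `http_proxy` environment variable, which the thread does not actually state. The environment is patched only for the duration of the check.

```python
import os
from unittest import mock
from urllib.request import getproxies_environment

# Assumption: the proxy from the report is set via http_proxy.
fake_env = {"http_proxy": "http://172.18.2.7:3128"}

# clear=True hides any real proxy variables so the result is reproducible.
with mock.patch.dict(os.environ, fake_env, clear=True):
    proxies = getproxies_environment()

print(proxies)
```

If the returned mapping contains an `http` entry, HTTP requests made through urllib go to that proxy rather than straight to the target host.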
Yes ...
This being said ...
... It works with Apache (I am talking about trac once again ...) Thanks a lot! Sorry if I caused you any trouble ... |
If you had configured a proxy at localhost:8000, and *also* a Trac instance at that port, and Trac had "won the race" for the port, then you would observe exactly the symptoms you describe. That is, urllib talking to 8000 as if it were a proxy, and the Trac instance actually there getting confused. Your patch, as you surely understand now, is not correct; in fact, the code is OK as it is. urllib builds the request in that specific way *because* it thinks there is a proxy. If the proxy is buggy, misconfigured, or nonexistent, it's not the library's fault :) |
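The difference between the two request forms can be sketched as follows. This is an illustration in Python 3, not urllib's actual code: `request_line` is a hypothetical helper, and the HTTP version string is only there to make the line look complete.

```python
from urllib.parse import urlsplit


def request_line(method, url, via_proxy):
    """Hypothetical helper: build the first line of an HTTP request."""
    if via_proxy:
        # A proxy needs the absolute URL to know where to forward the request.
        target = url
    else:
        # An origin server only needs the local path (plus query, if any).
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return "%s %s HTTP/1.0" % (method, target)


url = "http://localhost:8000/trac-dev"
direct = request_line("GET", url, via_proxy=False)
proxied = request_line("GET", url, via_proxy=True)
print(direct)   # GET /trac-dev HTTP/1.0
print(proxied)  # GET http://localhost:8000/trac-dev HTTP/1.0
```

The second form is what tracd received in the report: correct for a proxy, but confusing for a server that expects a local path.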
Anyone against closing this as "works for me"? |
Yup, This should be closed too. Thanks. |