Author cameron
Recipients brian.curtin, cameron, orsenthil
Date 2010-01-26.02:24:04
Message-id <1264472647.78.0.760930244515.issue7776@psf.upfronthosting.co.za>
In-reply-to
Content
Well, I've established a few things:
  - I've mischaracterised this issue
  - httplib's _set_tunnel() is really meant to be called from
    urllib2, because using it directly with httplib is totally
    counterintuitive
  - a bare urllib2 setup fails with its own bug

To the first item: _tunnel() feels really fragile with that recursion issue, though it doesn't recurse when called from urllib2.

For the second, here's my test script using httplib:

  import httplib
  import os

  H = httplib.HTTPSConnection("localhost", 3128)
  print H
  H._set_tunnel("localhost", 443)
  H.request("GET", "/boguspath")
  os.system("lsof -p %d | grep IPv4" % (os.getpid(),))
  R = H.getresponse()
  print R.status, R.reason

As you can see, one builds the HTTPSConnection object with the proxy's details instead of those of the target URL, and then puts the target URL details in with _set_tunnel(). Am I alone in finding this strange?
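
For comparison, the same proxy-first pattern can be sketched against Python 3's http.client, where the method is public as set_tunnel(). This is an assumption about a newer API than the 2.6.4 httplib used above, and the helper name make_tunnelled_connection is hypothetical. Nothing is sent over the network here, since no request is issued.

```python
from http.client import HTTPSConnection


def make_tunnelled_connection(proxy_host, proxy_port, target_host, target_port):
    """Hypothetical helper: connection object for the proxy, target set afterwards."""
    # The constructor takes the proxy's address, not the target's.
    conn = HTTPSConnection(proxy_host, proxy_port)
    # The real target is registered afterwards; no socket is opened yet.
    conn.set_tunnel(target_host, target_port)
    return conn


conn = make_tunnelled_connection("localhost", 3128, "localhost", 443)
# host and port on the connection object still describe the proxy.
print(conn.host, conn.port)
```

The argument order in the helper at least makes it explicit which address is the proxy and which is the target, which is the part that reads backwards when calling the connection class directly.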

For the third, my test code is this:

  U = urllib2.Request('https://localhost/boguspath')
  U.set_proxy('localhost:3128', 'https')
  f = urllib2.urlopen(R)
  print f.read()

which fails like this:

  Traceback (most recent call last):
    File "thttp.py", line 15, in <module>
      f = urllib2.urlopen(R)
    File "/opt/python-2.6.4/lib/python2.6/urllib2.py", line 131, in urlopen
      return _opener.open(url, data, timeout)
    File "/opt/python-2.6.4/lib/python2.6/urllib2.py", line 395, in open
      protocol = req.get_type()
  AttributeError: HTTPResponse instance has no attribute 'get_type'

The line numbers are slightly off because I've got some debugging statements in there.

Finally, I flat out do not understand urllib2's set_proxy() method:
  
    def set_proxy(self, host, type):
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.__r_host = self.__original
        self.host = host

When my code calls set_proxy, self.type is None. Now, I had naively expected the first branch to be the only branch. Could someone explain what's happening here, and what is meant to happen?
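
To make the branch question concrete, here is a minimal stand-alone model of the method quoted above. It is only a sketch: ProxyModel is a hypothetical class, not urllib2.Request, the name-mangled attributes (__r_host, __original) are replaced by plain ones, and the return value is added purely to report which branch ran. It assumes, as observed above, that type can still be None when set_proxy() is called.

```python
class ProxyModel(object):
    """Hypothetical stand-in mirroring only the set_proxy() logic quoted above."""

    def __init__(self, host, original, type=None):
        self.host = host
        self.original = original   # stands in for self.__original
        self.r_host = None         # stands in for self.__r_host
        self.type = type           # None until something records the scheme
        self.tunnel_host = None    # stands in for self._tunnel_host

    def set_proxy(self, host, type):
        if self.type == 'https' and not self.tunnel_host:
            # Remember the real target so a tunnel can be set up later.
            self.tunnel_host = self.host
            branch = 'tunnel'
        else:
            self.type = type
            self.r_host = self.original
            branch = 'plain'
        self.host = host
        return branch  # not in the original; added for illustration


unparsed = ProxyModel('localhost', 'https://localhost/boguspath')
print(unparsed.set_proxy('localhost:3128', 'https'))  # type is None: plain

parsed = ProxyModel('localhost', 'https://localhost/boguspath', type='https')
print(parsed.set_proxy('localhost:3128', 'https'))    # type is 'https': tunnel
```

In this model the first branch is reached only if the scheme has already been recorded as 'https' before set_proxy() runs; with type still None, the call falls through to the else branch and no tunnel host is stored.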

I'm thinking that this bug may turn into a doc fix instead of a behaviour fix, but I'm finding it surprisingly hard to know how urllib2 is supposed to be used.
History
Date                 User     Action  Args
2010-01-26 02:24:08  cameron  set     recipients: + cameron, orsenthil, brian.curtin
2010-01-26 02:24:07  cameron  set     messageid: <1264472647.78.0.760930244515.issue7776@psf.upfronthosting.co.za>
2010-01-26 02:24:06  cameron  link    issue7776 messages
2010-01-26 02:24:04  cameron  create