Author beazley
Recipients beazley
Date 2008-12-29.21:42:29
Content
The file-like object u returned by the urlopen() function in both Python 
2.6 and 3.0 has a method info() that returns an 'HTTPMessage' object.  For 
example:

::: Python 2.6
>>> from urllib2 import urlopen
>>> u = urlopen("http://www.python.org")
>>> u.info()
<httplib.HTTPMessage instance at 0xce5738>
>>> 

::: Python 3.0
>>> from urllib.request import urlopen
>>> u = urlopen("http://www.python.org")
>>> u.info()
<http.client.HTTPMessage object at 0x4bfa10>
>>>

So far, so good.  HTTPMessage is defined in two different modules, but 
that's fine (it's just library reorganization).

Two major problems:

1. There is no documentation whatsoever for HTTPMessage: no description 
appears in the docs for httplib (Python 2.6) or http.client (Python 3.0).

2. The HTTPMessage object in Python 2.6 derives from mimetools.Message 
and has a totally different programming interface from HTTPMessage in 
Python 3.0, which derives from email.message.Message.  Check it out:

::: Python 2.6
>>> dir(u.info())
['__contains__', '__delitem__', '__doc__', '__getitem__', '__init__', 
'__iter__', '__len__', '__module__', '__setitem__', '__str__', 
'addcontinue', 'addheader', 'dict', 'encodingheader', 'fp', 'get', 
'getaddr', 'getaddrlist', 'getallmatchingheaders', 'getdate', 
'getdate_tz', 'getencoding', 'getfirstmatchingheader', 'getheader', 
'getheaders', 'getmaintype', 'getparam', 'getparamnames', 'getplist', 
'getrawheader', 'getsubtype', 'gettype', 'has_key', 'headers', 
'iscomment', 'isheader', 'islast', 'items', 'keys', 'maintype', 
'parseplist', 'parsetype', 'plist', 'plisttext', 'readheaders', 
'rewindbody', 'seekable', 'setdefault', 'startofbody', 'startofheaders', 
'status', 'subtype', 'type', 'typeheader', 'unixfrom', 'values']

::: Python 3.0
>>> dir(u.info())
['__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', 
'__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', 
'__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', 
'__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', 
'__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', 
'__str__', '__subclasshook__', '__weakref__', '_charset', 
'_default_type', '_get_params_preserve', '_headers', '_payload', 
'_unixfrom', 'add_header', 'as_string', 'attach', 'defects', 
'del_param', 'epilogue', 'get', 'get_all', 'get_boundary', 
'get_charset', 'get_charsets', 'get_content_charset', 
'get_content_maintype', 'get_content_subtype', 'get_content_type', 
'get_default_type', 'get_filename', 'get_param', 'get_params', 
'get_payload', 'get_unixfrom', 'getallmatchingheaders', 'is_multipart', 
'items', 'keys', 'preamble', 'replace_header', 'set_boundary', 
'set_charset', 'set_default_type', 'set_param', 'set_payload', 
'set_type', 'set_unixfrom', 'values', 'walk']
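
To make the difference concrete, here is a rough sketch of how a few of 
the 2.6 method names in the first listing appear to line up with 3.0 
names in the second.  The pairings in the comments are my reading of the 
two dir() listings, not a documented mapping, and the message is built by 
hand with email.message_from_string() as a stand-in for u.info() so that 
the example runs without a network connection.

```python
import email

# Hand-built stand-in for the object returned by u.info() in Python 3.0.
msg = email.message_from_string(
    "Content-Type: text/html; charset=utf-8\n"
    "Set-Cookie: a=1\n"
    "Set-Cookie: b=2\n"
    "\n"
)

# Python 3.0 call (email.message.Message)    # assumed Python 2.6 counterpart
raw_type = msg.get("Content-Type")           # getheader("Content-Type")
content_type = msg.get_content_type()        # gettype()
maintype = msg.get_content_maintype()        # getmaintype()
subtype = msg.get_content_subtype()          # getsubtype()
cookies = msg.get_all("Set-Cookie")          # getheaders("Set-Cookie")

print(content_type, maintype, subtype, cookies)
```

Only get(), items(), keys(), values() and getallmatchingheaders() appear 
under the same name in both listings, so code that uses anything else has 
to be rewritten by hand.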

I know that getting rid of mimetools was desired, but I have no idea 
whether changing the API of HTTPMessage was intended.  In any case, it's 
one of the few cases in the entire library where the programming 
interface to an object changes radically from 2.6 -> 3.0.

I ran into this problem with code that was trying to properly determine 
the charset encoding of the byte string returned by urlopen(). 
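
For illustration, a minimal sketch of the kind of workaround this 
forces.  The helper name charset_of and the "iso-8859-1" fallback are my 
own choices for the example, not anything from the library; the 
hasattr() test simply picks whichever of the two interfaces the object 
happens to have.

```python
import email

def charset_of(headers, default="iso-8859-1"):
    """Return the charset named in Content-Type, or a default.

    `headers` is whatever u.info() returned.  The default is an
    assumption made for this example.
    """
    if hasattr(headers, "get_content_charset"):
        # email.message.Message interface (Python 3.0 HTTPMessage)
        charset = headers.get_content_charset()
    else:
        # mimetools.Message interface (Python 2.6 HTTPMessage)
        charset = headers.getparam("charset")
    return charset or default

# Hand-built stand-in for u.info(), so no network access is needed.
sample = email.message_from_string("Content-Type: text/html; charset=utf-8\n\n")
print(charset_of(sample))
```

With a real response the call would be along the lines of 
u.read().decode(charset_of(u.info())).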

I haven't checked whether 2to3 deals with this or not, but it might be 
something for someone to look at in their copious amounts of spare time.