Having found myself in the awkward position of explaining the new 3.0 API to my students, I've thought about this and have some ideas and questions.
I'm also willing to help with the documentation or any enhancements.
>>> x['Date']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'addinfourl' object is unsubscriptable
I wish I knew what an addinfourl object was.
>>> x.info()['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers.keys()
['Date', 'Server', 'Last-Modified', 'ETag', 'Accept-Ranges', 'Content-Length', 'Connection', 'Content-Type']
Using x.headers over x.info() makes the most sense to me, but I don't know that I can give a good rationale. Which one would we want to document?
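For what it's worth, the two spellings appear to name the same object: http.client.HTTPResponse.info() just returns self.headers. A sketch that checks this without touching the network, by feeding HTTPResponse a fake socket (FakeSock and the raw bytes below are my own, not from the session above):

```python
import http.client
import io

class FakeSock:
    """Minimal stand-in for a socket: HTTPResponse only calls makefile()."""
    def __init__(self, data):
        self._data = data
    def makefile(self, mode):
        return io.BytesIO(self._data)
    def close(self):
        pass

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Date: Fri, 27 Mar 2009 00:41:34 GMT\r\n"
       b"Content-Type: text/html; charset=ISO-8859-1\r\n"
       b"Content-Length: 0\r\n"
       b"\r\n")

resp = http.client.HTTPResponse(FakeSock(raw))
resp.begin()  # parse the status line and headers

print(resp.info() is resp.headers)  # same object
print(resp.headers['Date'])
```

So documenting x.headers and mentioning x.info() as an alias (or vice versa) would at least be honest about what the code does.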
>>> x.headers['Content-Type']
'text/html; charset=ISO-8859-1'
I guess technically this is correct, since the charset is part of the Content-Type header in HTTP, but it does make life difficult for what I think will be a very common use case in the new urllib: read bytes from the URL and then decode them into a string using the appropriate character set.
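That common case can still be handled in a few lines; here is a sketch (read_text and the utf-8 fallback are my own choices, not anything the library mandates, and the fallback matters because a server may omit the charset entirely):

```python
from urllib.request import urlopen

def read_text(url, fallback="utf-8"):
    """Fetch url and decode the body with the server's declared charset."""
    resp = urlopen(url)
    charset = resp.headers.get_content_charset() or fallback
    return resp.read().decode(charset)

# The decode step itself, on literal bytes a server might send:
body = "café".encode("iso-8859-1")
text = body.decode("iso-8859-1")
```

Something like this might even be worth showing in the urllib documentation, since nothing in the API makes the bytes-vs-str step obvious.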
Following this road, you run into three confusingly similar calls:
>>> x.headers.get_charset()
>>> x.headers.get_content_charset()
'iso-8859-1'
>>> x.headers.get_charsets()
['iso-8859-1']
I think it's a bug that get_charset() returns nothing in this case. It is not at all clear why get_charset() and get_content_charset() should behave differently.
Brad