
classification
Title: urllib and cookie module improvements
Type: enhancement Stage:
Components: None Versions:
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: jjlee, phr
Priority: normal Keywords:

Created on 2003-11-13 20:56 by phr, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg61131 - Author: paul rubin (phr) Date: 2003-11-13 20:56
1. The Cookie module should do a better job of parsing
real-world cookies (the values that HTTP servers send in
Set-Cookie: headers) and should also have a documented
way to emit a client-side cookie (i.e. generate a correct
Cookie: header from a cookie object).

2. urllib or urllib2 should be enhanced to read incoming
cookie headers and send back the appropriate cookies in
the event of an HTTP redirect.  Many sites set a cookie
and then redirect to another location that tries to read
it; if the cookie isn't there, the new location bounces
back to the original one to set the cookie, so you get a
redirection loop.

3. The scheme of having urllib.urlopen() return the HTTP
headers in a dictionary-like object doesn't quite work:
for example, there can be several Set-Cookie headers in
a single HTTP response.  I don't know whether the opener
currently combines them or discards some; neither
behaviour is really satisfactory.  There really should be
a list for each header type, but that would break the
existing published interface, so maybe a new 'urllib3' is
needed.
I'm just starting to explore this stuff but it seems to me 
like a serious urllib module needs to do quite a bit more 
than the existing ones do.  The Perl LWP documentation 
might be a good place to look for inspiration.
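
A minimal sketch of what point 1 asks for, written against the Python 3 standard library rather than the Cookie module as it existed when this was filed. It assumes http.cookies.SimpleCookie (the Python 3 name of that module) and uses a made-up Set-Cookie value; the join that builds the Cookie: header is an illustration only and does no domain, path, or expiry matching.

```python
from http.cookies import SimpleCookie

# Hypothetical value of a Set-Cookie: header received from a server.
set_cookie_value = "session=abc123; Path=/"

# Parse the header value into name -> Morsel entries.
jar = SimpleCookie()
jar.load(set_cookie_value)

# Build the value of a client-side Cookie: header. Only name=value pairs
# are sent back; attributes such as Path stay on the client.
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in jar.items()
)

print(jar["session"]["path"])
print(cookie_header)
```

For real client-side handling (deciding which cookies apply to which request, including across redirects), http.cookiejar together with urllib.request.HTTPCookieProcessor is the more appropriate tool.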
msg61132 - Author: John J Lee (jjlee) Date: 2003-12-03 19:08
1. and 2. are dealt with in other tracker items, and 3. is
incorrect, so this should be closed. 
 
If possible, it's better to submit different issues separately. 
 
1. See http://wwwsearch.sf.net/ClientCookie.  It doesn't use the 
Cookie module, since the code in the two modules is almost 
disjoint, and it would just obfuscate ClientCookie, really.  (Oh, 
it's Paul Rubin... I see from your recent c.l.py message you've 
just noticed this module :-) 
 
2. As for 1. 
 
I'm working on getting ClientCookie into a state suitable for the 
standard library.  See also patches 852995, which makes it 
possible to implement cookie handling in a urllib2 handler, and  
548197, which is somebody else's old cookie-handling patch. 
 
3. You can already get the separate headers too: 
response.info().getallmatchingheaders("Set-Cookie"). 
msg61133 - Author: John J Lee (jjlee) Date: 2003-12-09 22:59
Hmm, on 3., it's true that there is no documented way of 
getting at multiple headers (and in fact, at the moment the 
object returned by urlopen(url).info() is a subclass of 
mimetools.Message, which is deprecated, 
so .getallmatchingheaders() might well disappear soon).  
 
CVS rev 1.57 of httplib attempted to fix this (bug 432621), but  
the solution (making headers available joined with commas) is 
not sufficient, thanks to the nonstandard behaviour of 
Set-Cookie headers (Netscape cookie values may contain 
unquoted commas, in violation of RFC 2616). 
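
To illustrate why comma-joining is not sufficient, here is a small sketch with made-up cookie values. The first value carries a Netscape-style Expires date, which contains an unquoted comma; once the two headers are joined, a naive split on commas can no longer recover the original pair.

```python
# Two hypothetical Set-Cookie header values from one response.
set_cookie_values = [
    "id=a3f; Expires=Wed, 09-Jun-2021 10:18:14 GMT",
    "theme=dark",
]

# What a "join repeated headers with commas" scheme would hand back.
joined = ", ".join(set_cookie_values)

# Splitting on commas cuts the Expires date in half.
naive_split = [part.strip() for part in joined.split(",")]

print(len(set_cookie_values), len(naive_split))
for part in naive_split:
    print(repr(part))
```

Two headers go in and three fragments come out, so the caller cannot tell where one cookie ends and the next begins.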
 
I suppose in future, HTTP response objects will be 
implemented using email.Message objects (since mimetools is 
deprecated), so it seems reasonable to add and document 
a .get_all(hdr_name) method to httplib.HTTPMessage (perhaps 
by going ahead and reimplementing it using email.Message). 
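
A rough sketch of the kind of access proposed here, using the email package in Python 3 (where the class lives in email.message). The header text is invented for the example; it is parsed directly instead of coming from a live HTTP response.

```python
from email.parser import Parser

# Hypothetical response headers with a repeated Set-Cookie field.
raw_headers = (
    "Content-Type: text/html\n"
    "Set-Cookie: id=a3f; Path=/\n"
    "Set-Cookie: theme=dark; Path=/\n"
    "\n"
)

msg = Parser().parsestr(raw_headers, headersonly=True)

# Dictionary-style access returns only the first matching header...
first_cookie = msg["Set-Cookie"]

# ...while get_all() returns every occurrence, in order.
all_cookies = msg.get_all("Set-Cookie")

print(first_cookie)
print(all_cookies)
```

In Python 3, http.client.HTTPMessage is a subclass of email.message.Message, so the same get_all() call should work on the headers of a real response.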
 
I'll put it on my list to write a patch. 
msg62894 - Author: John J Lee (jjlee) Date: 2008-02-24 12:40
This should be closed.
History
Date                 User          Action  Args
2022-04-11 14:56:01  admin         set     github: 39543
2008-04-28 19:44:19  georg.brandl  set     status: open -> closed
2008-02-24 12:40:31  jjlee         set     messages: + msg62894
2003-11-13 20:56:28  phr           create