classification
Title: continuing problem with httplib multiple set-cookie headers
Type: behavior Stage: resolved
Components: Documentation, Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: docs@python Nosy List: BreamoreBoy, ajaksu2, davidma, docs@python, georg.brandl, ggenellina, jjlee, piotr.dobrogost
Priority: normal Keywords: easy

Created on 2007-02-14 19:52 by davidma, last changed 2014-08-30 03:57 by terry.reedy. This issue is now closed.

Messages (17)
msg31267 - (view) Author: David Margrave (davidma) Date: 2007-02-14 19:52
This is related to [ 432621 ] httplib: multiple Set-Cookie headers, which I was unable to re-open.

The workaround adopted in the previous bug tracker item was to combine multiple set-cookie headers received from the server into a single set-cookie element in the headers dictionary, with the cookies joined into a comma-separated string.

The problem arises when a comma character appears inside the 'expires' field of one of the cookies.  This makes it difficult to split the cookie headers back apart.  The comma character should be escaped, or a different separator character used.

i.e.

expires=Sun, 17-Jan-2038 19:14:07 GMT

For now I am using the workaround that gstein suggested: use response.msg.getallmatchingheaders().

Python 2.3 has this behavior, and probably later versions.

msg31268 - (view) Author: John J Lee (jjlee) Date: 2007-03-14 20:48
I'm not sure what your complaint is.  What's wrong with response.msg.getallmatchingheaders()?
msg31269 - (view) Author: David Margrave (davidma) Date: 2007-03-14 21:10

getallmatchingheaders() works fine.

The problem is with the self.headers in the SimpleHTTPRequestHandler and derived classes.

A website may send multiple set-cookie headers, using gmail.com as an example: 

Set-Cookie: GMAIL_RTT=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail
Set-Cookie: GMAIL_LOGIN=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail

The SimpleHTTPRequestHandler class combines multiple set-cookie response headers into a single comma-separated string which it stores in the headers dictionary

i.e. 

self.headers ['set-cookie'] =  GMAIL_RTT=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail, GMAIL_LOGIN=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail

The problem is that if your code reads self.headers['set-cookie'] and uses string.split on the comma delimiter to recover the original distinct cookie values, you'll run into trouble, because the comma character also appears within the cookies' expiration fields, such as Expires=Tue, 13-Mar-07 21:03:04 GMT
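
To make the splitting problem concrete, here is a minimal sketch using the two gmail.com cookies from above. It does not call httplib; it only mimics the comma join, and the variable names are illustrative.

```python
# The two Set-Cookie values as the server sent them (from the example above).
cookies = [
    "GMAIL_RTT=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail",
    "GMAIL_LOGIN=EXPIRED; Domain=.google.com; Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail",
]

# What ends up in the headers dictionary: one comma-joined string.
joined = ", ".join(cookies)

# Naive attempt to recover the original cookies by splitting on the comma.
pieces = [piece.strip() for piece in joined.split(",")]

print(len(cookies))  # 2 cookies went in
print(len(pieces))   # 4 fragments come out
for piece in pieces:
    print(repr(piece))
```

Each Expires date contributes a comma of its own, so the split cuts every cookie in two right after the weekday name.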

Again, getallmatchingheaders() is fine as an alternative, but as long as you are going to the trouble of storing multiple set-cookie response headers in the self.headers dict, using a delimiter of some sort, I'd argue you might as well also take care that your delimiter is either unique or escaped within the fields you are delimiting.


msg31270 - (view) Author: John J Lee (jjlee) Date: 2007-03-14 23:57
SimpleHTTPRequestHandler is not part of httplib.  Did you mean to refer to module SimpleHTTPServer rather than httplib, perhaps?

I don't see the particular bit of code you refer to (neither in httplib nor in module SimpleHTTPServer), but re the general issue:

Regardless of the fact that RFC 2616 ss. 4.2 says headers MUST be able to be combined with commas, Netscape Set-Cookie headers simply don't work that way, and Netscape Set-Cookie headers are here to stay.  So, Set-Cookie headers must not be combined.

(Quoting does not help, because Netscape Set-Cookie headers contain cookie values that 1. may contain commas and 2. do not support quoting -- any quote (") characters are in fact part of the cookie value itself rather than being part of a quoting mechanism.  And there is no precedent for any choice of delimiter other than a comma, nor for any other Netscape Set-Cookie cookie value quoting mechanism.)
msg31271 - (view) Author: David Margrave (davidma) Date: 2007-03-15 00:30

fair enough, the RFC says they have to be joinable with commas, so the behavior is correct.  I can get by with getallmatchingheaders if I need access to the original individual cookie values.

thanks,

dave
msg31272 - (view) Author: John J Lee (jjlee) Date: 2007-03-15 00:45
Huh?

1. *What* behaviour is correct?  You still have not said which bit of code you're talking about, or even which module.

2. You seem to have got the sense of what I said backwards.  As I said, RFC 2616 is (in practice) WRONG about joining with commas being OK for Set-Cookie.  Set-Cookie headers must NOT be joined with commas, despite what RFC 2616 says.
msg31273 - (view) Author: David Margrave (davidma) Date: 2007-03-15 00:58

See the addheader method of the HTTPMessage class in httplib.py

    def addheader(self, key, value):
        """Add header for field key handling repeats."""
        prev = self.dict.get(key)
        if prev is None:
            self.dict[key] = value
        else:
            combined = ", ".join((prev, value))
            self.dict[key] = combined


also see the original tracker entry where this fix was first discussed & implemented

https://sourceforge.net/tracker/index.php?func=detail&aid=432621&group_id=5470&atid=105470




msg31274 - (view) Author: John J Lee (jjlee) Date: 2007-03-15 20:19
OK, thanks!

That certainly does look wrong.  The Set-Cookie case should be special-cased in that method, I think.
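
A rough sketch of what such special-casing could look like. This is not the httplib code: HeaderStore and UNJOINABLE are hypothetical names, and only the join logic is modelled on the addheader() method quoted above.

```python
class HeaderStore:
    """Hypothetical stand-in for the dict handling in HTTPMessage.addheader()."""

    # Assumption: Set-Cookie is the only header that must never be comma-joined.
    UNJOINABLE = {"set-cookie"}

    def __init__(self):
        self.dict = {}

    def addheader(self, key, value):
        """Add a header value, joining repeats except for UNJOINABLE names."""
        key = key.lower()
        if key in self.UNJOINABLE:
            # Keep each value separate instead of building one string.
            self.dict.setdefault(key, []).append(value)
            return
        prev = self.dict.get(key)
        if prev is None:
            self.dict[key] = value
        else:
            self.dict[key] = ", ".join((prev, value))


store = HeaderStore()
store.addheader("Set-Cookie", "A=1; Expires=Tue, 13-Mar-07 21:03:04 GMT")
store.addheader("Set-Cookie", "B=2; Expires=Tue, 13-Mar-07 21:03:04 GMT")
store.addheader("Accept", "text/html")
store.addheader("Accept", "text/plain")

print(store.dict["set-cookie"])  # a list with two separate cookie strings
print(store.dict["accept"])      # text/html, text/plain
```

Note that this changes the type of the stored Set-Cookie value from a string to a list, so callers that expect a string would break.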
msg31275 - (view) Author: John J Lee (jjlee) Date: 2007-03-15 20:46
Hold on, httplib.HTTPMessage.addheader() is undocumented, hence private.  httplib.HTTPMessage.readheaders() itself calls that method, but also keeps the raw multiple-header data in the .headers list, so .getallmatchingheaders() still works.

So the only bug I see is that the documentation for APIs that return should point out the fact that Set-Cookie is an oddity, and that .getallmatchingheaders() should be used in that case.
msg31276 - (view) Author: John J Lee (jjlee) Date: 2007-03-15 20:49
Sorry, my last paragraph got garbled a bit, here it is again:

So the only bug I see is that the documentation for APIs that always return single headers (as opposed to lists of headers) should point out the fact that Set-Cookie is an oddity, and that .getallmatchingheaders() should be used in that case.
msg81471 - (view) Author: Gabriel Genellina (ggenellina) Date: 2009-02-09 19:03
I think this report is outdated and no longer relevant.
msg81489 - (view) Author: John J Lee (jjlee) Date: 2009-02-09 21:01
Why?
msg81494 - (view) Author: David Margrave (davidma) Date: 2009-02-09 21:37
I'm not down in the weeds on this one at the moment (it was a long time
ago and I've mostly forgotten about it), but recall that I agreed with
jjlee's 3/15/07 annotation:

http://bugs.python.org/msg31276

At least, I was able to get my application working by just using
getallmatchingheaders().
msg86316 - (view) Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-04-22 18:50
John J Lee wrote:
> Hold on, httplib.HTTPMessage.addheader() is undocumented, hence private.

Not so easy to know, as many things in the network libs are
undocumented. And it can still be wrong, regardless of being private.
msg117103 - (view) Author: John J Lee (jjlee) Date: 2010-09-21 21:31
What I said in 2007 re commas could be well out of date (might well have been so even then, in fact).  Somebody should check what browsers do now...
msg179391 - (view) Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-01-08 23:12
@jjlee

What you said re commas in 2007 was wrong and still is. Joining (with commas) multiple header field values having the same field name unconditionally (without knowing it's safe) was not allowed by RFC 2616 and still is not allowed by the upcoming new RFC. See my comment at http://bugs.python.org/issue4773#msg179377

This was fixed in Python 3 - see http://bugs.python.org/issue4773#msg154781 As this is a backward-incompatible change (and I guess whether this is private API or not does not matter here) and there's a working alternative (although it's private API), nothing will be done here.
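
For reference, a minimal Python 3 sketch of reading repeated Set-Cookie headers without any comma join. It assumes the headers arrive as raw bytes and feeds them through http.client.parse_headers(); the sample header block is the gmail.com example from earlier in this issue.

```python
import io
import http.client

# Raw header block as it might arrive on the wire, ending with a blank line.
raw = (
    b"Set-Cookie: GMAIL_RTT=EXPIRED; Domain=.google.com; "
    b"Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail\r\n"
    b"Set-Cookie: GMAIL_LOGIN=EXPIRED; Domain=.google.com; "
    b"Expires=Tue, 13-Mar-07 21:03:04 GMT; Path=/mail\r\n"
    b"\r\n"
)

# parse_headers() returns an http.client.HTTPMessage (an email.message.Message).
msg = http.client.parse_headers(io.BytesIO(raw))

# get_all() returns one list entry per header line, commas left untouched.
cookies = msg.get_all("Set-Cookie")
for cookie in cookies:
    print(cookie)
```

With a live connection the same call should be available as response.headers.get_all("Set-Cookie") on the HTTPResponse object.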
msg222415 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014-07-06 19:45
msg179391 states this is fixed in Python 3 so we can close this as "out of date".
History
Date User Action Args
2014-08-30 03:57:49 terry.reedy set status: open -> closed
resolution: out of date
stage: test needed -> resolved
2014-07-06 19:45:06 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg222415
2013-01-08 23:12:08 piotr.dobrogost set messages: + msg179391
2012-08-23 14:30:21 moijes12 set nosy: - moijes12
2012-07-03 04:31:16 moijes12 set nosy: + moijes12
2011-12-04 22:33:11 piotr.dobrogost set nosy: + piotr.dobrogost
2010-09-21 21:31:12 jjlee set messages: + msg117103
2010-09-16 20:59:22 BreamoreBoy set assignee: georg.brandl -> docs@python
nosy: + docs@python
versions: + Python 3.1, Python 3.2
2009-04-22 18:50:16 ajaksu2 set nosy: + ajaksu2
messages: + msg86316
components: + Library (Lib)
keywords: + easy
stage: test needed
2009-02-09 21:37:16 davidma set messages: + msg81494
2009-02-09 21:01:09 jjlee set messages: + msg81489
2009-02-09 19:03:07 ggenellina set nosy: + ggenellina
messages: + msg81471
2009-02-09 04:41:43 ajaksu2 set assignee: georg.brandl
type: behavior
nosy: + georg.brandl
components: + Documentation, - Library (Lib)
versions: + Python 2.7, - Python 2.3
2007-02-14 19:52:33 davidma create