classification
Title: _LegalCharsPatt in cookies.py includes illegal characters
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.3, Python 2.7
process
Status: closed Resolution: duplicate
Dependencies: Superseder: comma separated cookie values (issue 1210326)
Assigned To: Nosy List: Simon.Blanchard, grahamd, r.david.murray
Priority: normal Keywords:

Created on 2012-10-30 07:07 by Simon.Blanchard, last changed 2012-10-31 08:02 by Simon.Blanchard. This issue is now closed.

Messages (5)
msg174183 - Author: Simon Blanchard (Simon.Blanchard) Date: 2012-10-30 07:07
_LegalCharsPatt  = r"[\w\d!#%&'~_`><@,:/\$\*\+\-\.\^\|\)\(\?\}\{\=]"

The above regex in cookies.py includes the comma character, but RFC 6265 https://tools.ietf.org/html/rfc6265 section 4.1.1 says:

 cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash

That is, no comma.
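
The difference can be checked mechanically. The following is a minimal sketch, assuming Python 3.4+ for `re.fullmatch`; the names `legal_chars_patt`, `rfc_cookie_octet` and `accepted` are illustrative, and the second character class is a hand transcription of the ABNF above, not something taken from cookies.py.

```python
import re

# Pattern quoted in this report (from cookies.py).
legal_chars_patt = r"[\w\d!#%&'~_`><@,:/\$\*\+\-\.\^\|\)\(\?\}\{\=]"

# RFC 6265 cookie-octet, transcribed by hand from the ABNF above (assumption).
rfc_cookie_octet = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]"


def accepted(pattern, ch):
    """Return True if the single character ch matches the character class."""
    return re.fullmatch(pattern, ch) is not None


# Printable ASCII characters that the library pattern accepts
# but the RFC's cookie-octet excludes.
extra = [
    chr(c)
    for c in range(0x20, 0x7F)
    if accepted(legal_chars_patt, chr(c)) and not accepted(rfc_cookie_octet, chr(c))
]

print(extra)  # [',']
```

Within printable ASCII, the comma is the only character the library pattern allows that cookie-octet forbids.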
msg174210 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-10-30 13:37
This is a pragmatic choice.  Try searching the tracker for 'cookie comma', and read about the lack of adherence to cookie RFCs by the major browsers.  Specifically, I think issue 1210326 is relevant here, and am closing this as a duplicate of that issue.  If you disagree, I think we'll need examples from real-world browser/server situations where this is an incorrect choice in order to consider changing it.

You will note that the comment block before that equate mentions that it does not follow the RFCs for pragmatic reasons.
msg174262 - Author: Simon Blanchard (Simon.Blanchard) Date: 2012-10-31 04:28
I have a real world example. Using Apache, mod_wsgi and Django. Given this in the META dict:

 'HTTP_COOKIE': 'yaean_djsession=23ab7bf8b260cbb2f2bc80b1c1fd98fa, yaean_yasession=ff2a3030ee3f428f91c6f554a63b459c',

Django via the Python cookie api gives this:

COOKIES:{'yaean_djsession': '23ab7bf8b260cbb2f2bc80b1c1fd98fa,',
 'yaean_yasession': 'ff2a3030ee3f428f91c6f554a63b459c'},

Note the comma on the end of the cookie named yaean_djsession in COOKIES. It should not be there. In this case session lookup fails.
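
The effect can be reproduced without Django. The sketch below is a simplified stand-in for the library's parser, not its actual code: `PAIR` and `parse` are hypothetical names, and the pattern only handles unquoted `key=value` pairs built from the `_LegalCharsPatt` character class quoted in the first message.

```python
import re

# Character class quoted at the top of this report (from cookies.py).
LEGAL = r"[\w\d!#%&'~_`><@,:/\$\*\+\-\.\^\|\)\(\?\}\{\=]"

# Simplified key=value matcher (assumption: no quoted values, no attributes).
PAIR = re.compile(
    r"\s*(?P<key>" + LEGAL + r"+?)"   # shortest run of legal chars before '='
    r"\s*=\s*"
    r"(?P<val>" + LEGAL + r"*)"       # longest run of legal chars, comma included
    r"\s*(\s+|;|$)"                   # pair ends at whitespace, ';' or end of string
)


def parse(header):
    """Collect key/value pairs from a Cookie header string."""
    return {m.group("key"): m.group("val") for m in PAIR.finditer(header)}


comma_header = ("yaean_djsession=23ab7bf8b260cbb2f2bc80b1c1fd98fa, "
                "yaean_yasession=ff2a3030ee3f428f91c6f554a63b459c")
semicolon_header = comma_header.replace(", ", "; ")

print(parse(comma_header)["yaean_djsession"])      # value ends with ','
print(parse(semicolon_header)["yaean_djsession"])  # value has no trailing ','
```

Because the comma is a legal value character, the value match runs up to the following space and keeps the comma. With a semicolon separator the value stops at the semicolon.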
msg174263 - Author: Graham Dumpleton (grahamd) Date: 2012-10-31 04:38
For that cookie string to be valid in the first place, shouldn't it have been sent as:

'HTTP_COOKIE': 'yaean_djsession=23ab7bf8b260cbb2f2bc80b1c1fd98fa; yaean_yasession=ff2a3030ee3f428f91c6f554a63b459c'

IOW, semicolon as separator.

What client generated that HTTP Cookie header with commas in it?

The only way I can see you ending up with that, if the client isn't broken, is if the application originally sent both values in a single Set-Cookie response header, with a comma as the separator. When that came back from the client to the application in the same form, the cookie parser then did the wrong thing with it.

If this is a browser client, check the browser cookie cache to see what it is stored as in there.
msg174267 - Author: Simon Blanchard (Simon.Blanchard) Date: 2012-10-31 08:02
'HTTP_USER_AGENT': 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)',

It's the Baidu spider according to the user agent string. (Baidu is the biggest search engine in China.) The serving app is Django + mod_wsgi + Apache - which I think must be OK. I guess the Baidu spider is broken?

Thanks
History
Date                 User             Action  Args
2012-10-31 08:02:56  Simon.Blanchard  set     messages: + msg174267
2012-10-31 04:38:30  grahamd          set     nosy: + grahamd; messages: + msg174263
2012-10-31 04:28:54  Simon.Blanchard  set     messages: + msg174262
2012-10-30 13:37:31  r.david.murray   set     status: open -> closed; superseder: comma separated cookie values; nosy: + r.david.murray; messages: + msg174210; resolution: duplicate; stage: committed/rejected
2012-10-30 07:07:29  Simon.Blanchard  create