classification
Title: http.cookies: Handle malformed cookie
Type: enhancement Stage: test needed
Components: Library (Lib) Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: georg.brandl, karlcow, keakon, martin.panter, spookylukey, terry.reedy
Priority: normal Keywords:

Created on 2013-03-03 11:10 by keakon, last changed 2016-02-08 07:11 by martin.panter.

Messages (13)
msg183367 - (view) Author: keakon (keakon) Date: 2013-03-03 11:10
One of my user told me that she couldn't login to my website yesterday. I logged her cookie, and found it began with ',BRIDGE_R=;' which was a malformed cookie.
Tornado uses Cookie.SimpleCookie.load() to parse her cookie, and returns an empty dict when catching an exception such as CookieError.

In that case, Tornado has to treat her as a new user since it believes she didn't provide any cookies.
Even after Tornado tried to set cookie (like user_id) for her, it still couldn't parse her cookie the next time.

I checked Issue2193 and found the patch provided by spookylukey could fix the bug, but it was rejected.
Why not add a default parameter like strict=True, and let users to decide whether to ignore invalid keys or to raise an error?

I believe SimpleCookie is useless for handling malformed cookies right now.
If it's still not acceptable, should I implement my own Cookie class for Tornado like Django did (https://github.com/django/django/blob/master/django/http/cookie.py)?
msg183781 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-03-09 01:23
Code behaving as documented is not a bug for tracker purposes.  Adding a parameter to allow new behavior is an enhancement for a future release. 

Who is responsible for the invalid cookie. Pardon my ignorance, but if tornado re-sets the cookie, why cannot it read it the next time?

If the existing test suite tests for CookieError for invalid cookies, writing tests for strict=False (return instead of CookieError) would be trivial.
msg183789 - (view) Author: keakon (keakon) Date: 2013-03-09 03:31
Terry, say that a user's cookie is ",BRIDGE_R=; a=b;" right now.
When he login, the server sends "Set-Cookie: user_id=1; Path=/" header to him.
Then his cookie is ",BRIDGE_R=; a=b; user_id=1;" now.

The next time he sends cookie to the server, Cookie.SimpleCookie.load() tries to parse the cookie, but raises a CookieError.
So the server has no way to get his user_id from cookie. It has to let him login again and sends "Set-Cookie: user_id=1; Path=/" header infinitely.

I cannot clear all cookies because Cookie.SimpleCookie.load() even dosen't let me know the keys in his cookie.
msg183790 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-03-09 03:42
keakon, changing the headers after a developer sets them is insulting, annoying, a waste of my time to change them back again, and a distraction from the issue.
msg183794 - (view) Author: keakon (keakon) Date: 2013-03-09 04:16
Terry, I think that's the standard process of web applications.

1. The user agent send cookie via HTTP headers to the web server.
2. The web server parse its cookie. If the server fails to find something proves the user has logged in from his cookie, redirect him to the login page.
3. The user agent post login information to the web server.
4. The web server verify the post data. If it's correct, the server send Set-Cookie headers which can be used as a proof in the step 2 to the user agent.

After the 4 steps, the user agent should be considered as a logged-in user.
However, in the step 2, the server cannot parse his cookie duo to CookieError. It has to redirect the user to the login page and continue the next steps.

I don't think there is anything wrong with the process except the strange behavior of Cookie.SimpleCookie.load().
msg183799 - (view) Author: karl (karlcow) * Date: 2013-03-09 06:59
Just a quick note that the new specification for HTTP State Mechanism (aka cookies) is http://tools.ietf.org/html/rfc6265

keakon, Do you know why her cookie was ',BRIDGE_R=;'
msg183800 - (view) Author: keakon (keakon) Date: 2013-03-09 07:17
karl, I don't know the exact reason.
"BRIDGE_R" is a cookie name set by Baidu Bridge. I don't know why there is a comma before it.

The Baidu Bridge is an external JavaScript resource. It can do anything like: document.cookie = ",BRIDGE_R=;";
I think Baidu Bridge set the wrong cookie by mistake. But we still rely on Baidu Bridge, and we have no way to clear the wrong cookie.
msg183805 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-03-09 07:40
Carl, do you know if the (2 year old) draft better reflect actual usage than 2965? Is there much change other than "deprecates the use of the Cookie2 and Set-Cookie2 header fields."?
msg183812 - (view) Author: karl (karlcow) * Date: 2013-03-09 12:24
Yes the new RFC has been written by Adam Barth who wanted to describe things matching the reality of HTTP and servers/browsers issues.
msg183839 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2013-03-09 19:16
I believe our normal policy is to only follow accepted RFCs. But your comment suggests that in this case we should pay attention to the new draft. Do you have any idea why apparently nothing has happened in two years. Do some people oppose it?
msg183846 - (view) Author: Luke Plant (spookylukey) Date: 2013-03-09 20:28
I'm a core developer on Django, and I've looked into cookies a lot, and also Python's SimpleCookie, and I've found that all accepted RFCs are completely irrelevant for this issue.

No accepted RFC was ever widely implemented - instead browsers mainly did something like the original "Netscape cookies", with various interpretations. Opera attempted RFC 2965, at least at one point, but no-one else.

RFC 6265, whatever its status, is probably the closest thing to a useful document of how cookies "should" work. But even then, I'm afraid that the main guiding principle has to be sheer pragmatism. Read the source code or bug trackers of any other project that has to handle cookies and you'll find they have all come to that conclusion, unfortunately.
msg183863 - (view) Author: karl (karlcow) * Date: 2013-03-10 02:54
The current status of RFC6265 is PROPOSED STANDARD 
http://www.rfc-editor.org/info/rfc6265

Adam Barth is part of the Google Chrome Team. I do not want to talk for Adam. So better ask him, I don't think he has the energy/will to push further through the IETF process.
msg259823 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-08 07:11
The current Python 3.5 and default branches actually seem to parse the test case given:

>>> c = SimpleCookie()
>>> c.load(",BRIDGE_R=; a=b; user_id=1;")
>>> c
<SimpleCookie: ,BRIDGE_R='' a='b' user_id='1'>

But that is just a side effect of Issue 26302. When that is fixed, parsing the cookie string will raise CookieError and fail to set the invalid cookie “morsel”, and the ones that come after it.

There seems to be a disconnect between _LegalChars (causes the CookieError if a comma is in a cookie key name) and _LegalKeyChars (allows a comma, but causes cookie string parsing to silently abort for other illegal characters).

There are other cases where the entire cookie string is rejected, specifically added by Issue 22796 (revision a065ab1c67a8).

On the other hand, Issue 25228 has a which has a patch to skip over some invalid cookie “morsels” and continue on to valid ones.
History
Date User Action Args
2016-02-08 07:11:05martin.pantersetnosy: + martin.panter

messages: + msg259823
title: Handle malformed cookie -> http.cookies: Handle malformed cookie
2016-02-08 05:51:24martin.panterlinkissue22983 superseder
2013-03-10 02:54:57karlcowsetmessages: + msg183863
2013-03-09 20:28:42spookylukeysetmessages: + msg183846
2013-03-09 19:16:21terry.reedysetmessages: + msg183839
2013-03-09 12:24:50karlcowsetmessages: + msg183812
2013-03-09 07:40:21terry.reedysetmessages: + msg183805
2013-03-09 07:17:30keakonsetmessages: + msg183800
2013-03-09 06:59:58karlcowsetnosy: + karlcow
messages: + msg183799
2013-03-09 04:16:10keakonsetmessages: + msg183794
2013-03-09 03:42:54terry.reedysettype: behavior -> enhancement
messages: + msg183790
versions: + Python 3.4, - Python 2.7
2013-03-09 03:31:55keakonsettype: enhancement -> behavior
messages: + msg183789
versions: + Python 2.7, - Python 3.4
2013-03-09 01:23:16terry.reedysetversions: + Python 3.4, - Python 2.7
nosy: + terry.reedy

messages: + msg183781

type: behavior -> enhancement
stage: test needed
2013-03-03 11:10:33keakoncreate