classification
Title: Regression in http.cookies parsing with brackets and quotes
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.6, Python 3.5, Python 3.4, Python 3.2, Python 3.3, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Pathangi Jatinshravan, Tim.Graham, collinanderson, harris, martin.panter, pitrou, r.david.murray
Priority: normal Keywords: 3.2regression, patch

Created on 2015-09-24 16:57 by Tim.Graham, last changed 2016-08-22 12:34 by martin.panter.

Files
File name Uploaded Description Edit
cookie-bracket-quotes-test.diff Tim.Graham, 2015-09-24 16:57 review
patch.diff Pathangi Jatinshravan, 2015-10-06 15:55 review
patch_final.diff Pathangi Jatinshravan, 2015-10-06 16:26 review
patch_unittest.diff Pathangi Jatinshravan, 2015-10-06 17:37 review
patch_with_test.diff Pathangi Jatinshravan, 2015-10-07 12:53 review
patch_str_find.diff Pathangi Jatinshravan, 2015-10-08 14:26 review
patch_review.diff Pathangi Jatinshravan, 2015-11-01 15:26 review
cookie-bracket-quotes.diff collinanderson, 2016-02-10 18:12 Logically the same, just more clear. review
Messages (29)
msg251538 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-09-24 16:57
Regression in https://hg.python.org/cpython/rev/9e765e65e5cb (affects 2.7 and 3.2+), similar to issue22931 where inserting an invalid cookie value can cause the rest of the cookie to be ignored. A test is attached, and here's a quick demo:

Old:
>>> from http.cookies import SimpleCookie
>>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
{'a': 'b', 'c': 'd', 'messages': ''}

New:
>>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
{'a': 'b'}

Reported in Django's tracker, but Django simply delegates to SimpleCookie: https://code.djangoproject.com/ticket/25458
msg252193 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-03 05:13
Thanks for the test case. It looks like the commit in question was done as a security fix in 3.2.6 and 3.3.6. I’m not sure on the policy, but maybe that justifies putting any fixes into 3.2+.

I’m not familiar with HTTP cookies. Is this a case of a 100% specification-compiliant cookie, or a technically invalid one that would be nice to handle better? If the second case, maybe it is an instance of Issue 22983.
msg252213 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-10-03 14:45
It might be a case of issue22983. I'll try to look into the details and offer a patch next week.

For what it's worth, there are other regressions in Python 3.2 cookie parsing that makes the latest patch release (3.2.6) unusable with Django (issue22758), so from my perspective fixing this issue there isn't as high priority as that one.
msg252331 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-05 16:05
Hi, can I be assigned to this behaviour issue?
msg252332 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-10-05 16:06
Sure, feel free to propose a patch.
msg252333 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-05 16:09
This is the first ever bug I will be working on so there might be a bit of a learning curve, but I'll do my best to come out with something by this week.
msg252400 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-06 15:55
Hi I have made a patch for this, can anyone review and let me know?
msg252401 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-06 16:01
Oops, sorry looks like a unit test is failing. I will fix it and submit another one soon.
msg252404 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-06 16:26
Hi Tim, I have submitted a patch for this issue (patch_final.diff, the earlier one failed a UT). Now all UTs are passing. Can you take a look at this?
msg252410 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-10-06 17:09
Could you please integrate my unit test into your patch?

You also need to sign the PSF Contributor Agreement:
https://www.python.org/psf/contrib/contrib-form/
msg252415 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-06 17:37
Added a patch where unit test has been modified to include the above case. I have signed the agreement.
msg252417 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-10-06 17:55
I had already proposed a test, see cookie-bracket-quotes-test.diff. What I meant was that the fix and the test should be combined into a single patch.
msg252469 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-07 12:53
Is this what you wanted?
msg252488 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-07 23:50
Hi Tim, I have submitted a patch (patch_with_test.diff). Can you take a look at this?
msg252490 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015-10-07 23:54
Yes, when I have some time.

By the way, did you intentionally remove all the "Python 3.X" versions on the issue?
msg252500 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-08 01:43
Oh not intentional. Must have clicked something by mistake
msg252507 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-08 03:10
Instead of the while loop, can’t you use something like str.find(";", i)?
msg252542 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-08 14:26
Hi, I've made the change to use str.find() and removed the while loop, can you take a look at it?
msg252584 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-09 03:44
The str.find() call was kind of what I had in mind. But I don’t feel qualified to say whether the fix is good in general. I would have to find out about at the Cookie header format, and understand what the security implications are to do with lax parsing.
msg252612 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-09 14:06
Yes, we should get signoff from someone who was involved in the original security fix, since it was a security fix.
msg253534 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-10-27 12:35
Has there been any movement on this issue?
msg253827 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-01 06:11
Adding Guido and Antoine, who committed the security fix as 9e765e65e5cb in 2.7 and 5cfe74a9bfa4 in 3.2. Perhaps you are able to help decide if the proposal here would affect the original security report. Basically this issue (as well as #22758 and #22983) are complaining that 3.2’s cookie parsing became too strict. People would like to parse subsequent cookie “morsels” if an earlier one is considered invalid, rather than aborting the parse completely.

All I can find out about the security report is from <https://bugs.python.org/issue22796#msg230650> and <https://hackerone.com/reports/14883>, but that doesn’t explain the test cases with square brackets in the cookie names.

RFC 6265 says double quotes (") are not meant to be sent by the server, but the client should tolerate them without any special handling (different to Python’s handling and earlier specs, which parse a special double-quoted string syntax). One potential problem that comes to mind is that the current patch blindly searches for the next semicolon “;”, which would not be valid inside a double-quoted string, e.g. name="some;value".

Python behaviour:

* Before the 3.2 security fix, square brackets and double quotes caused truncation of the cookie value, but subsequent cookies were still parsed in most cases

* The security fix prevents parsing of subsequent cookies (either on purpose or as a side effect)

* The HttpOnly and Secure support in 3.3+ (Issue 16611) prevents parsing of the cookie morsel with the offending square bracket or double quote. This is proposed for 3.2 backport in Issue 22758.

* Square brackets are now allowed in 3.2+ thanks to Issue 22931. So 3.2 should truncate the original test case at the double quote, while 3.3+ drops the offending cookie.

The current patch proposed here appears to solve Issue 22983 (permissive parsing) in general. If the current cookie does not match the syntax, it is skipped, by falling back to a search for a semicolon “;”. So I am inclined to close Issue 22983 as a duplicate of this issue.

And Tim, I understand your main interest in Issue 22758 is that parsing aborts for things like "a=value1; HttpOnly; b=value2". If this patch were ported to 3.2 it should also fix that for free.

Pathangi: did you see my review comment about unnecessary backslashes? I also left another comment today.
msg253837 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-11-01 10:08
Just saw the code review comments now, didn't know that there was a separate section for code review comments until now. Will take a look and implement them.
msg253855 - (view) Author: Pathangi Jatinshravan (Pathangi Jatinshravan) Date: 2015-11-01 15:26
New patch with code review comments incorporated.
msg253860 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-11-01 16:57
I'm coming at this without much context (I don't recall the original issue)
but IIUC from a security POV, lenient parsing is unsafe -- it could allow
an attacker to modify a cookie (or part of a cookie -- I'm unclear on the
correct terminology here) and that's what we're trying to avoid.
msg259824 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-08 07:20
Looking at this a second time, I think I have figured out what the security report was about. Before the fix (before revision 270f61ec1157), an attacker could trick the parser into accepting a separate key=value cookie “morsel”, when it was supposed to be part of some other cookie value. Suppose the “c=d” text was meant to be associated with the “message” key. Before the security fix, “c=d” is separated:

>>> SimpleCookie('a=b; messages=[""]c=d;')
<SimpleCookie: a='b' c='d'>

With the fix applied, we now silently abort the parsing, and there is no spurious “c” key:

>>> SimpleCookie('a=b; messages=[""]c=d;')
<SimpleCookie: a='b'>

This also seems to be described by Sergey Bobrov in Russian at <https://habrahabr.ru/post/272187/>.

Looking at the proposed patch again, I think the fix might be okay. Some specifications for cookies allow semicolons to be quoted or escaped, and I was a bit worried that this might be a problem. But all the scenarios I can imagine would be no worse with the patch compared to without it.
msg260028 - (view) Author: Collin Anderson (collinanderson) Date: 2016-02-10 18:12
The issue I'm currently running into, is that although browsers correctly ignore invalid Set-Cookie values, they allow 'any CHAR except CTLs or ";"' in cookie values set via document.cookie.

So, if you say document.cookie = 'key=va"lue; path=/', the browser will happily pass 'key=va"lue;' to the server on future requests.

So, I like the behavior of this patch, which skips over these invalid cookies and continues parsing. I've cleaned the patch up a little, but it should be the same logically.
msg260038 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-02-10 21:04
To move forward on this, I would like someone else (hopefully Antoine? :) to confirm my theory about the cookie injection attack, or otherwise explain why the patch won’t (re)open any security holes. Also, I would like to add some more test cases based on Sergey Bobrov’s post (especially the from the heading Особенности обработки Cookie #3).
msg261388 - (view) Author: Collin Anderson (collinanderson) Date: 2016-03-08 22:41
It should be safe to hard split on semicolon. `name="some;value"` is not valid, even though it's quoted. I think raw double quotes, commas, semicolons and backslashes are _always_ invalid characters in cookie values.

From https://tools.ietf.org/html/rfc6265:

{{{
 cookie-value      = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
 cookie-octet      = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                       ; US-ASCII characters excluding CTLs,
                       ; whitespace DQUOTE, comma, semicolon,
                       ; and backslash
}}}
History
Date User Action Args
2016-08-22 12:34:37martin.pantersettitle: Regression in cookie parsing with brackets and quotes -> Regression in http.cookies parsing with brackets and quotes
2016-03-08 22:41:20collinandersonsetmessages: + msg261388
2016-02-10 21:04:56martin.pantersetmessages: + msg260038
2016-02-10 18:12:51collinandersonsetfiles: + cookie-bracket-quotes.diff
nosy: + collinanderson
messages: + msg260028

2016-02-08 07:20:30martin.pantersetmessages: + msg259824
2016-02-05 21:22:27gvanrossumsetnosy: - gvanrossum
2016-02-05 14:44:25harrissetnosy: + harris
2016-01-28 11:44:04berker.peksaglinkissue26230 superseder
2015-11-01 16:57:10gvanrossumsetmessages: + msg253860
2015-11-01 15:26:45Pathangi Jatinshravansetfiles: + patch_review.diff

messages: + msg253855
2015-11-01 10:08:49Pathangi Jatinshravansetmessages: + msg253837
2015-11-01 06:11:43martin.pantersetnosy: + gvanrossum, pitrou
messages: + msg253827
2015-10-27 12:35:29Pathangi Jatinshravansetmessages: + msg253534
2015-10-09 14:06:32r.david.murraysetnosy: + r.david.murray
messages: + msg252612
2015-10-09 03:44:26martin.pantersetmessages: + msg252584
2015-10-08 14:26:19Pathangi Jatinshravansetfiles: + patch_str_find.diff

messages: + msg252542
2015-10-08 03:10:42martin.pantersetmessages: + msg252507
stage: needs patch -> patch review
2015-10-08 01:43:48Pathangi Jatinshravansetmessages: + msg252500
versions: + Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6
2015-10-07 23:54:04Tim.Grahamsetmessages: + msg252490
2015-10-07 23:50:49Pathangi Jatinshravansetmessages: + msg252488
2015-10-07 12:53:14Pathangi Jatinshravansetfiles: + patch_with_test.diff

messages: + msg252469
2015-10-06 17:55:39Tim.Grahamsetmessages: + msg252417
2015-10-06 17:37:21Pathangi Jatinshravansetfiles: + patch_unittest.diff

messages: + msg252415
2015-10-06 17:09:45Tim.Grahamsetmessages: + msg252410
2015-10-06 16:26:35Pathangi Jatinshravansetfiles: + patch_final.diff

messages: + msg252404
2015-10-06 16:01:07Pathangi Jatinshravansetmessages: + msg252401
2015-10-06 15:55:15Pathangi Jatinshravansetfiles: + patch.diff

messages: + msg252400
versions: - Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6
2015-10-05 16:09:49Pathangi Jatinshravansetmessages: + msg252333
2015-10-05 16:06:51Tim.Grahamsetmessages: + msg252332
2015-10-05 16:05:51Pathangi Jatinshravansetnosy: + Pathangi Jatinshravan
messages: + msg252331
2015-10-03 14:45:51Tim.Grahamsetmessages: + msg252213
2015-10-03 05:13:10martin.pantersetnosy: + martin.panter
messages: + msg252193

keywords: + 3.2regression
stage: needs patch
2015-09-24 16:57:56Tim.Grahamcreate