Regression in http.cookies parsing with brackets and quotes #69415
Regression in https://hg.python.org/cpython/rev/9e765e65e5cb (affects 2.7 and 3.2+), similar to bpo-22931 where inserting an invalid cookie value can cause the rest of the cookie to be ignored. A test is attached, and here's a quick demo:
Old:
>>> from http.cookies import SimpleCookie
>>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
{'a': 'b', 'c': 'd', 'messages': ''}
New:
>>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
{'a': 'b'}
Reported in Django's tracker, but Django simply delegates to SimpleCookie: https://code.djangoproject.com/ticket/25458 |
Thanks for the test case. It looks like the commit in question was done as a security fix in 3.2.6 and 3.3.6. I’m not sure on the policy, but maybe that justifies putting any fixes into 3.2+. I’m not familiar with HTTP cookies. Is this a case of a 100% specification-compliant cookie, or a technically invalid one that would be nice to handle better? If the second case, maybe it is an instance of bpo-22983. |
It might be a case of bpo-22983. I'll try to look into the details and offer a patch next week. For what it's worth, there are other regressions in Python 3.2 cookie parsing that make the latest patch release (3.2.6) unusable with Django (bpo-22758), so from my perspective fixing this issue there isn't as high a priority as that one. |
Hi, can I be assigned to this behaviour issue? |
Sure, feel free to propose a patch. |
This is the first ever bug I will be working on so there might be a bit of a learning curve, but I'll do my best to come out with something by this week. |
Hi I have made a patch for this, can anyone review and let me know? |
Oops, sorry looks like a unit test is failing. I will fix it and submit another one soon. |
Hi Tim, I have submitted a patch for this issue (patch_final.diff, the earlier one failed a UT). Now all UTs are passing. Can you take a look at this? |
Could you please integrate my unit test into your patch? You also need to sign the PSF Contributor Agreement: |
Added a patch where unit test has been modified to include the above case. I have signed the agreement. |
I had already proposed a test, see cookie-bracket-quotes-test.diff. What I meant was that the fix and the test should be combined into a single patch. |
Is this what you wanted? |
Hi Tim, I have submitted a patch (patch_with_test.diff). Can you take a look at this? |
Yes, when I have some time. By the way, did you intentionally remove all the "Python 3.X" versions on the issue? |
Oh not intentional. Must have clicked something by mistake |
Instead of the while loop, can’t you use something like str.find(";", i)? |
Hi, I've made the change to use str.find() and removed the while loop, can you take a look at it? |
The str.find() call was kind of what I had in mind. But I don’t feel qualified to say whether the fix is good in general. I would have to find out about the Cookie header format, and understand what the security implications of lax parsing are. |
Yes, we should get signoff from someone who was involved in the original security fix, since it was a security fix. |
Has there been any movement on this issue? |
Adding Guido and Antoine, who committed the security fix as 9e765e65e5cb in 2.7 and 5cfe74a9bfa4 in 3.2. Perhaps you are able to help decide if the proposal here would affect the original security report. Basically this issue (as well as bpo-22758 and bpo-22983) are complaining that 3.2’s cookie parsing became too strict. People would like to parse subsequent cookie “morsels” if an earlier one is considered invalid, rather than aborting the parse completely. All I can find out about the security report is from <https://bugs.python.org/issue22796#msg230650> and <https://hackerone.com/reports/14883>, but that doesn’t explain the test cases with square brackets in the cookie names. RFC 6265 says double quotes (") are not meant to be sent by the server, but the client should tolerate them without any special handling (different to Python’s handling and earlier specs, which parse a special double-quoted string syntax). One potential problem that comes to mind is that the current patch blindly searches for the next semicolon “;”, which would not be valid inside a double-quoted string, e.g. name="some;value". Python behaviour:
The current patch proposed here appears to solve bpo-22983 (permissive parsing) in general. If the current cookie does not match the syntax, it is skipped, by falling back to a search for a semicolon “;”. So I am inclined to close bpo-22983 as a duplicate of this issue. And Tim, I understand your main interest in bpo-22758 is that parsing aborts for things like "a=value1; HttpOnly; b=value2". If this patch were ported to 3.2 it should also fix that for free. Pathangi: did you see my review comment about unnecessary backslashes? I also left another comment today. |
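The skip-on-failure behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the patch attached to this issue: the `_MORSEL` pattern and the `parse_lenient` name are hypothetical and far simpler than the pattern used in `http.cookies`. The only part taken from the discussion is the fallback to `str.find(";", i)` when a morsel does not match.

```python
import re

# Hypothetical, simplified morsel pattern (NOT the one in http.cookies):
# a key, "=", then either a double-quoted string or a run of characters
# that excludes whitespace, double quotes, and semicolons.
_MORSEL = re.compile(r'''
    \s*(?P<key>[\w.-]+)        # cookie name
    \s*=\s*
    (?P<val>"[^"]*"|[^\s";]*)  # quoted string (quotes kept) or bare value
    \s*(?:;|$)                 # must end at a semicolon or end of input
''', re.VERBOSE)


def parse_lenient(header):
    """Collect key=value morsels, skipping any morsel that fails to match."""
    result = {}
    i, n = 0, len(header)
    while i < n:
        match = _MORSEL.match(header, i)
        if match:
            result[match.group('key')] = match.group('val')
            i = match.end()
        else:
            # Invalid morsel: resume after the next semicolon instead of
            # aborting the whole parse.
            j = header.find(';', i)
            if j == -1:
                break
            i = j + 1
    return result


print(parse_lenient('a=b; messages=[""]; c=d;'))
print(parse_lenient('a=value1; HttpOnly; b=value2'))
```

In this sketch the invalid `messages` morsel and the bare `HttpOnly` flag are dropped, while the morsels on either side of them survive.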
Just saw the code review comments now, didn't know that there was a separate section for code review comments until now. Will take a look and implement them. |
New patch with code review comments incorporated. |
I'm coming at this without much context (I don't recall the original issue) |
Looking at this a second time, I think I have figured out what the security report was about. Before the fix (before revision 270f61ec1157), an attacker could trick the parser into accepting a separate key=value cookie “morsel”, when it was supposed to be part of some other cookie value. Suppose the “c=d” text was meant to be associated with the “messages” key. Before the security fix, “c=d” is separated:
>>> SimpleCookie('a=b; messages=[""]c=d;')
<SimpleCookie: a='b' c='d'>
With the fix applied, we now silently abort the parsing, and there is no spurious “c” key:
>>> SimpleCookie('a=b; messages=[""]c=d;')
<SimpleCookie: a='b'>
This also seems to be described by Sergey Bobrov in Russian at <https://habrahabr.ru/post/272187/>. Looking at the proposed patch again, I think the fix might be okay. Some specifications for cookies allow semicolons to be quoted or escaped, and I was a bit worried that this might be a problem. But all the scenarios I can imagine would be no worse with the patch compared to without it. |
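The injection scenario above can be checked against whatever interpreter is at hand. The snippet below is only a quick probe, assuming a Python 3 release that includes the security fix; it prints the parsed keys rather than asserting a particular repr, since the exact output differs between versions.

```python
from http.cookies import SimpleCookie

# "c=d" directly follows the invalid "messages" value with no semicolon
# in between, so it should not surface as a morsel of its own.
cookie = SimpleCookie('a=b; messages=[""]c=d;')

print(sorted(cookie.keys()))
print('c' in cookie)
```

If `'c' in cookie` were true, the parser would be splitting attacker-controlled text out of another cookie's value, which is the behaviour the security fix was meant to prevent.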
The issue I'm currently running into is that although browsers correctly ignore invalid Set-Cookie values, they allow 'any CHAR except CTLs or ";"' in cookie values set via document.cookie. So, if you say document.cookie = 'key=va"lue; path=/', the browser will happily pass 'key=va"lue;' to the server on future requests. So, I like the behavior of this patch, which skips over these invalid cookies and continues parsing. I've cleaned the patch up a little, but it should be the same logically. |
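Because such values may contain a double quote but never a semicolon, a plain split on ";" is enough to recover each pair from a header like the one above. The sketch below illustrates that idea only; `split_cookie_header` is a hypothetical helper, it performs no validation or unquoting, and it is not how `http.cookies` works.

```python
def split_cookie_header(header):
    """Split a Cookie header on ';' and return a list of (name, value) pairs."""
    pairs = []
    for chunk in header.split(';'):
        name, sep, value = chunk.partition('=')
        name, value = name.strip(), value.strip()
        if not sep or not name:
            # No "=" or an empty name: not a name=value pair, so skip it.
            continue
        pairs.append((name, value))
    return pairs


print(split_cookie_header('key=va"lue; other=1'))
```

Text that follows a value without an intervening semicolon stays attached to that value here, so nothing is promoted to a separate pair.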
To move forward on this, I would like someone else (hopefully Antoine? :) to confirm my theory about the cookie injection attack, or otherwise explain why the patch won’t (re)open any security holes. Also, I would like to add some more test cases based on Sergey Bobrov’s post (especially those under the heading Особенности обработки Cookie #3, “Cookie handling peculiarities #3”). |
It should be safe to hard split on semicolon. From https://tools.ietf.org/html/rfc6265: {{{ |
I'm seeing a similar issue with curly brackets.
from Cookie import BaseCookie
cookie = BaseCookie('asd={"asd"}; my-real-cookie=stuff i care about; blah=blah')
assert 'my-real-cookie' in cookie # False |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.