Regression in http.cookies parsing with brackets and quotes #69415

Open
timgraham mannequin opened this issue Sep 24, 2015 · 30 comments
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@timgraham
Mannequin

timgraham mannequin commented Sep 24, 2015

BPO 25228
Nosy @pitrou, @bitdancer, @vadmium, @timgraham, @collinanderson, @goodspark
Files
  • cookie-bracket-quotes-test.diff
  • patch.diff
  • patch_final.diff
  • patch_unittest.diff
  • patch_with_test.diff
  • patch_str_find.diff
  • patch_review.diff
  • cookie-bracket-quotes.diff: Logically the same, just more clear.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-09-24.16:57:56.628>
    labels = ['3.8', 'type-bug', 'library', '3.9', '3.10']
    title = 'Regression in http.cookies parsing with brackets and quotes'
    updated_at = <Date 2020-11-06.19:28:14.369>
    user = 'https://github.com/timgraham'

    bugs.python.org fields:

    activity = <Date 2020-11-06.19:28:14.369>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-09-24.16:57:56.628>
    creator = 'Tim.Graham'
    dependencies = []
    files = ['40566', '40700', '40701', '40703', '40709', '40717', '40919', '41889']
    hgrepos = []
    issue_num = 25228
    keywords = ['patch', '3.2regression']
    message_count = 30.0
    messages = ['251538', '252193', '252213', '252331', '252332', '252333', '252400', '252401', '252404', '252410', '252415', '252417', '252469', '252488', '252490', '252500', '252507', '252542', '252584', '252612', '253534', '253827', '253837', '253855', '253860', '259824', '260028', '260038', '261388', '317933']
    nosy_count = 8.0
    nosy_names = ['pitrou', 'r.david.murray', 'martin.panter', 'Tim.Graham', 'collinanderson', 'Pathangi Jatinshravan', 'harris', 'spark']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25228'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @timgraham
    Mannequin Author

    timgraham mannequin commented Sep 24, 2015

    Regression in https://hg.python.org/cpython/rev/9e765e65e5cb (affects 2.7 and 3.2+), similar to bpo-22931 where inserting an invalid cookie value can cause the rest of the cookie to be ignored. A test is attached, and here's a quick demo:

    Old:
    >>> from http.cookies import SimpleCookie
    >>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
    {'a': 'b', 'c': 'd', 'messages': ''}
    
    New:
    >>> SimpleCookie('a=b; messages=[\"\"]; c=d;')
    {'a': 'b'}

    Reported in Django's tracker, but Django simply delegates to SimpleCookie: https://code.djangoproject.com/ticket/25458
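To see which of the two behaviours a given interpreter shows, a small script along these lines can be run. It is only a sketch: `parsed_keys` is a hypothetical helper name, and the printed result depends on the Python version in use.

```python
from http.cookies import SimpleCookie


def parsed_keys(header):
    """Return the sorted cookie names that SimpleCookie extracts from a header string."""
    return sorted(SimpleCookie(header).keys())


if __name__ == "__main__":
    # The header from the report: a valid cookie, a bracket/quote value, another valid cookie.
    header = 'a=b; messages=[""]; c=d;'
    # If 'c' is missing from the output, parsing stopped at the "messages" morsel.
    print(parsed_keys(header))
```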

    @timgraham timgraham mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Sep 24, 2015
    @vadmium
    Member

    vadmium commented Oct 3, 2015

    Thanks for the test case. It looks like the commit in question was done as a security fix in 3.2.6 and 3.3.6. I’m not sure on the policy, but maybe that justifies putting any fixes into 3.2+.

    I’m not familiar with HTTP cookies. Is this a case of a 100% specification-compliant cookie, or a technically invalid one that would be nice to handle better? If the second case, maybe it is an instance of bpo-22983.

    @timgraham
    Mannequin Author

    timgraham mannequin commented Oct 3, 2015

    It might be a case of bpo-22983. I'll try to look into the details and offer a patch next week.

    For what it's worth, there are other regressions in Python 3.2 cookie parsing that make the latest patch release (3.2.6) unusable with Django (bpo-22758), so from my perspective fixing this issue there isn't as high a priority as that one.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 5, 2015

    Hi, can I be assigned to this behaviour issue?

    @timgraham
    Mannequin Author

    timgraham mannequin commented Oct 5, 2015

    Sure, feel free to propose a patch.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 5, 2015

    This is the first ever bug I will be working on so there might be a bit of a learning curve, but I'll do my best to come out with something by this week.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 6, 2015

    Hi I have made a patch for this, can anyone review and let me know?

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 6, 2015

    Oops, sorry looks like a unit test is failing. I will fix it and submit another one soon.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 6, 2015

    Hi Tim, I have submitted a patch for this issue (patch_final.diff, the earlier one failed a UT). Now all UTs are passing. Can you take a look at this?

    @timgraham
    Mannequin Author

    timgraham mannequin commented Oct 6, 2015

    Could you please integrate my unit test into your patch?

    You also need to sign the PSF Contributor Agreement:
    https://www.python.org/psf/contrib/contrib-form/

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 6, 2015

    Added a patch where unit test has been modified to include the above case. I have signed the agreement.

    @timgraham
    Mannequin Author

    timgraham mannequin commented Oct 6, 2015

    I had already proposed a test, see cookie-bracket-quotes-test.diff. What I meant was that the fix and the test should be combined into a single patch.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 7, 2015

    Is this what you wanted?

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 7, 2015

    Hi Tim, I have submitted a patch (patch_with_test.diff). Can you take a look at this?

    @timgraham
    Mannequin Author

    timgraham mannequin commented Oct 7, 2015

    Yes, when I have some time.

    By the way, did you intentionally remove all the "Python 3.X" versions on the issue?

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 8, 2015

    Oh not intentional. Must have clicked something by mistake

    @vadmium
    Member

    vadmium commented Oct 8, 2015

    Instead of the while loop, can’t you use something like str.find(";", i)?

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 8, 2015

    Hi, I've made the change to use str.find() and removed the while loop, can you take a look at it?

    @vadmium
    Member

    vadmium commented Oct 9, 2015

    The str.find() call was kind of what I had in mind. But I don’t feel qualified to say whether the fix is good in general. I would have to find out about the Cookie header format, and understand what the security implications of lax parsing are.

    @bitdancer
    Member

    Yes, we should get signoff from someone who was involved in the original security fix, since it was a security fix.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Oct 27, 2015

    Has there been any movement on this issue?

    @vadmium
    Member

    vadmium commented Nov 1, 2015

    Adding Guido and Antoine, who committed the security fix as 9e765e65e5cb in 2.7 and 5cfe74a9bfa4 in 3.2. Perhaps you are able to help decide if the proposal here would affect the original security report. Basically this issue (as well as bpo-22758 and bpo-22983) complains that 3.2’s cookie parsing became too strict. People would like to parse subsequent cookie “morsels” if an earlier one is considered invalid, rather than aborting the parse completely.

    All I can find out about the security report is from <https://bugs.python.org/issue22796#msg230650> and <https://hackerone.com/reports/14883>, but that doesn’t explain the test cases with square brackets in the cookie names.

    RFC 6265 says double quotes (") are not meant to be sent by the server, but the client should tolerate them without any special handling (different to Python’s handling and earlier specs, which parse a special double-quoted string syntax). One potential problem that comes to mind is that the current patch blindly searches for the next semicolon “;”, which would not be valid inside a double-quoted string, e.g. name="some;value".

    Python behaviour:

    • Before the 3.2 security fix, square brackets and double quotes caused truncation of the cookie value, but subsequent cookies were still parsed in most cases

    • The security fix prevents parsing of subsequent cookies (either on purpose or as a side effect)

    • The HttpOnly and Secure support in 3.3+ (bpo-16611) prevents parsing of the cookie morsel with the offending square bracket or double quote. This is proposed for 3.2 backport in bpo-22758.

    • Square brackets are now allowed in 3.2+ thanks to bpo-22931. So 3.2 should truncate the original test case at the double quote, while 3.3+ drops the offending cookie.

    The current patch proposed here appears to solve bpo-22983 (permissive parsing) in general. If the current cookie does not match the syntax, it is skipped, by falling back to a search for a semicolon “;”. So I am inclined to close bpo-22983 as a duplicate of this issue.
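The fallback described above can be sketched roughly as follows. This is not the attached patch and not the code in http.cookies: `parse_lenient` and `_MORSEL` are hypothetical names, and the pattern is a deliberately simplified stand-in (bare `name=value` pairs only, with no quoted strings or cookie attributes).

```python
import re

# Simplified, assumed patterns for illustration only.
_KEY = r"[\w!#$%&'*+\-.^`|~]+"
_VALUE = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*"
_MORSEL = re.compile(
    r"\s*(?P<key>" + _KEY + r")\s*=\s*(?P<val>" + _VALUE + r")\s*(?:;|$)",
    re.ASCII,
)


def parse_lenient(header):
    """Parse name=value pairs, skipping any morsel that does not match the pattern."""
    result = {}
    i = 0
    while i < len(header):
        match = _MORSEL.match(header, i)
        if match:
            result[match.group("key")] = match.group("val")
            i = match.end()
            continue
        # Invalid morsel: skip ahead to just past the next semicolon, if there is one.
        next_semicolon = header.find(";", i)
        if next_semicolon == -1:
            break
        i = next_semicolon + 1
    return result
```

Under these assumptions, `parse_lenient('a=b; messages=[""]; c=d;')` keeps `a` and `c` and drops only `messages`. For a header such as `'a=b; messages=[""]c=d;'`, where no semicolon separates the invalid value from `c=d`, the whole run up to the next semicolon is skipped, so no separate `c` entry appears.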

    And Tim, I understand your main interest in bpo-22758 is that parsing aborts for things like "a=value1; HttpOnly; b=value2". If this patch were ported to 3.2 it should also fix that for free.

    Pathangi: did you see my review comment about unnecessary backslashes? I also left another comment today.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Nov 1, 2015

    Just saw the code review comments now, didn't know that there was a separate section for code review comments until now. Will take a look and implement them.

    @PathangiJatinshravan
    Mannequin

    PathangiJatinshravan mannequin commented Nov 1, 2015

    New patch with code review comments incorporated.

    @gvanrossum
    Member

    I'm coming at this without much context (I don't recall the original issue)
    but IIUC from a security POV, lenient parsing is unsafe -- it could allow
    an attacker to modify a cookie (or part of a cookie -- I'm unclear on the
    correct terminology here) and that's what we're trying to avoid.

    @vadmium
    Member

    vadmium commented Feb 8, 2016

    Looking at this a second time, I think I have figured out what the security report was about. Before the fix (before revision 270f61ec1157), an attacker could trick the parser into accepting a separate key=value cookie “morsel”, when it was supposed to be part of some other cookie value. Suppose the “c=d” text was meant to be associated with the “messages” key. Before the security fix, “c=d” is separated:

    >>> SimpleCookie('a=b; messages=[""]c=d;')
    <SimpleCookie: a='b' c='d'>

    With the fix applied, we now silently abort the parsing, and there is no spurious “c” key:

    >>> SimpleCookie('a=b; messages=[""]c=d;')
    <SimpleCookie: a='b'>

    This also seems to be described by Sergey Bobrov in Russian at <https://habrahabr.ru/post/272187/>.

    Looking at the proposed patch again, I think the fix might be okay. Some specifications for cookies allow semicolons to be quoted or escaped, and I was a bit worried that this might be a problem. But all the scenarios I can imagine would be no worse with the patch compared to without it.

    @collinanderson
    Mannequin

    collinanderson mannequin commented Feb 10, 2016

    The issue I'm currently running into is that although browsers correctly ignore invalid Set-Cookie values, they allow 'any CHAR except CTLs or ";"' in cookie values set via document.cookie.

    So, if you say document.cookie = 'key=va"lue; path=/', the browser will happily pass 'key=va"lue;' to the server on future requests.

    So, I like the behavior of this patch, which skips over these invalid cookies and continues parsing. I've cleaned the patch up a little, but it should be the same logically.

    @vadmium
    Member

    vadmium commented Feb 10, 2016

    To move forward on this, I would like someone else (hopefully Antoine? :) to confirm my theory about the cookie injection attack, or otherwise explain why the patch won’t (re)open any security holes. Also, I would like to add some more test cases based on Sergey Bobrov’s post (especially those from the heading “Особенности обработки Cookie #3”, i.e. “Cookie handling peculiarities #3”).

    @collinanderson
    Mannequin

    collinanderson mannequin commented Mar 8, 2016

    It should be safe to hard split on semicolon. name="some;value" is not valid, even though it's quoted. I think raw double quotes, commas, semicolons and backslashes are always invalid characters in cookie values.

    From https://tools.ietf.org/html/rfc6265:

    {{{
    cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
    ; US-ASCII characters excluding CTLs,
    ; whitespace DQUOTE, comma, semicolon,
    ; and backslash
    }}}
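The grammar quoted above translates fairly directly into a regular expression. The following is a minimal sketch for checking a single value against `cookie-value`; the function name `is_valid_cookie_value` is an assumption made for illustration and is not part of http.cookies.

```python
import re

# cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
_COOKIE_OCTETS = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*"

# cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
_COOKIE_VALUE = re.compile(r"(?:" + _COOKIE_OCTETS + r'|"' + _COOKIE_OCTETS + r'")')


def is_valid_cookie_value(value):
    """Return True if the whole string matches the RFC 6265 cookie-value rule."""
    return _COOKIE_VALUE.fullmatch(value) is not None
```

With this check, `some;value` and `"some;value"` are both rejected, since the semicolon is not a cookie-octet whether or not the value is wrapped in double quotes. Values such as `va"lue` and `[""]` are rejected as well, because a double quote may only appear as the outer pair.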

    @vadmium vadmium changed the title from "Regression in cookie parsing with brackets and quotes" to "Regression in http.cookies parsing with brackets and quotes" Aug 22, 2016
    @goodspark
    Mannequin

    goodspark mannequin commented May 28, 2018

    I'm seeing a similar issue with curly brackets.

    from Cookie import BaseCookie
    cookie = BaseCookie('asd={"asd"}; my-real-cookie=stuff i care about; blah=blah')
    assert 'my-real-cookie' in cookie  # False

    @iritkatriel iritkatriel added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Nov 6, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022