classification
Title: cookiejar parses cookie value as int with empty name-value pair and Expires
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.5, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: BreamoreBoy, berker.peksag, chfoo, demian.brecht, python-dev, serhiy.storchaka
Priority: normal Keywords: easy, patch

Created on 2014-12-31 05:06 by chfoo, last changed 2015-03-13 10:34 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
issue23138_tip.patch demian.brecht, 2015-02-27 16:33 review
issue23138_1.patch demian.brecht, 2015-03-10 05:41 review
issue23138_2.patch demian.brecht, 2015-03-12 23:15 review
Messages (11)
msg233227 - (view) Author: Christopher Foo (chfoo) Date: 2014-12-31 05:06
Something like "Set-Cookie: ; Expires=Thu, 01 Jan 1970 00:00:10 GMT" causes the resulting cookie.value to be parsed as an int.

I expected either str or None as described in the documentation.

Example evil server:

    try:
        import http.server as http_server
    except ImportError:
        import BaseHTTPServer as http_server


    class MyHandler(http_server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header('Set-Cookie', '; Expires=Thu, 01 Jan 1970 00:00:10 GMT')
            self.send_header('Set-Cookie', 'good=123.45600')
            self.end_headers()


    def main():
        server = http_server.HTTPServer(('127.0.0.1', 8000), MyHandler)
        server.serve_forever()


    if __name__ == '__main__':
        main()
        

Example innocent client:

    try:
        import http.cookiejar as http_cookiejar
    except ImportError:
        import cookielib as http_cookiejar

    try:
        import urllib.request as urllib_request
    except ImportError:
        import urllib2 as urllib_request
        

    def main():
        cj = http_cookiejar.CookieJar()
        opener = urllib_request.build_opener(urllib_request.HTTPCookieProcessor(cj))
        r = opener.open("http://127.0.0.1:8000/")
        
        print(cj._cookies)

    if __name__ == '__main__':
        main()


The resulting output is:

{'127.0.0.1': {'/': {'expires': Cookie(version=0, name='expires', value=10.0, port=None, port_specified=False, domain='127.0.0.1', domain_specified=False, domain_initial_dot=False, path='/', path_specified=False, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False), 'good': Cookie(version=0, name='good', value='123.45600', port=None, port_specified=False, domain='127.0.0.1', domain_specified=False, domain_initial_dot=False, path='/', path_specified=False, secure=False, expires=None, discard=True, comment=None, comment_url=None, rest={}, rfc2109=False)}}}

It gives two cookies where the first one contains name='expires', value=10.0 which is unexpected. I expected that either the bad cookie is discarded or it is accepted but the value is always a str (even if it is garbage) or None.

This bug was found in my custom cookie policy where I do len(cookie.value or ''). There is also a reference on StackOverflow but I believe no Python library bug report was filed: http://stackoverflow.com/q/20325571/1524507 . 

This was tested on Python 2.7.8, 3.2.6, 3.3.6, and 3.4.2.
msg236497 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-02-24 14:11
@Demian I believe this may be of interest to you.
msg236782 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-02-27 16:33
Attached is a fix that ignores the entire invalid cookie as defined in RFC 6265, Section 5.2. I'm also attaching patches for maintenance branches as it's a valid bug (NAME=VALUE pairs are required across all RFCs), although it would break backwards compatibility if the user was expecting invalid behaviour.
msg237433 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-07 10:19
According to RFC 6265, Section 5.2:

   2.  If the name-value-pair string lacks a %x3D ("=") character,
       ignore the set-cookie-string entirely.

But "Set-Cookie: spam; Expires=Thu, 01 Jan 1970 00:00:10 GMT" is accepted. key="spam", value=None.

   5.  If the name string is empty, ignore the set-cookie-string
       entirely.

But "Set-Cookie: =spam; Expires=Thu, 01 Jan 1970 00:00:10 GMT" is accepted. key="", value="spam".
msg237652 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-09 14:26
I agree that the current implementation doesn't conform to standards, but do you think those cases are worth fixing as they can potentially break backwards compatibility? I think that the reported case makes sense to fix as the name/value pair are entirely unexpected. However, the current behaviour is logical for the cases that you've pointed out.
msg237653 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-09 14:43
I think that for consistency either parse empty name-value pair as key="", value=None, or ignore all non-conformed cases. For backward compatibility I prefer first way.
msg237728 - (view) Author: Demian Brecht (demian.brecht) * Date: 2015-03-10 05:41
> I think that for consistency either parse empty name-value pair as key="", value=None

There is already a test present (https://hg.python.org/cpython/file/0469af231d22/Lib/test/test_http_cookiejar.py#l1084) that ensures an unset name/value pair is ignored altogether, so I don't think that makes sense from a backwards compatibility standpoint. For consistency, I kept the functionality where nameless cookies are ignored (i.e. "=foo"). I think that while it may be breaking backwards compatibility for buggy edge cases, it's more consistent with existing functionality and actually conforms to the RFC. That said, I'm not going to argue over it heatedly, so if you'd still rather see those cases permitted, let me know and I'll change it.

Valueless cookies are still permitted to keep backwards compatible as there are existing tests for that.
msg237733 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-10 07:34
This looks reasonable.
msg238010 - (view) Author: Roundup Robot (python-dev) Date: 2015-03-13 07:10
New changeset 44c1c0cbdc06 by Serhiy Storchaka in branch '2.7':
Issue #23138: Fixed parsing cookies with absent keys or values in cookiejar.
https://hg.python.org/cpython/rev/44c1c0cbdc06

New changeset c1abcbcfefab by Serhiy Storchaka in branch '3.4':
Issue #23138: Fixed parsing cookies with absent keys or values in cookiejar.
https://hg.python.org/cpython/rev/c1abcbcfefab

New changeset 7cc7c794d1cb by Serhiy Storchaka in branch 'default':
Issue #23138: Fixed parsing cookies with absent keys or values in cookiejar.
https://hg.python.org/cpython/rev/7cc7c794d1cb
msg238011 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-13 07:17
As side effect the parsing is now twice faster.

$ ./python -m timeit -s "from http.cookiejar import parse_ns_headers" -- "parse_ns_headers('foo=bar; Expires=Thu, 01 Jan 1970 00:00:10 GMT')"
Before: 1000 loops, best of 3: 976 usec per loop
After:  1000 loops, best of 3: 537 usec per loop
msg238022 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-13 10:34
Oh, this was incorrect example. The correct one is:

$ ./python -m timeit -s "from http.cookiejar import parse_ns_headers" -- "parse_ns_headers(['foo=bar; path=/; version=1; Expires=Thu, 01 Jan 1970 00:00:10 GMT'])"
Before: 10000 loops, best of 3: 177 usec per loop
After:  10000 loops, best of 3: 104 usec per loop
History
Date User Action Args
2015-03-13 10:34:11serhiy.storchakasetmessages: + msg238022
2015-03-13 07:17:20serhiy.storchakasetstatus: open -> closed
resolution: fixed
messages: + msg238011

stage: commit review -> resolved
2015-03-13 07:10:58python-devsetnosy: + python-dev
messages: + msg238010
2015-03-12 23:15:49demian.brechtsetfiles: + issue23138_2.patch
2015-03-10 14:16:49demian.brechtsetstage: patch review -> commit review
2015-03-10 07:34:20serhiy.storchakasetmessages: + msg237733
2015-03-10 05:41:34demian.brechtsetfiles: + issue23138_1.patch

messages: + msg237728
2015-03-09 23:24:46demian.brechtsetfiles: - issue23138_27.patch
2015-03-09 23:24:39demian.brechtsetfiles: - issue23138_34.patch
2015-03-09 14:43:42serhiy.storchakasetmessages: + msg237653
2015-03-09 14:26:57demian.brechtsetmessages: + msg237652
2015-03-07 10:19:01serhiy.storchakasetmessages: + msg237433
2015-03-07 10:00:35serhiy.storchakasetassignee: serhiy.storchaka

nosy: + serhiy.storchaka
2015-02-28 06:14:01berker.peksagsetnosy: + berker.peksag
2015-02-27 16:34:41demian.brechtsetfiles: + issue23138_27.patch
2015-02-27 16:34:24demian.brechtsetfiles: + issue23138_34.patch
2015-02-27 16:33:56demian.brechtsetkeywords: + easy, patch
files: + issue23138_tip.patch
messages: + msg236782

stage: patch review
2015-02-24 14:11:50BreamoreBoysetnosy: + demian.brecht, BreamoreBoy

messages: + msg236497
versions: - Python 3.2, Python 3.3
2014-12-31 05:06:29chfoocreate