classification
Title: Cookie.Morsel breaks in parsing cookie values with whitespace
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.2, Python 3.1, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: BreamoreBoy, berker.peksag, cburroughs, mixedpuppy, rhymes, rushman, trentm
Priority: normal Keywords: needs review, patch

Created on 2008-06-10 10:11 by rhymes, last changed 2014-01-21 17:23 by berker.peksag. This issue is now closed.

Files
File name Uploaded Description Edit
cookie.patch trentm, 2009-02-17 00:20 patch to python 2.7 head fix Cookie.py and to test_cookie.py review
Messages (6)
msg67901 - (view) Author: Lawrence Oluyede (rhymes) Date: 2008-06-10 10:11
It seems the Cookie module behaves oddly with whitespace.
According to http://wp.netscape.com/newsref/std/cookie_spec.html and
http://en.wikipedia.org/wiki/HTTP_cookie#Cookie_attributes the 'Expires'
attribute of the cookie should have this format:

"Wdy, DD-Mon-YYYY HH:MM:SS GMT"

and this is recognized by all the browsers. The oddity comes when I try
to load or create a cookie with that attribute:

Python 2.5.2 (r252:60911, Apr 21 2008, 11:12:42) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Cookie import SimpleCookie
>>> cookies = SimpleCookie()
>>> cookies.load('foo=baz; expires=Sat, 10-Jun-1978 09:41:04 GMT')
>>> cookies
<SimpleCookie: foo='baz'>
>>> cookies['foo']['expires']
'Sat,'
>>> cookies.load('foo=baz; expires=2008-06-10T09:44:45.963024')
>>> cookies['foo']['expires']
'2008-06-10T09:44:45.963024'

It really seems the parser breaks on whitespace.
msg75625 - (view) Author: mixedpuppy (mixedpuppy) Date: 2008-11-08 00:44
The problem is that the parser's pattern does not allow spaces in unquoted attribute values.
Here is a workaround:

# hack fix of Cookie.BasicCookie
import Cookie
import re
_LegalCharsPatt  = r"\w\d!#%&'~_`><@,:/\$\*\+\-\.\^\|\)\(\?\}\{\="
_FixedCookiePattern = re.compile(
    r"(?x)"                       # This is a Verbose pattern
    r"(?P<key>"                   # Start of group 'key'
    "["+ _LegalCharsPatt +"]+?"     # Any word of at least one letter, nongreedy
    r")"                          # End of group 'key'
    r"\s*=\s*"                    # Equal Sign
    r"(?P<val>"                   # Start of group 'val'
    r'"(?:[^\\"]|\\.)*"'            # Any doublequoted string
    r"|"                            # or
    "["+ _LegalCharsPatt +"\ ]*"        # Any word or empty string
    r")"                          # End of group 'val'
    r"\s*;?"                      # Probably ending in a semi-colon
    )

class FixedCookie(Cookie.SimpleCookie):
    def load(self, rawdata):
        """Load cookies from a string (presumably HTTP_COOKIE) or
        from a dictionary.  Loading cookies from a dictionary 'd'
        is equivalent to calling:
            map(Cookie.__setitem__, d.keys(), d.values())
        """
        if type(rawdata) == type(""):
            self._BaseCookie__ParseString(rawdata, _FixedCookiePattern)
        else:
            self.update(rawdata)
        return
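
The workaround above depends on the Python 2 Cookie module. To see why adding a space to the value character class matters, here is a minimal, self-contained sketch. It assumes a simplified pattern (the names LEGAL_CHARS, make_pattern and parse are illustrative and not part of Cookie.py) and only mimics the scanning loop:

```python
import re

# Simplified stand-in for the legal-character class used in the workaround;
# the names below are illustrative, not part of the Cookie module.
LEGAL_CHARS = r"\w!#%&'~`><@,:/\$\*\+\-\.\^\|\)\(\?\}\{="


def make_pattern(value_chars):
    """Build a key=value pattern whose unquoted values use `value_chars`."""
    return re.compile(
        r"(?P<key>[" + LEGAL_CHARS + r"]+?)"   # key, non-greedy
        r"\s*=\s*"                             # equals sign
        r"(?P<val>"
        r'"(?:[^\\"]|\\.)*"'                   # double-quoted string
        r"|[" + value_chars + r"]*"            # or unquoted run of chars
        r")"
        r"\s*;?"                               # optional trailing semicolon
    )


def parse(pattern, raw):
    """Collect (key, value) pairs by searching repeatedly from the last match."""
    pairs = []
    pos = 0
    while pos < len(raw):
        match = pattern.search(raw, pos)
        if match is None:
            break
        pairs.append((match.group("key"), match.group("val")))
        pos = match.end()
    return pairs


RAW = "foo=baz; expires=Sat, 10-Jun-1978 09:41:04 GMT"

strict = make_pattern(LEGAL_CHARS)          # no space allowed in values
lenient = make_pattern(LEGAL_CHARS + " ")   # space allowed, as in the workaround

print(parse(strict, RAW))
print(parse(lenient, RAW))
```

With the strict class the value stops at the first space, so "expires" comes back as 'Sat,'. With the space added, the whole date is captured. Note that allowing spaces everywhere is a blunt fix: it also lets any other unquoted value absorb trailing text up to the next semicolon.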
msg82286 - (view) Author: Trent Mick (trentm) * (Python committer) Date: 2009-02-17 00:14
Marking this as affecting all Python versions. I've tested that in
Python 2.4, 2.5, 2.6, 2.7 and 3.0. Haven't tried 3.1.

Restating the problem: Cookie.py doesn't parse unquoted morsel values that
contain spaces, as Lawrence said in the original description. An unquoted
"Expires" cookie attribute is common in the wild. In fact, "Set-Cookie"
headers created by Cookie.py include an unquoted "Expires", as they should
given this from RFC 2109:

   HTTP/1.1 servers must send Expires: old-date (where old-date is a
   date long in the past) on responses containing Set-Cookie response
   headers unless they know for certain (by out of band means) that
   there are no downsteam HTTP/1.0 proxies.
   ...
   Note that the Expires date format contains embedded spaces, and that
   "old" cookies did not have quotes around values.  Clients that
   implement to this specification should be aware of "old" cookies and
   Expires.

Here is a shell session showing how Cookie.py doesn't round-trip:

>>> import sys
>>> if sys.version_info[0] == 3:
...     from http.cookies import SimpleCookie
... else:
...     from Cookie import SimpleCookie
... 
>>> a = SimpleCookie()
>>> a["test"] = "expiry"
>>> a["test"]["expires"] = 10  # expire 10s from now
>>> cookie_str = a["test"].OutputString()
>>> cookie_str
'test=expiry; expires=Tue, 17-Feb-2009 00:13:03 GMT'
>>> str(a)
'Set-Cookie: test=expiry; expires=Tue, 17-Feb-2009 00:13:19 GMT'
>>> 
>>> b = SimpleCookie()
>>> b.load(cookie_str)
>>> str(b)
'Set-Cookie: test=expiry; expires=Tue,'


Patch coming...
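
The round-trip failure in the session above can be turned into a small check. This is only a sketch for Python 3's http.cookies: the helper name round_trips is made up for illustration, and a fixed expiry string is assumed instead of the integer offset so the result does not depend on the clock.

```python
from http.cookies import SimpleCookie


def round_trips(cookie):
    """Return True if every morsel survives OutputString() -> load() unchanged."""
    for name, morsel in cookie.items():
        rendered = morsel.OutputString()
        reparsed = SimpleCookie()
        reparsed.load(rendered)
        if name not in reparsed or reparsed[name].OutputString() != rendered:
            return False
    return True


original = SimpleCookie()
original["test"] = "expiry"
# Fixed date string (an assumption for reproducibility); the session above
# used an integer offset, which is rendered relative to the current time.
original["test"]["expires"] = "Tue, 17-Feb-2009 00:13:03 GMT"

print(round_trips(original))
```

On an interpreter that still has this bug, the reparsed morsel would carry 'expires=Tue,' and the helper would return False.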
msg82288 - (view) Author: Trent Mick (trentm) * (Python committer) Date: 2009-02-17 00:20
Here is a patch to the Python 2.7 head to fix Cookie.py and to
test_cookie.py to test parsing a cookie header string with an "Expires"
attribute with spaces. All the existing tests (including mainly the
docstring tests in Cookie.py) still pass.

If someone could review this for sanity, I'd be happy to check it in and
also do any tweaks necessary to patch the Python 3.1 tree as well.
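
For anyone reviewing without opening the attachment, a test of the kind described might look roughly like the sketch below. This is not the test from cookie.patch: it is a hypothetical equivalent written against Python 3's http.cookies, and the class and method names are invented.

```python
import unittest
from http.cookies import SimpleCookie


class ExpiresWithSpacesTest(unittest.TestCase):
    """Hypothetical regression test; not the one shipped in cookie.patch."""

    def test_load_keeps_full_expires_value(self):
        cookies = SimpleCookie()
        cookies.load("foo=baz; expires=Sat, 10-Jun-1978 09:41:04 GMT")
        self.assertEqual(cookies["foo"].value, "baz")
        self.assertEqual(cookies["foo"]["expires"],
                         "Sat, 10-Jun-1978 09:41:04 GMT")

    def test_load_expires_without_spaces(self):
        # The space-free form from the original report already parsed fine;
        # keep it covered so a fix does not regress it.
        cookies = SimpleCookie()
        cookies.load("foo=baz; expires=2008-06-10T09:44:45.963024")
        self.assertEqual(cookies["foo"]["expires"],
                         "2008-06-10T09:44:45.963024")
```

Saved to a file, it can be run with `python -m unittest <module name>`.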
msg110640 - (view) Author: Mark Lawrence (BreamoreBoy) Date: 2010-07-18 12:06
Anyone with knowledge of cookies and/or regexes who could review this?  The patch isn't that big and includes unit tests.  Note also Trent's offer to check it in and patch py3k.
msg177292 - (view) Author: Berker Peksag (berker.peksag) * Date: 2012-12-10 14:32
The bug has been fixed in issue 8826.

Related changesets:

- http://hg.python.org/cpython/rev/cb231b79693e/
- Backport: http://hg.python.org/cpython/rev/84363c747c21

In Python 2.7.3:

>>> from Cookie import SimpleCookie
>>> cookies = SimpleCookie()
>>> cookies.load('foo=baz; expires=Sat, 10-Jun-1978 09:41:04 GMT')
>>> cookies
<SimpleCookie: foo='baz'>
>>> cookies['foo']['expires']
'Sat, 10-Jun-1978 09:41:04 GMT'
>>> cookies.load('foo=baz; expires=2008-06-10T09:44:45.963024')
>>> cookies['foo']['expires']
'2008-06-10T09:44:45.963024'
History
Date                 User              Action  Args
2014-01-21 17:23:00  berker.peksag     set     stage: patch review -> committed/rejected
2013-01-09 15:31:48  serhiy.storchaka  set     resolution: fixed -> out of date
2013-01-09 15:25:38  christian.heimes  set     status: open -> closed; resolution: fixed
2012-12-10 14:32:55  berker.peksag     set     nosy: + berker.peksag; messages: + msg177292
2010-07-18 12:06:08  BreamoreBoy       set     nosy: + BreamoreBoy; messages: + msg110640; versions: + Python 3.2, - Python 2.6, Python 3.0
2010-05-20 07:59:46  rushman           set     nosy: + rushman
2010-04-01 16:54:41  cburroughs        set     nosy: + cburroughs
2009-04-22 14:41:10  ajaksu2           set     priority: normal; stage: patch review; versions: - Python 2.5, Python 2.4
2009-02-17 00:20:40  trentm            set     keywords: + patch, needs review; files: + cookie.patch; messages: + msg82288
2009-02-17 00:14:12  trentm            set     messages: + msg82286; versions: + Python 2.6, Python 2.4, Python 3.0, Python 3.1, Python 2.7
2008-11-08 00:44:34  mixedpuppy        set     nosy: + mixedpuppy; messages: + msg75625
2008-11-08 00:23:11  trentm            set     nosy: + trentm
2008-06-10 10:11:42  rhymes            create