classification
Title: SimpleCookie fails to parse any cookie if an entry has whitespace in the name
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Adam Davis, Joel Rosdahl, infinitewarp, martin.panter, remi.lapeyre
Priority: normal Keywords:

Created on 2017-09-13 18:46 by Adam Davis, last changed 2019-11-29 06:58 by Joel Rosdahl.

Messages (5)
msg302105 - (view) Author: Adam Davis (Adam Davis) Date: 2017-09-13 18:46
```
>>> from http.cookies import SimpleCookie
>>> cookie_string = "ASDF=stuff; ASDF space=more stuff"
>>> cookie = SimpleCookie()
>>> cookie.load(cookie_string)
>>> cookie.items()
dict_items([])
>>> cookie_string = "ASDF=stuff"
>>> cookie.load(cookie_string)
>>> cookie.items()
dict_items([('ASDF', <Morsel: ASDF=stuff>)])
```

cookie.load should throw an error, or at least parse the cookies it can parse.
msg304102 - (view) Author: Brad Smith (infinitewarp) * Date: 2017-10-11 02:47
According to RFC-6265 (which also references RFC-2616 to define "tokens"), the space character (and whitespace in general) is not valid in cookie-names or cookie-values.

RFC-6265: https://tools.ietf.org/html/rfc6265#section-4.1.1
RFC-2616: https://tools.ietf.org/html/rfc2616#section-2.2

I think it's reasonable for Python to quietly throw away malformed NAME=VALUE pairs since web browsers are likely doing the same.
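
For reference, the RFC 2616 "token" grammar excludes control characters and separators, which leaves ASCII letters, digits, and the characters !#$%&'*+-.^_`|~. A minimal sketch of that check might look like the following; `is_rfc_token` is a hypothetical helper name and not part of http.cookies.

```python
import re

# Characters allowed in an RFC 2616 "token": any US-ASCII character except
# control characters and the separators ( ) < > @ , ; : \ " / [ ] ? = { }
# as well as space and horizontal tab.
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_rfc_token(name):
    """Return True if `name` consists only of RFC 2616 token characters.

    Hypothetical helper for illustration; not an http.cookies API.
    """
    return _TOKEN_RE.fullmatch(name) is not None


print(is_rfc_token("ASDF"))        # a plain alphanumeric name
print(is_rfc_token("ASDF space"))  # contains a space, so not a token
```

Under this grammar the name "ASDF space" from the original report is rejected because it contains a space.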
msg304104 - (view) Author: Adam Davis (Adam Davis) Date: 2017-10-11 04:06
Quietly throwing out the one bad value is fine, but in this scenario you lose every cookie in the cookie string.

I'd expect "ASDF=stuff; ASDF space=more stuff" to at least return the values that are legal.
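
Until the parser changes, one possible workaround is to load each semicolon-separated pair on its own and keep only the ones that parse. The sketch below assumes that no quoted cookie value contains a semicolon; `load_lenient` is a hypothetical name, not an existing http.cookies function.

```python
from http.cookies import CookieError, SimpleCookie


def load_lenient(cookie_string):
    """Parse each ';'-separated pair separately and keep the valid ones.

    Hypothetical workaround for illustration only. It naively splits on
    ';', so a quoted value that contains a semicolon would be broken up.
    """
    result = SimpleCookie()
    for chunk in cookie_string.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        single = SimpleCookie()
        try:
            single.load(chunk)
        except CookieError:
            # The name was rejected as illegal; skip this pair only.
            continue
        # Keep the chunk only if it produced exactly one morsel. A chunk
        # that failed to parse yields none, and a chunk that was split on
        # whitespace yields more than one.
        if len(single) == 1:
            for key, morsel in single.items():
                result[key] = morsel
    return result


cookie = load_lenient("ASDF=stuff; ASDF space=more stuff")
print(list(cookie.keys()))
```

With this approach the malformed "ASDF space=more stuff" pair is dropped and the "ASDF=stuff" pair survives.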
msg334414 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2019-01-27 03:56
The main cause of this behaviour is that whitespace (matching the ASCII RE “\s”) is treated as separation between cookie “morsels”. It looks like this has always been the behaviour, but I’m not sure it was intended.

>>> print(BaseCookie('first=morsel second=morsel'))
Set-Cookie: first=morsel
Set-Cookie: second=morsel

This could be a security problem, if an attacker managed to inject a CSRF token as the second “morsel”. This was mentioned in <https://translate.google.com/translate?u=https://habr.com/en/post/272187/>.

IMO it would be better to not split off a second morsel. Either keep it as one long morsel value with spaces in, or skip over it to the next semicolon (;).

The reason why the whole cookie string is lost is due to the behaviour of cookie morsels without equals signs:

>>> BaseCookie('cookie=lost; ignore').items()
dict_items([])

IMO it would be better to skip over these to the next semicolon as well. It looks like this is a regression in Python 3.5+ caused by Issue 22796.
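
The two suggestions above (keep whitespace inside a single morsel value, and skip entries without an equals sign up to the next semicolon) can be sketched as a small standalone splitter. This is only an illustration of the proposed behaviour under simplifying assumptions: it ignores quoting and cookie attributes, and `parse_pairs` is a hypothetical name, not the actual http.cookies implementation or a proposed patch.

```python
def parse_pairs(header):
    """Split a Cookie header on ';' only, never on whitespace.

    Hypothetical sketch: quoted values and attributes such as Path are
    not handled.
    """
    pairs = {}
    for chunk in header.split(";"):
        name, sep, value = chunk.partition("=")
        if not sep:
            # No equals sign: skip ahead to the next semicolon.
            continue
        # Everything after the first '=' stays in one value, spaces included.
        pairs[name.strip()] = value.strip()
    return pairs


print(parse_pairs("first=morsel second=morsel"))
# {'first': 'morsel second=morsel'}
print(parse_pairs("cookie=lost; ignore"))
# {'cookie': 'lost'}
```

Here the second "morsel" is never split off, and the entry without an equals sign no longer discards the cookie that precedes it.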
msg334472 - (view) Author: Rémi Lapeyre (remi.lapeyre) * Date: 2019-01-28 15:19
This may be relevant: Ruby accepts whitespace in the name of the morsel:

➜  ~ irb
irb(main):002:0> require "cgi"
=> true
irb(main):003:0> CGI::Cookie::parse "ASDF=stuff; ASDF space=more stuff"
=> {"ASDF"=>#<CGI::Cookie: "ASDF=stuff; path=">, "ASDF space"=>#<CGI::Cookie: "ASDF space=more+stuff; path=">}
irb(main):004:0>
History
Date                 User           Action  Args
2019-11-29 06:58:01  Joel Rosdahl   set     nosy: + Joel Rosdahl
2019-01-28 15:19:02  remi.lapeyre   set     messages: + msg334472
2019-01-28 15:09:50  remi.lapeyre   set     nosy: + remi.lapeyre
2019-01-27 03:56:33  martin.panter  set     nosy: + martin.panter; messages: + msg334414
2017-10-11 04:06:24  Adam Davis     set     messages: + msg304104
2017-10-11 02:47:10  infinitewarp   set     nosy: + infinitewarp; messages: + msg304102
2017-09-13 18:46:23  Adam Davis     create