Expose http.cookiejar.split_header_words() #67686

vadmium · 2015-02-22T00:58:11Z

BPO	23498
Nosy	@orsenthil, @bitdancer, @berkerpeksag, @vadmium

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/orsenthil'
closed_at = None
created_at = <Date 2015-02-22.00:58:10.903>
labels = ['type-feature', 'library']
title = 'Expose http.cookiejar.split_header_words()'
updated_at = <Date 2021-04-27.01:30:40.920>
user = 'https://github.com/vadmium'

bugs.python.org fields:

activity = <Date 2021-04-27.01:30:40.920>
actor = 'orsenthil'
assignee = 'orsenthil'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2015-02-22.00:58:10.903>
creator = 'martin.panter'
dependencies = []
files = []
hgrepos = []
issue_num = 23498
keywords = []
message_count = 1.0
messages = ['236397']
nosy_count = 4.0
nosy_names = ['orsenthil', 'r.david.murray', 'berker.peksag', 'martin.panter']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue23498'
versions = []

vadmium · 2015-02-22T00:58:08Z

I propose documenting split_header_words() so that it can be used to parse various kinds of HTTP-based header fields. Perhaps it should live in a more general module such as “http”, or in “email.policy.HTTP” (hinted at in bpo-3609). There may also be room for a better name, such as parse_header_attributes(), since splitting space-separated words is not its most important property.

The function takes a series of header field values, such as those returned by Message.get_all(name, failobj=()). Each field value may be a separate string, and a single string may itself hold several comma-separated values. From each value it parses name=value attributes separated by spaces or semicolons; an attribute without an equals sign gets the value None. Examples:

RFC 2965 Set-Cookie2 fields:
>>> import http.cookiejar
>>> from pprint import pprint
>>> cookies = (
...     'Cookie1="VALUE";Version=1;Discard, Cookie2="Same field";Version=1',
...     'Cookie3="Separate header field";Version=1',
... )
>>> pprint(http.cookiejar.split_header_words(cookies))
[[('Cookie1', 'VALUE'), ('Version', '1'), ('Discard', None)],
 [('Cookie2', 'Same field'), ('Version', '1')],
 [('Cookie3', 'Separate header field'), ('Version', '1')]]

RTSP 1.0 (RFC 2326) Transport header field:
>>> transport = 'RTP/AVP;unicast;mode="PLAY, RECORD", RTP/AVP/TCP;interleaved=0-1'
>>> pprint(http.cookiejar.split_header_words((transport,)))
[[('RTP/AVP', None), ('unicast', None), ('mode', 'PLAY, RECORD')],
 [('RTP/AVP/TCP', None), ('interleaved', '0-1')]]
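To tie this back to Message.get_all(), here is a minimal sketch of the intended usage. It assumes the header fields arrive through the email package's parser; the raw message text is made up for illustration and reuses the Set-Cookie2 values from above.

```python
import email
from http.cookiejar import split_header_words

# Illustrative raw header block: two Set-Cookie2 fields, the first of
# which carries two comma-separated cookies.
raw = (
    'Set-Cookie2: Cookie1="VALUE";Version=1;Discard, Cookie2="Same field";Version=1\n'
    'Set-Cookie2: Cookie3="Separate header field";Version=1\n'
    '\n'
)
msg = email.message_from_string(raw)

# failobj=() keeps the call safe when the header field is absent.
values = msg.get_all('Set-Cookie2', failobj=())

parsed = split_header_words(values)
for attrs in parsed:
    print(attrs)
```

Each printed list holds the attributes of one comma-separated element, regardless of which header line it came from.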

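The space handling seems to be an attempt to parse header fields like WWW-Authenticate, but the function mixes up the parameters when given this example from RFC 7235. Each comma starts a new group, so the parameters of the Newauth challenge end up split across three groups: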

>>> auth = 'Newauth realm="apps", type=1, title="Login to \\"apps\\"", Basic realm="simple"'
>>> pprint(http.cookiejar.split_header_words((auth,)))
[[('Newauth', None), ('realm', 'apps')],
 [('type', '1')],
 [('title', 'Login to "apps"')],
 [('Basic', None), ('realm', 'simple')]]
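One possible workaround, sketched below, is to regroup the output afterwards. The helper group_challenges() is hypothetical and not part of the standard library. It assumes that a group starting with a bare token (value None) begins a new challenge and that any other group holds further parameters of the previous challenge. That heuristic covers the example above but is not a full RFC 7235 parser (it ignores token68 credentials, for instance).

```python
from http.cookiejar import split_header_words

def group_challenges(header_values):
    """Hypothetical helper: regroup split_header_words() output by scheme."""
    challenges = []
    for group in split_header_words(header_values):
        name, value = group[0]
        if value is None:
            # Assumption: a bare leading token is an auth scheme name.
            challenges.append((name, dict(group[1:])))
        elif challenges:
            # Assumption: anything else continues the previous challenge.
            challenges[-1][1].update(group)
        # Groups seen before any scheme name are dropped.
    return challenges

auth = 'Newauth realm="apps", type=1, title="Login to \\"apps\\"", Basic realm="simple"'
challenges = group_challenges([auth])
for scheme, params in challenges:
    print(scheme, params)
```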

Despite that, the function is still very useful for parsing many kinds of header fields that use semicolons. All the alternatives in the standard library that I know of have disadvantages:

* cgi.parse_header() does not split comma-separated values apart, and ignores any attribute without an equals sign, such as “Discard” and “unicast” above
* email.message.Message.get_params() and get_param() do not split comma-separated values either, and parsing header values other than the first one in a Message object is awkward
* email.headerregistry.ParameterizedMIMEHeader looks relevant, but I couldn’t figure out how to use it
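As a rough illustration of the get_params() point, the sketch below runs the RTSP Transport value from above through both functions. It assumes a plain email.message.Message with the default policy, and it compares only the number of results and the attribute names.

```python
from email.message import Message
from http.cookiejar import split_header_words

transport = 'RTP/AVP;unicast;mode="PLAY, RECORD", RTP/AVP/TCP;interleaved=0-1'

msg = Message()
msg['Transport'] = transport

# get_params() splits on semicolons only, giving one flat list.
flat = msg.get_params(header='Transport')

# split_header_words() also splits on the unquoted comma, giving one
# list of attributes per transport specification.
grouped = split_header_words([transport])

print(len(flat), [name for name, _ in flat])
print(len(grouped), [[name for name, _ in group] for group in grouped])
```

With get_params(), the second transport specification never appears as an attribute name of its own, because the comma is not treated as a separator.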

vadmium added the stdlib (Python modules in the Lib dir) and type-feature (A feature request or enhancement) labels Feb 22, 2015

orsenthil self-assigned this Apr 27, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022
