Expose http.cookiejar.split_header_words() #67686

Open
vadmium opened this issue Feb 22, 2015 · 1 comment

vadmium commented Feb 22, 2015

BPO 23498
Nosy @orsenthil, @bitdancer, @berkerpeksag, @vadmium

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/orsenthil'
closed_at = None
created_at = <Date 2015-02-22.00:58:10.903>
labels = ['type-feature', 'library']
title = 'Expose http.cookiejar.split_header_words()'
updated_at = <Date 2021-04-27.01:30:40.920>
user = 'https://github.com/vadmium'

bugs.python.org fields:

activity = <Date 2021-04-27.01:30:40.920>
actor = 'orsenthil'
assignee = 'orsenthil'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2015-02-22.00:58:10.903>
creator = 'martin.panter'
dependencies = []
files = []
hgrepos = []
issue_num = 23498
keywords = []
message_count = 1.0
messages = ['236397']
nosy_count = 4.0
nosy_names = ['orsenthil', 'r.david.murray', 'berker.peksag', 'martin.panter']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue23498'
versions = []

vadmium commented Feb 22, 2015

I propose documenting split_header_words() so that it can be used to parse various kinds of HTTP-based header fields. Perhaps it should live in a more general module like “http”, or “email.policy.HTTP” (hinted at in bpo-3609). There may also be room for a better name, such as parse_header_attributes(), since splitting space-separated words is not its most important property.

The function takes a series of header field values, as returned from Message.get_all(failobj=()). The field values may be separate strings and may also be comma-separated. It parses space- or semicolon-separated name=value attributes from each field value. Examples:

RFC 2965 Set-Cookie2 fields:
>>> import http.cookiejar
>>> from pprint import pprint
>>> cookies = (
...     'Cookie1="VALUE";Version=1;Discard, Cookie2="Same field";Version=1',
...     'Cookie3="Separate header field";Version=1',
... )
>>> pprint(http.cookiejar.split_header_words(cookies))
[[('Cookie1', 'VALUE'), ('Version', '1'), ('Discard', None)],
 [('Cookie2', 'Same field'), ('Version', '1')],
 [('Cookie3', 'Separate header field'), ('Version', '1')]]

RTSP 1.0 (RFC 2326) Transport header field:
>>> transport = 'RTP/AVP;unicast;mode="PLAY, RECORD", RTP/AVP/TCP;interleaved=0-1'
>>> pprint(http.cookiejar.split_header_words((transport,)))
[[('RTP/AVP', None), ('unicast', None), ('mode', 'PLAY, RECORD')],
 [('RTP/AVP/TCP', None), ('interleaved', '0-1')]]
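
The examples above pass literal tuples. As a rough sketch of the Message.get_all(failobj=()) route, the snippet below parses a made-up header block with email.message_from_string() and hands the field values to split_header_words(). The header text and cookie names are invented for illustration, and split_header_words() is still an undocumented helper, so relying on it is an assumption.

```python
import email
from http.cookiejar import split_header_words  # undocumented helper today

# Hypothetical header block: two Set-Cookie2 fields, the first one
# carrying two comma-separated cookies.
raw_headers = (
    "Set-Cookie2: a=1;Version=1, b=2;Version=1\n"
    "Set-Cookie2: c=3;Discard\n"
    "\n"
)
msg = email.message_from_string(raw_headers)

# failobj=() makes a missing header yield an empty sequence rather than None,
# which split_header_words() can iterate over without a special case.
cookie_groups = split_header_words(msg.get_all("Set-Cookie2", failobj=()))
missing_groups = split_header_words(msg.get_all("X-Absent", failobj=()))

for group in cookie_groups:
    print(group)
```

Each comma-separated cookie ends up in its own list, whether it came from the same header field or a separate one.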

The parsing of spaces seems to be an attempt to parse headers like WWW-Authenticate, although it mixes up the parameters when given this example from RFC 7235:

>>> auth = 'Newauth realm="apps", type=1, title="Login to \\"apps\\"", Basic realm="simple"'
>>> pprint(http.cookiejar.split_header_words((auth,)))
[[('Newauth', None), ('realm', 'apps')],
 [('type', '1')],
 [('title', 'Login to "apps"')],
 [('Basic', None), ('realm', 'simple')]]
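
One possible workaround, sketched here as an assumption rather than anything the library provides, is to regroup the result afterwards. The helper name regroup_challenges() is hypothetical, and the heuristic (a group that starts with a bare token opens a new challenge, any other group continues the previous one) does not cover every form allowed by RFC 7235, such as token68 credentials.

```python
from http.cookiejar import split_header_words  # undocumented helper today


def regroup_challenges(groups):
    """Hypothetical helper: merge groups back into (scheme, params) pairs.

    Heuristic: a group whose first item has no value starts a new
    challenge; any other group is treated as extra parameters of the
    challenge before it.
    """
    challenges = []
    for group in groups:
        name, value = group[0]
        if value is None:
            # Bare token such as "Newauth" or "Basic": a new scheme.
            challenges.append((name, dict(group[1:])))
        elif challenges:
            # name=value only: attach to the previous scheme.
            challenges[-1][1].update(group)
        else:
            raise ValueError("parameters found before any auth scheme")
    return challenges


auth = 'Newauth realm="apps", type=1, title="Login to \\"apps\\"", Basic realm="simple"'
challenges = regroup_challenges(split_header_words((auth,)))
print(challenges)
```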

Despite that, the function is still very useful for parsing many kinds of header fields that use semicolons. All the alternatives in the standard library that I know of have disadvantages:

  • cgi.parse_header() does not split comma-separated values apart, and ignores any attribute without an equals sign, such as “Discard” and “unicast” above

  • email.message.Message.get_params() and get_param() do not split comma-separated values either, and parsing header values other than the first one in a Message object is awkward

  • email.headerregistry.ParameterizedMIMEHeader looks relevant, but I couldn’t figure out how to use it
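
To make the comma-splitting difference concrete, here is a small comparison sketch using the Transport value from the RTSP example. It stores the value in a plain email.message.Message and calls get_params(), then parses the same string with split_header_words(). The variable names are illustrative only.

```python
from email.message import Message
from http.cookiejar import split_header_words  # undocumented helper today

transport = 'RTP/AVP;unicast;mode="PLAY, RECORD", RTP/AVP/TCP;interleaved=0-1'

msg = Message()
msg["Transport"] = transport

# get_params() yields one flat list of (name, value) pairs for the whole
# field value; it splits on semicolons only, so the second transport
# alternative does not appear as an entry of its own.
flat_params = msg.get_params(header="Transport")

# split_header_words() yields one list of pairs per comma-separated
# alternative.
grouped = split_header_words((transport,))

print(flat_params)
print(grouped)
```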

@vadmium vadmium added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Feb 22, 2015
@orsenthil orsenthil self-assigned this Apr 27, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022