Consider this reproducer:
>>> import email.headerregistry
>>> reg = email.headerregistry.HeaderRegistry()
>>> parsed = reg('Content-Disposition', """attachment; filename="foo-ae.html"; filename*=UTF-8''foo-%c3%a4.html""")
>>> parsed.defects
(InvalidHeaderDefect('duplicate parameter name; duplicate(s) ignored'),)
>>> parsed.params['filename']
'foo-ae.html'
However, the relevant section of RFC 5987 says:
https://greenbytes.de/tech/webdav/rfc5987.html#rfc.section.4.2
> This specification suggests that a parameter using the extended syntax takes precedence. This would allow producers to use both formats without breaking recipients that do not understand the extended syntax yet.
And RFC 6266 says:
https://greenbytes.de/tech/webdav/rfc6266.html#rfc.section.4.3
> Many user agent implementations predating this specification do not understand the "filename*" parameter. Therefore, when both "filename" and "filename*" are present in a single header field value, recipients should pick "filename*" and ignore "filename". This way, senders can avoid special-casing specific user agents by sending both the more expressive "filename*" parameter, and the "filename" parameter as fallback for legacy recipients (see Section 5 for an example).
Also see the related attfnboth and attfnboth2 test cases here:
http://test.greenbytes.de/tech/tc2231/#attfnboth
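To make the expected precedence concrete, here is a rough sketch of the behaviour the RFCs describe, written outside the email package. The helper name `preferred_filename` is hypothetical, and the parsing is deliberately naive: it assumes no ";" appears inside a quoted string and it does not handle RFC 2231 continuations. It is not how email.headerregistry parses parameters.

```python
from urllib.parse import unquote


def preferred_filename(header_value):
    """Return the decoded filename* if present, else filename, else None.

    Hypothetical helper for illustration only.
    """
    plain = extended = None
    # Naive split: assumes no ';' inside quoted parameter values.
    for part in header_value.split(";")[1:]:
        name, sep, value = part.strip().partition("=")
        if not sep:
            continue
        name = name.strip().lower()
        value = value.strip()
        if name == "filename*" and extended is None:
            # RFC 5987 ext-value: charset'language'percent-encoded-value
            if value.count("'") < 2:
                continue  # not a well-formed ext-value, skip it
            charset, _, rest = value.partition("'")
            _language, _, encoded = rest.partition("'")
            extended = unquote(encoded, encoding=charset or "utf-8",
                               errors="replace")
        elif name == "filename" and plain is None:
            plain = value.strip('"')
    # The extended form wins when both are present (RFC 6266, section 4.3).
    return extended if extended is not None else plain


value = """attachment; filename="foo-ae.html"; filename*=UTF-8''foo-%c3%a4.html"""
print(preferred_filename(value))
```

With the header from the reproducer this picks the `filename*` value rather than the plain `filename` fallback. An unknown charset label would raise LookupError in this sketch; a real implementation would presumably record a defect instead.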
I'm aware those two RFCs are specific to HTTP, but given that there's an "HTTP" policy and "utils.py" has some HTTP-specific date/time handling, I suppose correct handling of this might be in scope as well?