Author r.david.murray
Recipients barry, mkaiser, r.david.murray
Date 2019-12-13.19:44:50
Message-id <1576266291.14.0.80225077233.issue39040@roundup.psfhosted.org>
Content
That header is *completely* non-RFC-compliant. If Gmail generated that header, there is something very wrong in Google-land :(

The RFC compliant formatting for that header looks like this:

Content-Disposition: attachment;
 filename*=utf-8''Schulbesuchsbest%C3%A4ttigung.pdf

You will note that this is nothing like encoded word format. Encoded words are not valid inside quoted strings, and quoted strings can't be used for MIME header parameter values if there are non-ASCII characters involved. Nor can encoded words.
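To make the compliant form concrete, here is a minimal sketch using only the standard library. It builds the RFC 2231 extended value with `email.utils.encode_rfc2231` and then parses a header carrying it under `policy.default`. The variable names (`filename`, `encoded`, `raw`, `decoded`) are illustrative only and not part of the original report.

```python
from email import message_from_string, policy
from email.utils import encode_rfc2231

filename = "Schulbesuchsbestättigung.pdf"

# RFC 2231 extended value: charset'language'percent-encoded-bytes
# (the language part is left empty here).
encoded = encode_rfc2231(filename, charset="utf-8")
print(encoded)

# Put the value into a folded Content-Disposition header and parse it
# with the modern policy, which understands the filename*= form.
raw = (
    "Content-Disposition: attachment;\n"
    f" filename*={encoded}\n"
    "\n"
)
msg = message_from_string(raw, policy=policy.default)
decoded = msg["Content-Disposition"].params["filename"]
print(decoded)
```

Note the `filename*=` parameter name and the absence of quotes: the charset and percent-encoding carry the non-ASCII bytes, so no encoded word is involved.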

Now, all that said, there is an obvious rule that can be followed to understand what that header is trying to convey, and the current parser already implements most of it (you will find comments about it in the parser, as well as defects being registered). So, a patch to _header_value_parser to fix the error recovery will be accepted. I've looked at the code to remind myself, but not deeply enough to be *sure* where the changes need to be made. There are two possibilities I see off the bat (and both may need fixing): get_bare_quoted_string and get_parameter. Either or both of those may be forgetting that whitespace between encoded words should be dropped.
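The recovery rule itself (decode the encoded words and drop the whitespace that separates adjacent ones) can be sketched outside the parser. This is only an illustration of the rule, not the proposed patch to _header_value_parser: it leans on the legacy `email.header.decode_header`, which already drops whitespace between adjacent encoded words. The header value `BAD_VALUE`, including where the filename is split into two encoded words, and the helper `recover_filename` are hypothetical.

```python
from email import policy
from email.header import decode_header, make_header

# Hypothetical non-compliant value: encoded words inside a quoted string.
# The split point between the two encoded words is invented for illustration.
BAD_VALUE = (
    'attachment; filename="=?UTF-8?Q?Schulbesuchsbest=C3=A4tti?= '
    '=?UTF-8?Q?gung.pdf?="'
)


def recover_filename(quoted: str) -> str:
    """Decode encoded words, dropping whitespace between adjacent ones."""
    inner = quoted.strip('"')
    return str(make_header(decode_header(inner)))


quoted_filename = BAD_VALUE.split("filename=", 1)[1]
recovered = recover_filename(quoted_filename)
print(recovered)

# What the modern parser makes of the same value. The recovered parameter
# and the exact defects depend on the Python version, so no output is shown.
header = policy.default.header_factory("Content-Disposition", BAD_VALUE)
print(header.params.get("filename"))
for defect in header.defects:
    print(type(defect).__name__, defect)
```

Comparing `recovered` with what `header.params` reports on a given Python version is one quick way to see whether the parser's error recovery still leaves stray whitespace in the filename.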