
Author matt-davis
Recipients barry, matt-davis, r.david.murray
Date 2020-04-22.00:26:26
Message-id <1587515187.44.0.744042033479.issue40359@roundup.psfhosted.org>
Content
# Summary

When parsing emails with long attachment file names, part.get_filename() often returns a file name that contains \n or \r\n, left over from header folding.
It should strip those line breaks out.

# Steps to reproduce

I have attached a minimal working example.

The relevant part of the raw email is:

--_004_D6CEDE1EBD6645898F5643C0C6878005examplecom_
Content-Type: text/plain;
	name="an attachment with a very very very long super long file name which has
 many words and just keeps on going and going.txt"

# Expected output:

attachments = ["an attachment with a very very very long super long file name which has many words and just keeps on going and going.txt"]

Maybe I'm misreading the email RFC, but my interpretation is that the parser should unfold the header before extracting the file name, with something like:

raw = raw.replace('\r\n ', ' ').replace('\n ', ' ')
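As a slightly more general sketch of the same idea: RFC 2822 section 2.2.3 describes unfolding as removing any line break that is immediately followed by whitespace, which covers tabs as well as spaces. The helper name `unfold` below is hypothetical and is not part of the email package; it only illustrates the rule as I read it.

```python
import re

# Hypothetical helper, not part of the stdlib email package.
# Matches a CRLF or bare LF only when the next character is a space or tab,
# i.e. a folding line break in the sense of RFC 2822 section 2.2.3.
_FOLD = re.compile(r'\r?\n(?=[ \t])')


def unfold(value: str) -> str:
    """Remove folding line breaks, keeping the whitespace that follows them."""
    return _FOLD.sub('', value)


folded = 'a very long file name which has\r\n many words.txt'
print(unfold(folded))  # a very long file name which has many words.txt
```

Unlike the chained replace() calls, this keeps the original whitespace character after the break and leaves a line break that is not followed by whitespace untouched.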

# Actual output

attachments = ["an attachment with a very very very long super long file name which\n has many words and just keeps on going and going.txt"]

Note that I have seen other examples where the output includes \r\n, not just \n.
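For readers without the attachment, here is a minimal sketch of a reproduction. It assumes a single-part message built from the header shown above (the body text and variable names are made up for illustration). It also parses the same text with email.policy.default, which, as far as I can tell, unfolds header values before they are parsed; treat that comparison as an observation rather than a documented guarantee.

```python
import email
from email import policy

# Single-part message whose Content-Type header is folded inside the
# quoted "name" parameter, mirroring the raw excerpt in this report.
raw = (
    'Content-Type: text/plain;\n'
    '\tname="an attachment with a very very very long super long file name which has\n'
    ' many words and just keeps on going and going.txt"\n'
    '\n'
    'body\n'
)

# Default parsing uses the legacy compat32 policy.
legacy_name = email.message_from_string(raw).get_filename()

# Same input, parsed with the newer default policy.
modern_name = email.message_from_string(raw, policy=policy.default).get_filename()

print(repr(legacy_name))
print(repr(modern_name))
```

If the comparison holds on your Python version, passing policy=policy.default when parsing may be a simpler workaround than post-processing the file name.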

# Impact

I'm trying to write an email bot which saves attachments to a database, and also forwards on the emails.
My bot thinks that the filename includes a line break, which inevitably causes failures in my subsequent code.

# Relevant links:

The relevant part of the email RFC (RFC 2822, section 2.2.3, "Long Header Fields") is here: https://tools.ietf.org/html/rfc2822.html#section-2.2.3

This Stack Overflow answer seems relevant: https://stackoverflow.com/questions/3050298/parsing-email-with-python/3050374#3050374

Issue 3601 may be relevant, but doesn't seem exactly the same. It seems to be the reverse: that issue is about *constructing* emails with long headers, whereas mine is about *parsing* emails with long headers.
History
Date | User | Action | Args
2020-04-22 00:26:27 | matt-davis | set | recipients: + matt-davis, barry, r.david.murray
2020-04-22 00:26:27 | matt-davis | set | messageid: <1587515187.44.0.744042033479.issue40359@roundup.psfhosted.org>
2020-04-22 00:26:27 | matt-davis | link | issue40359 messages
2020-04-22 00:26:27 | matt-davis | create |