Issue 43530: email.parser.BytesParser failed to parse mail when it is with BOM

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/87696

classification

Title:	email.parser.BytesParser failed to parse mail when it is with BOM
Type:	behavior	Stage:
Components:	Library (Lib)	Versions:	Python 3.9

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	tzing
Priority:	normal	Keywords:

Created on 2021-03-17 15:45 by tzing, last changed 2022-04-11 14:59 by admin.

Messages (1)
msg388929 - (view)

Author: Tim Shih (tzing)

Date: 2021-03-17 15:45

Python's built-in `email.parser.BytesParser` fails to parse a message correctly when the bytes start with a BOM.

I am not 100% sure, but the issue seems to be caused by `FeedParser._parsegen` failing to match any header line after the data is decoded.
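One way to check this suspicion through the public API is to look at what the parser records for a BOM-prefixed message. The sketch below assumes a UTF-8 BOM and a made-up two-line message (neither comes from the report), and only inspects the parsed headers and the defect list.

```python
from email.parser import BytesParser

# Hypothetical minimal message, prefixed with a UTF-8 BOM (assumption:
# the report does not say which BOM-carrying encoding was used).
data = b"\xef\xbb\xbf" + b"Subject: hello\n\nbody\n"

msg = BytesParser().parsebytes(data)

# If the first line is not recognized as a header, no headers are parsed
# and the parser records a defect on the message object.
header_names = msg.keys()
defect_types = [type(d).__name__ for d in msg.defects]

print(header_names)
print(defect_types)
```

If the header list is empty and a header/body separator defect is reported, that is consistent with the first line being rejected as a header, so the whole input ends up treated as body.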

Steps to reproduce:
1. Get an email sample. Any file from https://github.com/python/cpython/tree/master/Lib/test/test_email/data will do. I use msg_01.txt in the following code.
2. Re-encode the mail sample to an encoding with a BOM.
3. Use `email.parser.BytesParser` to parse it.

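```py
# Import the submodule explicitly; a bare `import email` may not expose `email.parser`.
import email.parser

# msg_01.txt is the sample after re-encoding it with a BOM (step 2).
with open('msg_01.txt', 'rb') as fp:
    msg = email.parser.BytesParser().parse(fp)
print(msg.get('Message-ID'))
```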

Expected output: `<15090.61304.110929.45684@aaa.zzz.org>`. Actual output: `None`.
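For readers without the re-encoded sample at hand, the following is a self-contained sketch of the same behavior, plus one possible workaround. It assumes a UTF-8 BOM and an in-memory stand-in message; `RAW` and `parse_skipping_bom` are hypothetical names for illustration, not part of the `email` package or of any proposed fix.

```python
import codecs
from email.parser import BytesParser

# Hypothetical stand-in for msg_01.txt; only the Message-ID is taken
# from the report.
RAW = (
    b"Message-ID: <15090.61304.110929.45684@aaa.zzz.org>\n"
    b"Subject: test\n"
    b"\n"
    b"body\n"
)


def parse_skipping_bom(data: bytes):
    """Parse message bytes, dropping one leading UTF-8 BOM if present."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return BytesParser().parsebytes(data)


with_bom = codecs.BOM_UTF8 + RAW

broken = BytesParser().parsebytes(with_bom)  # BOM left in place
fixed = parse_skipping_bom(with_bom)         # BOM removed before parsing

print(broken.get('Message-ID'))
print(fixed.get('Message-ID'))
```

This only handles the UTF-8 BOM. A sample re-encoded as UTF-16 or UTF-32 would need to be decoded and re-encoded before parsing, which this sketch does not attempt.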

History
Date	User	Action	Args
2022-04-11 14:59:42	admin	set	github: 87696
2021-03-17 15:45:56	tzing	create