
classification
Title: mbox From line wrongly detected
Type: behavior
Stage: resolved
Components: email, Library (Lib)
Versions: Python 3.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Andro, barry, maxking, r.david.murray
Priority: normal
Keywords:

Created on 2019-06-21 08:44 by Andro, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg346192 - Author: Andrew Bernard (Andro) Date: 2019-06-21 08:44
When parsing an mbox file, the Python mailbox library is confused by lines in the message body that start with 'From'. It creates a new, fragmentary message at each such line, which is wrong. The following sample code and input demonstrate this. Replacing 'From' in the message body with, say, ' From' results in correct parsing.

This defect prevents mbox files from being imported correctly into HyperKitty for GNU Mailman 3, to name one case where it is an impediment, because the imported messages become corrupt.

-- Python code
import sys
import mailbox

def main():
    print('mailbox read test')
    # Open the mbox file named on the command line.
    mbox = mailbox.mbox(sys.argv[1])
    # Print every message the parser finds, with a delimiter around each.
    for msg in mbox:
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        print(msg)
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')

if __name__ == "__main__":
    main()


--- sample mbox with one message

From  Fred Nurk <fred.nurks@nowhere.org> Wed, 8 Dec 1999 14:45:02 -0400
Date:         Wed, 8 Dec 1999 14:45:02 -0400
From:         Fred Nurk <fred.nurk@inowhere.org>
Subject:      Testing mbox in Python


 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce semper
 tempus augue at consectetur. Morbi eu nunc magna. Nulla placerat,
 eros in mollis finibus, dui risus ultrices tortor, non tincidunt nibh
 odio at augue. Quisque quis mauris neque. Curabitur ac accumsan
 neque. Maecenas sed mauris non justo sagittis finibus vel vel
 ex. Maecenas quis rutrum libero. Curabitur ex ante, tincidunt in
 velit at, egestas lobortis quam. Praesent tempus at dui ut
 volutpat. Nullam in rhoncus massa, id malesuada tortor. Suspendisse
 at cursus ex. Phasellus vitae pulvinar eros. Ut euismod dapibus
 libero, ultricies tempor leo accumsan ac. Etiam vestibulum, urna eget
 interdum eleifend, nulla nulla eleifend lacus, at lacinia neque nisi
 non velit.

From sed vehicula venenatis dui at ultricies. Pellentesque vehicula
vulputate nibh nec aliquet. Vestibulum pretium velit id libero
porttitor, sed facilisis metus fermentum. Donec vestibulum, sapien non
convallis sodales, justo libero volutpat dui, ut luctus odio nisi eget
sapien. In viverra libero gravida arcu euismod, non sollicitudin massa
auctor. Pellentesque vitae laoreet nisi. In eros massa, pretium at
condimentum eu, molestie ut tortor. Suspendisse faucibus felis sem, et
fringilla urna consectetur molestie. Integer suscipit, orci sed
convallis maximus, velit purus tempus dui, id egestas tortor erat
auctor dui. Nulla fermentum tellus ut odio elementum, vel bibendum mi
imperdiet. Proin sed auctor purus. Orci varius natoque penatibus et
magnis dis parturient montes, nascetur ridiculus mus. Nullam non arcu
ex. Duis dapibus nunc in urna dapibus, sit amet interdum lectus
tincidunt.

Fred

--
msg346201 - Author: Andrew Bernard (Andro) Date: 2019-06-21 09:57
Not really a bug. This results from the looseness of the mbox format and the lack of a standard for it. There is nothing Python can do about it.
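
The looseness referred to here is that an mbox reader conventionally treats any line beginning with "From " as the start of a new message. The following is a minimal sketch of that behavior, assuming the standard-library `mailbox.mbox` reader; the shortened sample text and the helper name `count_messages` are illustrative and not taken from the report.

```python
import mailbox
import os
import tempfile

# Shortened stand-in for the reporter's sample: one message whose body
# contains a line that begins with "From ".
RAW = (
    "From fred@example.org Wed Dec  8 14:45:02 1999\n"
    "Subject: Testing mbox in Python\n"
    "\n"
    "First paragraph.\n"
    "\n"
    "From sed vehicula venenatis dui at ultricies.\n"
    "\n"
    "Fred\n"
)


def count_messages(text):
    """Write text to a temporary mbox file and count the messages found.

    Hypothetical helper, for illustration only.
    """
    fd, path = tempfile.mkstemp(suffix=".mbox")
    try:
        with os.fdopen(fd, "w", newline="\n") as f:
            f.write(text)
        box = mailbox.mbox(path, create=False)
        try:
            return len(box)
        finally:
            box.close()
    finally:
        os.remove(path)


# Body line left as-is: the reader sees an extra message boundary.
unescaped = count_messages(RAW)
# Body line escaped as ">From ": the reader sees a single message.
escaped = count_messages(RAW.replace("\nFrom sed", "\n>From sed"))
print(unescaped, escaped)
```

With the body line left unescaped the reader should report two messages, and with it escaped as ">From " it should report one. The escaping has to happen when the file is written, because the reader cannot tell a body line from a separator.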
msg347591 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-07-10 01:27
This problem is the whole reason "mangle_from" exists in the email library...
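
For reference, the setting is spelled `mangle_from_` (with a trailing underscore) in `email.generator` and `email.policy`. The following is a minimal sketch, assuming the standard-library `Generator` and an illustrative message that is not from the report, of how it escapes body lines beginning with "From " when a message is serialized. The helper name `serialize` is hypothetical.

```python
from email.generator import Generator
from email.message import EmailMessage
from io import StringIO

# Illustrative message with a body line that begins with "From ".
msg = EmailMessage()
msg["From"] = "fred@example.org"
msg["Subject"] = "Testing mbox in Python"
msg.set_content("Lorem ipsum.\n\nFrom sed vehicula venenatis.\n\nFred\n")


def serialize(message, mangle):
    """Flatten message to text, with or without "From " escaping.

    Hypothetical helper, for illustration only.
    """
    out = StringIO()
    Generator(out, mangle_from_=mangle).flatten(message)
    return out.getvalue()


mangled = serialize(msg, True)   # body line is written as ">From sed ..."
plain = serialize(msg, False)    # body line is written unchanged

for line in mangled.splitlines():
    print(line)
```

Only body lines are escaped; the "From:" header is untouched because it does not begin with "From " followed by a space. Note that `mangle_from_` defaults to True under the `compat32` policy and False under the other standard policies, so code that writes mbox files by hand may need to enable it explicitly.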
History
Date                 User            Action  Args
2022-04-11 14:59:17  admin           set     github: 81538
2019-07-10 01:27:54  r.david.murray  set     messages: + msg347591
2019-06-29 10:02:52  eric.smith      set     status: open -> closed; stage: resolved
2019-06-21 09:57:54  Andro           set     resolution: not a bug
2019-06-21 09:57:37  Andro           set     messages: + msg346201
2019-06-21 08:50:25  SilentGhost     set     components: + email
2019-06-21 08:50:17  SilentGhost     set     nosy: + barry, r.david.murray, maxking
2019-06-21 08:44:50  Andro           create