
classification
Title: Faulty behaviour in email.utils.parseaddr if square brackets in subject
Type: behavior
Stage: resolved
Components: email
Versions: Python 3.5

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, r.david.murray, tom de wulf
Priority: normal
Keywords:

Created on 2017-11-17 11:06 by tom de wulf, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg306431 - Author: tom de wulf (tom de wulf) Date: 2017-11-17 11:06
Probably a parsing bug in email.utils.parseaddr.

How to recreate:

>>> import email.utils
>>> test = 'Subject: I am a bug [Random]\r\nFrom: someone <some@email.address>\r\n\r\n'
>>> email.utils.parseaddr(test)
('', 'I')
>>> email.utils.parseaddr(test.replace('[', '').replace(']',''))
('someone', 'some@email.address')

Expected behaviour: no need to remove the []'s
msg306435 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-11-17 14:07
parseaddr is for parsing the contents of an address header, not for parsing any additional text. So the correct way to call it is parseaddr('someone <some@email.address>').
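
As a minimal sketch of that point (not part of the original reply): the raw header block can be parsed first, and only the value of the From header passed to parseaddr. The variable names here are illustrative, and the sample text is the one from the report.

```python
import email.utils
from email.parser import HeaderParser

# Sample header block taken from the report above.
raw = 'Subject: I am a bug [Random]\r\nFrom: someone <some@email.address>\r\n\r\n'

# Split the block into individual headers first...
headers = HeaderParser().parsestr(raw)

# ...then give parseaddr only the contents of the address header.
name, addr = email.utils.parseaddr(headers['From'])
print(name, addr)
```

With this approach the square brackets in the Subject header never reach parseaddr, so they do not need to be stripped.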

In any case, please look into the new email policies, which provide a much more convenient API:

    >>> from email import message_from_bytes
    >>> from email.policy import default
    >>> m = message_from_bytes(b'Subject: I am a bug [Random]\r\nFrom: someone <some@email.address>\r\n\r\n', policy=default)
    >>> m['from']
    'someone <some@email.address>'
    >>> m['from'].addresses
    (Address(display_name='someone', username='some', domain='email.address'),)
    >>> m['from'].addresses[0].display_name
    'someone'
    >>> m['from'].addresses[0].username
    'some'
    >>> m['from'].addresses[0].addr_spec
    'some@email.address'
msg306436 - Author: tom de wulf (tom de wulf) Date: 2017-11-17 14:12
I get this data from an IMAP fetch call; see my code below:

    rv, data = imap.fetch(num, "(BODY[HEADER.FIELDS (FROM SUBJECT)])")
    if rv != 'OK':
        logging.error("Error getting message sender and subject (" + num.decode("ascii") + ")")
        return
    logging.info("Got message " + num.decode("ascii"))

    sender_subject = data[0][1].decode("utf-8")
    sender = email.utils.parseaddr(sender_subject.replace('[', '').replace(']',''))[1].replace("\r\n", "")

Thank you for providing this new API though, I will make sure to switch to that.
msg306438 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2017-11-17 14:15
Unfortunately the imaplib module in the stdlib doesn't provide a whole lot in the way of tools for parsing the IMAP data, just for sending it back and forth to the server.
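
One possible way to handle the fetched header bytes with the email package is sketched below. It assumes the fetch result has the shape used in the code above, where data[0][1] holds the raw header bytes. The helper name sender_and_subject and the sample response are hypothetical, made up for illustration.

```python
from email import message_from_bytes
from email.policy import default


def sender_and_subject(data):
    """Return (address, subject) from an imaplib-style fetch result.

    Assumes data[0] is a (response_line, payload) tuple whose payload
    contains the raw header bytes, as in the snippet above.
    """
    raw_headers = data[0][1]
    # Pass the bytes directly; no manual decode or bracket stripping needed.
    msg = message_from_bytes(raw_headers, policy=default)

    from_header = msg['from']
    addresses = from_header.addresses if from_header is not None else ()
    sender = addresses[0].addr_spec if addresses else ''

    subject = msg['subject'] or ''
    return sender, str(subject)


# Hypothetical fetch result, shaped like what imap.fetch() returned above.
sample = [
    (b'1 (BODY[HEADER.FIELDS (FROM SUBJECT)] {68}',
     b'Subject: I am a bug [Random]\r\nFrom: someone <some@email.address>\r\n\r\n'),
    b')',
]
print(sender_and_subject(sample))
```

Because the bytes go straight to message_from_bytes, the character decoding and header unfolding are left to the email package rather than done by hand.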
History
Date                 User            Action  Args
2022-04-11 14:58:54  admin           set     github: 76239
2017-11-17 14:15:34  r.david.murray  set     messages: + msg306438
2017-11-17 14:12:07  tom de wulf     set     messages: + msg306436
2017-11-17 14:07:11  r.david.murray  set     status: open -> closed; resolution: not a bug; messages: + msg306435; stage: resolved
2017-11-17 11:06:43  tom de wulf     create