classification
Title: should email.utils.parseaddr treat a@b. as invalid email ?
Type: behavior Stage: resolved
Components: email Versions:
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: barry, eric.smith, jpic, r.david.murray
Priority: normal Keywords:

Created on 2019-07-03 10:12 by jpic, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg347207 - Author: jpic (jpic) Date: 2019-07-03 10:12
Following up on bpo-34155[0] and PR#13079[1], which changes:

    >>> parseaddr('a@malicious@good')

From returning:

    ('', 'a@malicious')

To return:

    ('', '')

With that change, parseaddr behaves more closely to its documentation:

email.utils.parseaddr(address)
Parse address – which should be the value of some address-containing field such as To or Cc – into its constituent realname and email address parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of ('', '') is returned.

The pull request discussion suggested that it would be good to open a new bpo to discuss changing the following behaviour:

    parseaddr('a@b.')

From returning:

    ('', 'a@b.')

To return a tuple of empty strings as well.

We have not found an RFC stating that `a@b.` is not a valid email address; however, RFC 1034 states that dots separate labels:

    When a user needs to type a domain name, the length of each label is
    omitted and the labels are separated by dots (".").

As such, my understanding is that a valid domain must not end with a dot.

[0] https://bugs.python.org/issue34155
[1] https://github.com/python/cpython/pull/13079
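
The cases above can be checked on any given interpreter with a minimal sketch like the following. The helper name `parse_failed` is hypothetical; only `email.utils.parseaddr` comes from the standard library. The results for `a@malicious@good` and `a@` depend on the Python version in use, so the loop only prints what it finds rather than assuming an outcome.

```python
from email.utils import parseaddr


def parse_failed(address):
    """Return True if parseaddr reports a failed parse, i.e. ('', '').

    Hypothetical helper for illustration; not part of email.utils.
    """
    return parseaddr(address) == ('', '')


if __name__ == "__main__":
    # Print the raw result for each input discussed in this issue.
    for candidate in ('a@malicious@good', 'a@b.', 'a@'):
        print(repr(candidate), '->', parseaddr(candidate),
              'failed:', parse_failed(candidate))
```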
msg347219 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2019-07-03 10:59
RFC 1034 defines absolute domain names as ending with a dot:

------------
When a user needs to type a domain name, the length of each label is omitted and the labels are separated by dots (".").  Since a complete domain name ends with the root label, this leads to a printed form which ends in a dot.  We use this property to distinguish between:

   - a character string which represents a complete domain name
     (often called "absolute").  For example, "poneria.ISI.EDU."

   - a character string that represents the starting labels of a
     domain name which is incomplete, and should be completed by
     local software using knowledge of the local domain (often
     called "relative").  For example, "poneria" used in the
     ISI.EDU domain.
------------

I'll admit that it isn't common to specify absolute domain names, and many resolvers treat a domain name with an internal dot, but no terminal dot, as an absolute name.
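
To make the absolute/relative distinction from the RFC 1034 passage concrete, here is a small sketch. The function name `split_domain` is hypothetical, and it only models the printed form (labels separated by dots, a trailing dot standing for the root label); it does no validation of label contents or lengths.

```python
def split_domain(name):
    """Split a printed domain name into (labels, is_absolute).

    Simplified reading of RFC 1034: a trailing dot stands for the
    empty root label, which makes the name absolute. Hypothetical
    helper; label syntax and length limits are not checked.
    """
    is_absolute = name.endswith('.')
    body = name[:-1] if is_absolute else name
    labels = body.split('.') if body else []
    return labels, is_absolute


if __name__ == "__main__":
    print(split_domain('poneria.ISI.EDU.'))  # absolute form from the RFC
    print(split_domain('poneria'))           # relative form from the RFC
    print(split_domain('b.'))                # domain part of 'a@b.'
```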

I doubt that, in practice, any email addresses use a bare TLD as their domain.

There's some bpo issue where this was discussed in reference to the ipaddress module. I think the issue was canonicalizing names, and it was decided not to add a trailing dot to make them absolute. I realize that logic doesn't directly apply here.

In spite of "com." being a valid domain name, I think it's reasonable to reject it as the domain part of an email address. But there should be a comment in the code to that effect.
msg347221 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2019-07-03 11:09
Counterpoint: I just sent an email to "info@info.", and Thunderbird and my MTA (postfix) and my mail relay all accepted it. I guess it's possible that a TLD (especially one of the newer ones) could accept email addresses in the TLD itself.

It turns out that "info@info." isn't a mailbox as of right now, but I think it's a valid and accepted address, at least by the software listed above. It could be a valid mailbox; it just isn't in this particular case.

Maybe the more conservative approach is to say that "info@info." (and "a@b.", etc.) should be considered valid email addresses.

If you were actually trying to send email to a mailbox in the "info" TLD, I think most resolvers would resolve "info" as a relative domain name, which isn't what we'd want to happen: you'd have to specify the domain as "info.".
msg347225 - Author: jpic (jpic) Date: 2019-07-03 12:42
Thanks a heap Eric, I feel a bit silly I missed it.

Closing the issue as not a bug; please feel free to reopen if necessary.
msg347856 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2019-07-13 21:37
Right, those absolutely are valid addresses.  A resolver will normally look up a name with an internal dot first as if it were an FQDN, but if it does so and does not get an answer it will then look it up again as a "local" address (appending in turn the strings from the 'search' directive in resolv.conf or equivalent) *if* it does not end in a final dot.  If it does end in a final dot, no further lookup as local is done.

While it isn't *normal* to send email to a TLD using a trailing dot, it is *legal*.  In theory the address 'postmaster@com.' ought to be a valid email address (I doubt that it actually is, though). On the other hand, I will be very surprised if *all other* TLDs are without valid email addresses, especially the new ones.  It is also easy to imagine an environment using email with private single label domain names using trailing dots specifically to suppress appending of search domains for sandboxing reasons.  Thus the email library must support it as valid, both for RFC reasons and for practical reasons.
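
The lookup order described above can be sketched as a pure function. This is a simplified model under stated assumptions, not the behavior of any particular resolver: the name `candidate_lookups` is hypothetical, the search list is passed in explicitly instead of being read from resolv.conf, and the ordering for single-label names (search domains first, then the bare name) is an assumption that the thread does not spell out.

```python
def candidate_lookups(name, search_domains):
    """List the absolute names a stub resolver might try, in order.

    Simplified model (assumptions, not a real resolver):
    - a trailing dot means the name is absolute: try it as-is only;
    - a name with an internal dot is tried as-is first, then with
      each search domain appended;
    - a single-label name is tried with each search domain first,
      then as-is (assumed ordering).
    """
    if name.endswith('.'):
        return [name]
    as_is = name + '.'
    with_search = [f"{name}.{suffix.rstrip('.')}." for suffix in search_domains]
    if '.' in name:
        return [as_is] + with_search
    return with_search + [as_is]


if __name__ == "__main__":
    search = ['corp.example']  # hypothetical 'search' directive value
    for name in ('info.', 'mail.info', 'info'):
        print(name, '->', candidate_lookups(name, search))
```

Under this model, only the trailing-dot form suppresses the search list entirely, which is the sandboxing use case mentioned above.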
msg347858 - Author: jpic (jpic) Date: 2019-07-13 22:08
Thanks for the heads up.

There is one last case where parseaddr should perhaps return a tuple of empty strings. Currently:

    >>> parseaddr('a@')
    ('', 'a@')

Is this worth changing?
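
Independently of how parseaddr treats this input on a given Python version, a caller that needs a non-empty domain can check for one itself. The sketch below is one possible approach; the helper name `has_nonempty_domain` is hypothetical, and it deliberately accepts trailing-dot domains such as `b.`, in line with the conclusion reached above.

```python
from email.utils import parseaddr


def has_nonempty_domain(address):
    """Return True if the parsed address has text on both sides of '@'.

    Hypothetical caller-side check; it does not validate the domain
    beyond requiring that it is not empty.
    """
    _realname, addr_spec = parseaddr(address)
    # rpartition splits on the last '@'; all three parts must be non-empty.
    local, sep, domain = addr_spec.rpartition('@')
    return bool(local and sep and domain)


if __name__ == "__main__":
    for candidate in ('a@', 'a@b.', 'a@b'):
        print(repr(candidate), '->', has_nonempty_domain(candidate))
```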
History
Date                 User            Action  Args
2022-04-11 14:59:17  admin           set     github: 81673
2019-07-13 22:08:56  jpic            set     messages: + msg347858
2019-07-13 21:37:55  r.david.murray  set     messages: + msg347856
2019-07-03 12:42:10  jpic            set     status: open -> closed; resolution: not a bug; messages: + msg347225; stage: resolved
2019-07-03 11:09:42  eric.smith      set     messages: + msg347221
2019-07-03 10:59:41  eric.smith      set     nosy: + eric.smith; messages: + msg347219
2019-07-03 10:12:20  jpic            create