Issue497736
Created on 2001-12-30 01:20 by eperez, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Files | ||
---|---|---|
File name | Uploaded | Description |
python2.2-smtplib-correcthelo.diff | eperez, 2001-12-30 01:20 | python2.2-smtplib-correcthelo |
domainlit.diff | barry, 2002-03-25 16:00 | |
Messages (17) | |||
---|---|---|---|
msg38600 | Author: Eduardo Perez Ureta (eperez) | Date: 2001-12-30 01:20 | |
If the machine you are sending mail from doesn't have a FQDN and the mail server requires a FQDN in HELO, the current code will fail. Resolving the name is a very bad idea:
- It's something from another layer (DNS/IP), not from SMTP.
- It breaks when the name of the computer is not a FQDN (as with many dial-ins) and the SMTP server does strict EHLO/HELO checking, as stated before.
- It breaks computers with a TCP tunnel to another host from which the connection originates, if the relay does strict EHLO/HELO checking.
- It breaks computers using NAT: the host that the server sees is not the one that sends the message, if the relay does strict EHLO/HELO checking.
- It's considered spyware, as you are sending information some companies or people don't want to disclose: the internal structure of the network.
No important mail client resolves the name. Look at Netscape Messenger or KMail. In fact, KMail and Perl's Net::SMTP do exactly what my patch does. Please don't resolve the names, as this approach works and the most widely used email clients do this. I am attaching the bugfix. |
|||
msg38601 | Author: Guido van Rossum (gvanrossum) | Date: 2001-12-30 02:24 | |
Seems reasonable to me, but I lack the SMTP knowledge to understand all the issues. Assigned to Barry Warsaw for review. (Barry: Eduardo found a similar privacy violation in ftplib, which I fixed. You might also ask Thomas Wouters for a review of the underlying idea.) |
|||
msg38602 | Author: Neil Schemenauer (nascheme) | Date: 2002-03-24 01:42 | |
This patch looks correct in theory to me. Trying to find the FQDN is wrong, IMHO. |
|||
msg38603 | Author: Guido van Rossum (gvanrossum) | Date: 2002-03-24 12:06 | |
Since Barry has not expressed any interest in this patch, reassigning to Neil, and set status to Accepted. |
|||
msg38604 | Author: Neil Schemenauer (nascheme) | Date: 2002-03-24 15:37 | |
I'm rejecting this patch. RFC 1123 requires that the name sent after the HELO verb is "a valid principal host domain name for the client host". While RFC 1123 goes on to prohibit HELO-based rejections, it is possible that some servers do reject mail based on HELO. Thus, changing the hostname sent to "localhost.localdomain" could potentially break scripts that currently work. The concern raised is still valid, however. Finding the FQDN using gethostbyname() is unreliable. To address this concern I've added a "local_hostname" argument to the SMTP __init__ method. If provided, it is used as the local hostname for the HELO and EHLO verbs. |
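As a minimal sketch of the "local_hostname" argument described in this message: `smtplib.SMTP()` does not open a connection when no host is given, so the argument can be tried without a mail server. The name `client.example.org` is a made-up placeholder, not a value from this issue.

```python
import smtplib

# No host is passed, so the constructor does not connect to anything.
# "client.example.org" is a placeholder chosen for illustration only.
client = smtplib.SMTP(local_hostname="client.example.org")

# The value is stored on the instance and is what HELO/EHLO would send.
print(client.local_hostname)
```

A caller that wants the behavior requested in the original report could pass "localhost.localdomain" here instead.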
|||
msg38605 | Author: Eduardo Perez Ureta (eperez) | Date: 2002-03-24 18:39 | |
RFC 1123 was written 11 years ago when there weren't dial-ins, TCP tunnels, or NATs. This patch fixes scripts that run on computers that have the kinds of SMTP access explained above, and it doesn't break any script I know about. Could you tell me cases where the current approach works and the proposed patch fails? I know the cases explained above where the current approach doesn't work and this patch works successfully. |
|||
msg38606 | Author: Neil Schemenauer (nascheme) | Date: 2002-03-24 21:51 | |
Did you read what I wrote?
220 cranky ESMTP Postfix (Debian/GNU)
HELO localhost.localdomain
250 cranky
MAIL FROM: <nas@arctrix.com>
250 Ok
RCPT TO: <nas@arctrix.com>
DATA
450 <localhost.localdomain>: Helo command rejected: Host not found
554 Error: no valid recipients
Bring it up again in another few years and we will change the default. |
|||
msg38607 | Author: Barry A. Warsaw (barry) | Date: 2002-03-25 04:00 | |
Sorry to take so long to respond on this one. RFC 2821 is the latest standard that smtplib.py should adhere to. Quoting: "[HELO and EHLO] are used to identify the SMTP client to the SMTP server. The argument field contains the fully-qualified domain name of the SMTP client if one is available. In situations in which the SMTP client system does not have a meaningful domain name (e.g., when its address is dynamically allocated and no reverse mapping record is available), the client SHOULD send an address literal (see section 4.1.3), optionally followed by information that will help to identify the client system." Thus, I believe that sending the FQDN is the right default, although socket.getfqdn() should be used for portability. Neil's patch is the correct one (although there's a typo in the docstring, which I'll fix). By default the fqdn is used, but the user has the option to supply the local hostname as an argument to the SMTP constructor. Since RFC 2821's admonition is that the client SHOULD use a domain literal if the fqdn isn't available, I'm happy to leave it up to the client to get any supplied argument right. If we wanted to be more RFC-compliant, SMTP.__init__() could possibly check socket.getfqdn() to see if the return value was indeed fully-qualified, and if not, craft a domain literal for the HELO/EHLO. Since this is a SHOULD and not a MUST, I'm happy with the current behavior, but if you want to provide a patch for better RFC compliance here, I'd be happy to review it. |
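The check suggested at the end of this message could look roughly like the sketch below. The function names are hypothetical, and treating "contains a dot" as "fully qualified" is an assumption made for illustration; the message does not say how the check should be done.

```python
import socket


def choose_helo_name(fqdn, ip_address):
    """Return fqdn if it looks fully qualified, else a bracketed literal.

    Assumption: a name containing a dot is treated as fully qualified.
    """
    if "." in fqdn:
        return fqdn
    return "[%s]" % ip_address


def default_helo_name():
    """Combine socket.getfqdn() with a literal fallback (hypothetical helper)."""
    fqdn = socket.getfqdn()
    addr = "127.0.0.1"  # used when the local host name cannot be resolved
    try:
        addr = socket.gethostbyname(socket.gethostname())
    except socket.gaierror:
        pass
    return choose_helo_name(fqdn, addr)


print(choose_helo_name("mail.example.org", "192.168.1.2"))
print(choose_helo_name("laptop", "192.168.1.2"))
```

Keeping the decision in a function that takes the name and address as arguments makes it possible to try the fallback without depending on the local DNS setup.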
|||
msg38608 | Author: Guido van Rossum (gvanrossum) | Date: 2002-03-25 15:23 | |
Sorry, but what's a domain literal? I think that it's better not to get the client involved in getting this right; for example, someone might write a useful tool that sends email around, and then someone else might try to use this tool from a machine that doesn't have a fqdn. The author might not have thought of this (rather uncommon) situation; the user might not have enough Python whizz to know how to fix it. I'd like to hear also what you think of Eduardo's opinion that sending the fqdn is a privacy violation of the same kind as ftplib defaulting to sending username@hostname as the default password for anonymous login (which we did fix). If *you* (Barry) think this is without merit, it must be without merit. :-) |
|||
msg38609 | Author: Barry A. Warsaw (barry) | Date: 2002-03-25 16:00 | |
Oh sorry. A domain literal is something like [192.168.1.2]; IOW, the IP address octets surrounded by square brackets. Should be easy enough to calculate. Attached is a proposed patch. As for the privacy violation, I don't think it's on the same level as the ftp issue because we're not divulging any information about the user. It could be argued that leaking the hostname might be enough to link the information to a specific user, and I might buy that argument, although it personally doesn't bother me too much (the IP address might be just as sufficient for linking and even NAT'd or DHCP'd addresses might be static enough to guess -- witness your own supposedly dynamic IP address :). And the IP will always be available via the socket peer. OTOH, Eduardo's claim isn't totally without merit. I'd like to be able to retain the ability to be properly RFC compliant, but could accept that the default be localhost.localdomain. If you (Guido) have a suggestion for an appropriate API for both these requirements, that would be great. |
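A small sketch of building such a literal from an address string. The helper name is hypothetical. The IPv4 form matches the example in the message; the "IPv6:" tag for IPv6 addresses comes from RFC 2821 section 4.1.3 and is not discussed in this thread.

```python
import ipaddress


def address_literal(ip):
    """Format an IP address string as an SMTP address literal."""
    parsed = ipaddress.ip_address(ip)  # raises ValueError for invalid input
    if parsed.version == 6:
        # RFC 2821 section 4.1.3 prefixes IPv6 literals with "IPv6:".
        return "[IPv6:%s]" % parsed
    return "[%s]" % parsed


print(address_literal("192.168.1.2"))
```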
|||
msg38610 | Author: Neil Schemenauer (nascheme) | Date: 2002-03-25 16:10 | |
There is no way that smtplib can automatically and reliably find the FQDN. socket.getfqdn() is a hack, IMHO. It doesn't really matter though. The chances of an email server rejecting email based on the domain name following the HELO verb are very small. I recall seeing only one in actual use. I still think the code is fine as it is. socket.getfqdn() always returns something. Most mail servers don't care what it returns. Changing the default to 'localhost.localdomain' doesn't really solve anything. In your example, the script would still not work for the user trying to send email through a misconfigured server. It would reject 'localhost.localdomain' just like it rejected whatever socket.getfqdn() returned. The only possible arguments for using 'localhost.localdomain' are that it's faster (doesn't require a DNS lookup) and that it gives away less information. It doesn't give away much information though. The remote server already has the sender's IP address. The hostname shouldn't mean very much. If someone is that paranoid they can pass 'localhost.localdomain' to SMTP.__init__. Eventually we should make 'localhost.localdomain' the default. Like I said, getfqdn() is a hack. We could probably make the change now and no one would care. I'm just being very conservative. |
|||
msg38611 | Author: Guido van Rossum (gvanrossum) | Date: 2002-03-25 17:16 | |
Neil: coping with a misconfigured server wasn't part of my scenario; only coping with a client that simply doesn't have a fqdn was. Some questions remain: (1) Why can't we use localhost.localdomain today? (2) Why is getfqdn() a hack? (Apart from it being in the wrong module.) Hm, I just thought of something. Why shouldn't gethostname() be used as the default? Why bother with getfqdn() at all? At least when gethostname() returns something inappropriate for a particular server, it can be fixed locally by root by fixing the hostname. (This may explain why you think getfqdn() is a hack.) Barry: an appropriate API could be to change the default for local_hostname in __init__ to "localhost.localdomain" but to leave the code that sticks in socket.getfqdn() (or maybe just socket.gethostname()) if the value is explicitly given as None or empty. |
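The API suggested to Barry at the end of this message can be sketched as below. This only illustrates the proposal as worded; the function name and the injectable `lookup` parameter are hypothetical and exist so the sketch can be tried without a DNS query.

```python
import socket


def resolve_local_hostname(local_hostname="localhost.localdomain",
                           lookup=socket.getfqdn):
    """Pick the HELO/EHLO name under the proposed default.

    The default is the fixed string; an explicit None or empty value
    asks for a lookup instead. `lookup` is a hypothetical hook.
    """
    if not local_hostname:
        return lookup()
    return local_hostname


print(resolve_local_hostname())
```

Under this scheme a script gets "localhost.localdomain" unless its author opts in to the lookup by passing None or an empty string.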
|||
msg38612 | Author: Neil Schemenauer (nascheme) | Date: 2002-03-25 17:31 | |
So much discussion for such a little issue. :-) A misconfigured server must be part of your scenario. It's the only case where the hostname makes any difference. Using localhost.localdomain will work fine on 99.99% of mail servers. For the remaining 0.01%, using socket.getfqdn() has a higher chance of working than using localhost.localdomain. If socket.getfqdn() can find a hostname that resolves back to the IP of the client side of the connection then it works. Using localhost.localdomain in that case will not work. If socket.getfqdn() cannot find the FQDN (due to NAT, tunnelling or whatever) things work just as well as if localhost.localdomain was used as a default. Changing the default to localhost.localdomain fixes nothing! getfqdn() is a hack because it relies on DNS. People always screw that up. :-) Regarding your suggested API change, I don't see how it would help. I doubt any code actually passes socket.getfqdn() to SMTP.helo(). |
|||
msg38613 | Author: Guido van Rossum (gvanrossum) | Date: 2002-03-25 17:41 | |
OK. So is socket.gethostname() better than socket.getfqdn() or not? |
|||
msg38614 | Author: Barry A. Warsaw (barry) | Date: 2002-03-25 18:04 | |
Hold on. We're conflating issues here. To address the privacy issue, "localhost.localdomain" should be used. I don't see anything else being an appropriate defense against identity leakage (but IMHO, it's a limited defense anyway because you'll *always* leak your IP address). To be "correct" IMO means adhering to RFC 2821 as closely as is possible. Which means use the fqdn if available, otherwise use the domain literal. See attached patch for that. If we don't want to be RFC-correct but we want to be liberal enough to handle misconfigured client systems, then gethostname() is probably fine, but so would be localhost.localdomain. If we want to be robust in the face of overly strict smtp servers, then I think you're in a losing battle because they may only accept fqdn's that are reverse resolvable. But that may be impossible for the (perhaps misconfigured) client to calculate. And if that's the case, then the client likely has bigger problems. My preference would be for the default to be RFC-correct (i.e. fqdn w/domain literal fallback), and allow overrides via method arguments, as the code with my proposed patch would implement. |
|||
msg38615 | Author: Guido van Rossum (gvanrossum) | Date: 2002-03-25 18:41 | |
I'm skeptical about the effectiveness of providing overrides through defaulted arguments; this is something the author of the program using smtplib must anticipate and give its user an option to override. (And because of that, I'm at best -0 on adding the local_hostname argument to the constructor, as Neil checked in.) I now agree that leaking the fqdn isn't much of a privacy breach. I agree that fqdn w/domain literal fallback is the best compromise. |
|||
msg38616 | Author: Barry A. Warsaw (barry) | Date: 2002-03-25 18:56 | |
Cool, I will apply my patch and update the documentation. I'll leave the default argument as Neil implemented. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:50 | admin | set | github: 35847 |
2001-12-30 01:20:06 | eperez | create |