Message 114754 - Python tracker


Author	baikie
Recipients	baikie, ezio.melotti, lemburg, loewis, vstinner
Date	2010-08-23.22:48:14
Message-id	<20100823224813.GB5500@dbwatson.ukfsn.org>
In-reply-to	<1282514606.26.0.972107218307.issue9377@psf.upfronthosting.co.za>

Content
> Is this patch in response to an actual problem, or a theoretical problem?
> If "actual problem": what was the specific application, and what was the specific host name?

It's about environments, not applications - the local network may
be configured with non-ASCII bytes in hostnames (either in the
local DNS *or* in a different lookup mechanism - I mentioned
/etc/hosts as a simple example), or someone might deliberately
connect from a garbage hostname as a denial-of-service attack
against a server which tries to look it up with gethostbyaddr()
or a similar function (this may require a "non-strict" resolver
library, as noted above).

> If theoretical, I recommend to close it as "won't fix". I find it perfectly reasonable if Python's socket module gives an error if the hostname can't be clearly decoded. Applications that run into it as a result of gethostbyaddr should treat that as "no reverse name available".

There are two points here.  One is that the decoding can fail; I
do think that programmers will find this surprising, and the fact
that Python refuses to return what was actually received is a
regression compared to 2.x.
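
As a minimal illustration of the first point (a sketch, not code taken
from the patch): strict UTF-8 decoding raises on a byte such as 0xff,
whereas the surrogateescape error handler from PEP 383 keeps the
original bytes recoverable. The sample hostname and the choice of ASCII
as the base codec are assumptions made for this example.

```python
# Hypothetical hostname bytes containing a byte that is invalid in UTF-8.
raw = b"host\xff.example"

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    # Strict decoding fails, so the caller never sees the received name.
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate (U+DCxx)...
name = raw.decode("ascii", "surrogateescape")
# ...so encoding with the same codec and handler restores the exact bytes.
roundtrip = name.encode("ascii", "surrogateescape")

print(strict_ok, ascii(name), roundtrip == raw)
```

The point of the round trip is that the string handed to the
application can be passed back to a later call without losing track
of which host was actually meant.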

The other is that the encoding and decoding are not symmetric -
hostnames are being decoded with UTF-8 but encoded with IDNA.
That means that when a decoded hostname contains a non-ASCII
character which is not prohibited by IDNA/Nameprep, that string
will, when used in a subsequent call, not refer to the hostname
that was actually received, because it will be re-encoded using a
different codec.
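
The asymmetry can be seen with the standard codecs alone. This is a
sketch under stated assumptions: the hostname is made up, and the two
codec calls stand in for what the socket module does on the way out
and on the way back in.

```python
# Hypothetical hostname stored as UTF-8 bytes, e.g. in /etc/hosts.
received = "bücher.example".encode("utf-8")

# What a lookup function would return after decoding with UTF-8.
decoded = received.decode("utf-8")

# What a subsequent call would send after encoding the same string with IDNA.
reencoded = decoded.encode("idna")

print(received)               # the bytes that were actually received
print(reencoded)              # the bytes that would be looked up next
print(received == reencoded)  # the two names differ
```

The IDNA form is an ASCII-compatible "xn--" label, so the second
lookup asks for a different name than the one that was returned.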

Attaching a refreshed version of try-surrogateescape-first.diff.
I've separated out the change to getnameinfo() as it may be
superfluous (issue #1027206).

Files
File name	Uploaded
try-surrogateescape-first-4.diff	baikie, 2010-08-23.22:48:10
try-surrogateescape-first-getnameinfo-4.diff	baikie, 2010-08-23.22:48:13

History
Date	User	Action	Args
2010-08-23 22:48:16	baikie	set	recipients: + baikie, lemburg, loewis, vstinner, ezio.melotti
2010-08-23 22:48:15	baikie	link	issue9377 messages
2010-08-23 22:48:14	baikie	create