
Author vstinner
Recipients abarry, eryksun, ezio.melotti, paul.moore, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware
Date 2016-01-28.09:41:02
> Added comments on Rietveld.

Crap. It's easy to miss a compilation error on extensions :-/

I used "make && ./python -m test -v test_socket" to validate gethostbyaddr_encoding-2.patch and it succeeded.

Maybe the build should *fail* if an extension fails to compile?

The new patch should have fewer typos :-) I also checked for reference leaks using ./python -m test -R 3:3 test_socket => no leak.

> Why not use PyUnicode_DecodeFSDefault on all platforms? It is used in
> gethostname() on Unix.

I don't know which encoding is the best choice on UNIX. I prefer to move step by step and first fix an obvious bug on Windows that is blocking Émanuel (see his issue #26226). (Émanuel uses Émanuel-PC as his hostname, which is non-ASCII ;-))

I guess that UTF-8 works in most cases on UNIX, whereas using the locale encoding can introduce regressions if the hostname is non-ASCII. For example, decoding a non-ASCII hostname would fail with LANG=C, which forces an ASCII locale encoding.

Issue #9377 proposes more advanced code to choose the encoding used to decode hostnames. Sorry, I haven't followed that issue recently, so I don't know if it proposes to use surrogateescape and/or IDNA.
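
A minimal sketch of the encoding concern above, in pure Python rather than the C code in the patch. It assumes the OS hands back the hostname as UTF-8 bytes (an assumption, not something the patch guarantees), and decode_hostname is a hypothetical helper, not a function from socketmodule.c. It compares a UTF-8 decode, a strict ASCII decode (roughly what an ASCII locale encoding would imply without a lenient error handler), and the surrogateescape error handler:

```python
# Hypothetical raw hostname bytes; assumes the OS returns UTF-8 (an assumption).
raw = "Émanuel-PC".encode("utf-8")


def decode_hostname(data: bytes, encoding: str, errors: str = "strict") -> str:
    """Hypothetical helper: decode raw hostname bytes with a given codec."""
    return data.decode(encoding, errors)


# Decoding as UTF-8 recovers the original name.
utf8_name = decode_hostname(raw, "utf-8")

# A strict ASCII decode rejects the non-ASCII bytes.
try:
    decode_hostname(raw, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# surrogateescape does not raise: undecodable bytes become lone surrogates,
# and encoding with the same handler restores the original bytes.
escaped = decode_hostname(raw, "ascii", "surrogateescape")
round_tripped = escaped.encode("ascii", "surrogateescape")
```

With surrogateescape the decoded string is not the readable name, but it round-trips back to the same bytes, which is the trade-off any choice of encoding on UNIX would have to weigh.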

I prefer to discuss the encoding used on UNIX in a new issue (or, better, continue the existing discussion on issue #9377?).

Date                 User      Action  Args
2016-01-28 09:41:03  vstinner  set     recipients: + vstinner, paul.moore, tim.golden, ezio.melotti, zach.ware, serhiy.storchaka, eryksun, steve.dower, abarry
2016-01-28 09:41:03  vstinner  set     messageid: <>
2016-01-28 09:41:03  vstinner  link    issue26227 messages
2016-01-28 09:41:02  vstinner  create