On Windows, sock_decode_hostname() in socketmodule.c currently uses PyUnicode_DecodeFSDefault(), which decodes from UTF-8 by default since PEP 529.

I understand that the ANSI code page should be used instead of UTF-8.

Would it work to use PyUnicode_DecodeLocale(name, "surrogatepass")? It's implemented with mbstowcs(), but I don't recall which encoding it uses on Windows.

Or can we use PyUnicode_DecodeMBCS(name, strlen(name), "surrogatepass")?
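
To illustrate the mismatch at the Python level, here is a minimal sketch. It assumes cp1252 as a stand-in for the ANSI code page (the "mbcs" codec only exists on Windows) and uses a made-up host name; try_decode() is a hypothetical helper, not something in socketmodule.c.

```python
# Assumption: cp1252 stands in for the Windows ANSI code page.
# The real "mbcs" codec is only available on Windows.
ANSI_CODEC = "cp1252"

# Hypothetical host name, encoded the way the C library would return it
# on a system whose ANSI code page is cp1252.
raw_name = "caf\u00e9-pc".encode(ANSI_CODEC)


def try_decode(raw, encoding):
    """Return the decoded string, or None if the bytes are invalid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The ANSI-encoded bytes are not valid UTF-8, so the UTF-8 decode fails,
# while decoding with the matching code page round-trips the name.
print("utf-8 :", try_decode(raw_name, "utf-8"))
print("ansi  :", try_decode(raw_name, ANSI_CODEC))
```

The byte 0xE9 is a UTF-8 lead byte that is not followed by a continuation byte here, which is why the UTF-8 decode is rejected.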


I understand that setting the PYTHONLEGACYWINDOWSFSENCODING environment variable to 1 should work around the issue.
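
A quick way to check what the workaround changes is to start a child interpreter with and without the variable and compare sys.getfilesystemencoding(). This is only a sketch: child_fs_encoding() is a hypothetical helper, and the variable is ignored on non-Windows platforms, so the two values are only expected to differ on Windows.

```python
import os
import subprocess
import sys


def child_fs_encoding(legacy):
    """Report the filesystem encoding seen by a freshly started interpreter."""
    env = os.environ.copy()
    if legacy:
        env["PYTHONLEGACYWINDOWSFSENCODING"] = "1"
    else:
        env.pop("PYTHONLEGACYWINDOWSFSENCODING", None)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("default:", child_fs_encoding(legacy=False))
    print("legacy :", child_fs_encoding(legacy=True))
```

The variable has to be set before the interpreter starts, which is why the check runs in a subprocess rather than modifying os.environ in the current process.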
