Issue25023
Created on 2015-09-08 05:32 by sy LEE, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (5) | |||
---|---|---|---|
msg250157 - (view) | Author: grizlupo (sy LEE) | Date: 2015-09-08 05:32 | |
    >>> locale.setlocale(locale.LC_ALL, 'en')
    'en'
    >>> time.strftime('%a')
    'Tue'
    >>> locale.setlocale(locale.LC_ALL, 'ko')
    'ko'
    >>> time.strftime('%a')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: embedded null byte |
|||
msg250160 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-09-08 06:06 | |
What is your OS? On Ubuntu:

    >>> import locale, time
    >>> locale.setlocale(locale.LC_ALL, 'ko_KR.UTF-8')
    'ko_KR.UTF-8'
    >>> time.strftime('%a')
    '화'
    >>> locale.setlocale(locale.LC_ALL, 'ko_KR.eucKR')
    'ko_KR.eucKR'
    >>> time.strftime('%a')
    '화'
    >>> locale.setlocale(locale.LC_ALL, 'ko_KR')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 595, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
    >>> locale.setlocale(locale.LC_ALL, 'ko')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 595, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting |
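The sessions in this thread end in one of three ways: the locale name is rejected, `strftime` works, or `strftime` raises `ValueError`. A minimal sketch of a probe that reports which of these happens on a given platform might look like the following. The helper name `probe_weekday` and its status strings are illustrative assumptions, not part of any library.

```python
import locale
import time


def probe_weekday(locale_name):
    """Try time.strftime('%a') under locale_name and report the outcome.

    Hypothetical helper: returns a (status, detail) tuple.
    """
    # Calling setlocale with no second argument only queries the setting.
    saved = locale.setlocale(locale.LC_ALL)
    try:
        try:
            locale.setlocale(locale.LC_ALL, locale_name)
        except locale.Error as exc:
            # The platform does not know this locale name.
            return ("unsupported", str(exc))
        try:
            return ("ok", time.strftime("%a"))
        except ValueError as exc:
            # The failure reported in this issue surfaces here.
            return ("strftime-error", str(exc))
    finally:
        # Always restore the process-wide locale.
        locale.setlocale(locale.LC_ALL, saved)


if __name__ == "__main__":
    for name in ("C", "en", "ko", "ko_KR.UTF-8", "ko_KR.eucKR"):
        print(name, probe_weekday(name))
```

Because `setlocale` changes process-wide state, the sketch restores the previous setting before returning, so it can be run from a larger script without side effects.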
|||
msg250176 - (view) | Author: Eryk Sun (eryksun) * | Date: 2015-09-08 09:55 | |
It seems VC 14 has a bug here. In the new C runtime, strftime is implemented by calling wcsftime as follows:

    size_t const result = _Wcsftime_l(wstring.get(), maxsize, wformat.get(), timeptr, lc_time_arg, locale);
    if (result == 0)
        return 0;

    // Copy output from wide char string
    if (!WideCharToMultiByte(lc_time_cp, 0, wstring.get(), -1, string, static_cast<int>(maxsize), nullptr, nullptr))
    {
        __acrt_errno_map_os_error(GetLastError());
        return 0;
    }

    return result;

The WideCharToMultiByte call returns the number of bytes in the converted string, but strftime doesn't update the value of "result".

This worked correctly in the old CRT. For example, in 3.4 built with VC 10:

    >>> sys.version_info[:2]
    (3, 4)
    >>> locale.setlocale(locale.LC_ALL, 'kor_kor')
    'Korean_Korea.949'
    >>> time.strftime('%a')
    '\ud654'

Here's an overview of the problem in 3.5, stepped through in the debugger:

    >>> sys.version_info[:2]
    (3, 5)
    >>> locale.setlocale(locale.LC_ALL, 'ko')
    'ko'
    >>> time.strftime('%a')
    Breakpoint 0 hit
    ucrtbase!Wcsftime_l:
    000007fe`e9e6fd74 48895c2410      mov     qword ptr [rsp+10h],rbx ss:00000000`003df6d8=0000000000666ce0

wcsftime returns the output buffer length in wide characters:

    0:000> pt; r rax
    rax=0000000000000001

WideCharToMultiByte is called to convert the wide-character string to the locale encoding:

    0:000> pc
    ucrtbase!Strftime_l+0x17f:
    000007fe`e9e6c383 ff15dfa00200    call    qword ptr [ucrtbase!_imp_WideCharToMultiByte (000007fe`e9e96468)] ds:000007fe`e9e96468={KERNELBASE!WideCharToMultiByte (000007fe`fd631be0)}
    0:000> p
    ucrtbase!Strftime_l+0x185:
    000007fe`e9e6c389 85c0            test    eax,eax

This returns the length of the converted string (including the null):

    0:000> r rax
    rax=0000000000000003

But strftime ignores this value, and instead returns the wide-character string length, which gets passed to PyUnicode_DecodeLocaleAndSize:

    0:000> bp python35!PyUnicode_DecodeLocaleAndSize
    0:000> g
    Breakpoint 1 hit
    python35!PyUnicode_DecodeLocaleAndSize:
    00000000`5ec15160 4053            push    rbx
    0:000> r rdx
    rdx=0000000000000001

U+D654 was converted correctly to '\xc8\xad' (codepage 949):

    0:000> db @rcx l3
    00000000`007e5d20  c8 ad 00                                         ...

However, since (str[len] != '\0'), PyUnicode_DecodeLocaleAndSize errors out as follows:

    0:000> bd 0,1; g
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: embedded null byte

It works as expected if the length is manually changed to 2:

    >>> time.strftime('%a')
    Breakpoint 1 hit
    python35!PyUnicode_DecodeLocaleAndSize:
    00000000`5ec15160 4053            push    rbx
    0:000> r rdx=2
    0:000> g
    '\ud654'

The string is null-terminated, so can time_strftime simply substitute PyUnicode_DecodeLocale in place of PyUnicode_DecodeLocaleAndSize? |
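The length mismatch described in this message can be modeled in pure Python. The sketch below is neither CPython's nor the CRT's code: `decode_with_size` is a hypothetical stand-in that only mimics the `str[len] != '\0'` check, and it assumes Python's `cp949` codec agrees with Windows code page 949 for this character.

```python
def decode_with_size(buf, size, encoding="cp949"):
    """Hypothetical model of a size-checked decode.

    buf is a NUL-terminated byte string and size is the length the
    caller claims it has. The byte at buf[size] must be the NUL.
    """
    if buf[size] != 0:
        raise ValueError("embedded null byte")
    return buf[:size].decode(encoding)


wide = "\ud654"                 # weekday abbreviation from the session
encoded = wide.encode("cp949")  # multibyte form in code page 949
buf = encoded + b"\x00"         # C buffer contents, including the NUL

wide_len = len(wide)            # length in wide characters
byte_len = len(encoded)         # length in bytes, excluding the NUL

try:
    # Passing the wide-character length reproduces the error.
    decode_with_size(buf, wide_len)
except ValueError as exc:
    print("wide-character length:", exc)

# Passing the byte length decodes the full string.
print("byte length:", ascii(decode_with_size(buf, byte_len)))
```

The model shows why the error only appears for locales whose weekday names need more bytes than wide characters: for plain ASCII output such as `'Tue'`, the two lengths are equal and the check passes.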
|||
msg253020 - (view) | Author: Steve Dower (steve.dower) * | Date: 2015-10-14 21:29 | |
I can confirm that this is fixed in an upcoming Windows update:

    Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale, time
    >>> locale.setlocale(locale.LC_ALL, 'ko')
    'ko'
    >>> time.strftime('%a')
    '\uc218' |
|||
msg295208 - (view) | Author: Shaun Walbridge (scw) * | Date: 2017-06-05 19:56 | |
For reference, if anyone else still runs into this issue: the affected DLL is ucrtbase.dll, and the faulty version is 10.0.10240.0, which shipped with the 1507 release of Windows 10, the Windows 10 SDK, and Visual Studio 2015 RTM. The issue was resolved in the 1511 release (10.0.10586.212) and later, along with Visual Studio 2015 Update 3, which can be installed on Windows 10 via Windows Update.

On Windows 7 and 8.1, Windows Update may update the files, but you also need to check for any local copies of the DLL in the same directory as the Python executable, because on these platforms per-application installs take priority over the copy within the Windows installation. Currently, the Python distributed with Conda environments (where Py3.5+ is used) is affected by this issue [1] because of its app-local deployment of these DLLs on Windows 7/8.1. Any application that similarly bundles the UCRT DLLs alongside its runtime will also be affected.

[1] Conda issue filed at: https://github.com/ContinuumIO/anaconda-issues/issues/1974 |
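As a rough aid for the check described in this message, the sketch below compares a DLL version string against the fixed release and looks for an app-local `ucrtbase.dll` beside the interpreter. It does not read the version out of the file itself. `FIXED_VERSION`, `predates_fix`, and `find_app_local_ucrt` are illustrative names, and treating every version older than 10.0.10586.212 as suspect is an assumption, since the message names only 10.0.10240.0 as faulty.

```python
import sys
from pathlib import Path

# First version reported as fixed in the message above.
FIXED_VERSION = (10, 0, 10586, 212)


def parse_version(text):
    """Turn '10.0.10240.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def predates_fix(version_text):
    """Assumption: any version older than the fixed one is suspect."""
    return parse_version(version_text) < FIXED_VERSION


def find_app_local_ucrt(directory=None):
    """Return the path of ucrtbase.dll in directory, or None if absent.

    Defaults to the directory that holds the running interpreter.
    """
    if directory is None:
        directory = Path(sys.executable).parent
    candidate = Path(directory) / "ucrtbase.dll"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    print(predates_fix("10.0.10240.0"))
    print(find_app_local_ucrt())
```

If `find_app_local_ucrt()` returns a path, that copy's file version (shown in the file's properties on Windows) is the one to compare, since the message reports that an app-local copy takes priority on Windows 7 and 8.1.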
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:20 | admin | set | github: 69211 |
2017-06-05 19:56:59 | scw | set | nosy: + scw messages: + msg295208 |
2015-10-14 21:29:07 | steve.dower | set | status: open -> closed resolution: third party messages: + msg253020 stage: resolved |
2015-09-08 09:59:32 | serhiy.storchaka | set | nosy: + belopolsky, - serhiy.storchaka |
2015-09-08 09:55:47 | eryksun | set | nosy: + paul.moore, tim.golden, eryksun, zach.ware, steve.dower messages: + msg250176 components: + Windows |
2015-09-08 06:06:58 | serhiy.storchaka | set | nosy: + serhiy.storchaka, lemburg, loewis messages: + msg250160 components: + Library (Lib) type: crash -> behavior |
2015-09-08 05:32:07 | sy LEE | create |