classification
Title: Python 3.6 cannot reopen .pyc file with non-ASCII path
Type: behavior
Stage: patch review
Components: Interpreter Core, Unicode, Windows
Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Tianjg, ZackerySpytz, eryksun, ezio.melotti, paul.moore, steve.dower, tianjg, tim.golden, vstinner, zach.ware
Priority: normal
Keywords: 3.6regression, patch

Created on 2017-12-20 06:07 by tianjg, last changed 2019-07-14 09:06 by vstinner.

Files
File name Uploaded Description
20171218111240.jpg tianjg, 2017-12-20 06:07
Pull Requests
URL Status Linked
PR 14699 open ZackerySpytz, 2019-07-11 04:52
Messages (7)
msg308705 - Author: (tianjg) Date: 2017-12-20 06:07
I have a problem: Python 3.6 cannot reopen a .pyc file with a Chinese path, while Python 3.5 can reopen the same .pyc file, as shown in the picture.
msg308707 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-12-20 07:35
run_file encodes the file path via PyUnicode_EncodeFSDefault, which encodes as UTF-8 in Windows, starting with 3.6. PyRun_SimpleFileExFlags subsequently tries to open this encoded path via _Py_fopen, which calls fopen. The CRT expects an ANSI-encoded path, so only the common ASCII subset will work. Non-ASCII paths will fail.

This could be addressed in _Py_fopen by decoding the path and calling _wfopen instead of fopen. 

Executing a .pyc also fails in 3.5 if the wide-character path can't be encoded as ANSI, but the 3.5 branch only accepts security fixes.
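
The encoding mismatch described above can be simulated in plain Python. This is only an illustrative sketch, not the CPython code path: the path is hypothetical, and cp936 is assumed as an example ANSI code page (the real one depends on the system locale).

```python
# Illustrative simulation of the mismatch; it does not call the CRT.
# Assumptions: a hypothetical path, and cp936 standing in for the ANSI code page.
path = "C:\\测试\\demo.pyc"

# On Windows, 3.6+ encodes the path as UTF-8 before it reaches fopen().
utf8_bytes = path.encode("utf-8")

# fopen() interprets those bytes in the ANSI code page, so the name is misread.
misread = utf8_bytes.decode("cp936", errors="replace")
non_ascii_survives = misread == path

# A pure ASCII path has the same bytes in both encodings, so it still works.
ascii_path = "C:\\test\\demo.pyc"
ascii_survives = ascii_path.encode("utf-8").decode("cp936") == ascii_path

# Decoding the bytes as UTF-8 first (the idea behind calling _wfopen with a
# decoded path) recovers the original name.
recovered = utf8_bytes.decode("utf-8")

print(non_ascii_survives, ascii_survives, recovered == path)
```

The last step mirrors the proposed fix only in spirit: the bytes are decoded back to a wide string before the file is opened, so no ANSI interpretation takes place.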
msg308709 - Author: (Tianjg) Date: 2017-12-20 09:53
Thanks a lot. What should I do to reopen a .pyc file with a non-ASCII path using Python 3.6 in cmd? Could you give me some code examples? Thank you again; I look forward to hearing from you.
msg308723 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-12-20 11:31
Workarounds: (1) Force 3.6 to use the legacy ANSI filesystem encoding by setting the environment variable PYTHONLEGACYWINDOWSFSENCODING. (2) Use 8.3 DOS names, if creating them is enabled on the volume. You can list them in CMD via `dir /x`. (3) Create alternative directory symbolic links or junctions with ASCII names via CMD's `mklink` command.
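
As a rough sketch of workaround (1), the variable can be set for a child interpreter only, leaving the current environment untouched. The helper names `legacy_fs_env` and `run_pyc` are hypothetical and not part of any Python API. This assumes the path can be encoded in the ANSI code page; otherwise the legacy encoding does not help either.

```python
import os
import subprocess
import sys


def legacy_fs_env(base_env=None):
    """Return a copy of the environment with the legacy ANSI
    filesystem encoding enabled (only meaningful on Windows, 3.6+)."""
    env = dict(os.environ if base_env is None else base_env)
    # Any non-empty value enables the legacy behavior.
    env["PYTHONLEGACYWINDOWSFSENCODING"] = "1"
    return env


def run_pyc(pyc_path, python_exe=sys.executable):
    """Run a .pyc file in a child interpreter that uses the legacy encoding."""
    return subprocess.run([python_exe, pyc_path], env=legacy_fs_env())


# Hypothetical usage on Windows:
#     run_pyc("C:\\测试\\demo.pyc")
```

In CMD the equivalent is to run `set PYTHONLEGACYWINDOWSFSENCODING=1` before invoking `python` with the .pyc path.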
msg308831 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-12-20 22:35
run_file() gets a wchar_t* string which comes from wmain() argv.

run_file() encodes the wchar_t* using PyUnicode_EncodeFSDefault().

Later, PyRun_SimpleFileExFlags() indirectly calls fopen() with the encoded filename.

> This could be addressed in _Py_fopen by decoding the path and calling _wfopen instead of fopen. 

I agree that it's the correct fix.

I would make _Py_fopen() more compatible with the PEP 529.
msg308832 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-12-20 22:35
> I would make _Py_fopen() more compatible with the PEP 529.

Typo: It* would
msg347896 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-14 09:06
Hmm. In fact, this problem can be fixed differently: modify the PyRun_xxx() functions to pass the filename as a Unicode string. Maybe pass it as a wchar_t* string or even a Python str object.
History
Date User Action Args
2019-07-14 09:06:09 vstinner set messages: + msg347896
2019-07-11 06:24:20 ZackerySpytz set nosy: + ZackerySpytz; versions: + Python 3.8, Python 3.9, - Python 3.6
2019-07-11 04:52:57 ZackerySpytz set keywords: + patch; stage: test needed -> patch review; pull_requests: + pull_request14498
2017-12-20 22:35:31 vstinner set messages: + msg308832
2017-12-20 22:35:02 vstinner set messages: + msg308831
2017-12-20 14:24:03 r.david.murray set keywords: + 3.6regression
2017-12-20 11:31:49 eryksun set messages: + msg308723
2017-12-20 09:53:43 Tianjg set nosy: + Tianjg; messages: + msg308709
2017-12-20 07:35:55 eryksun set type: compile error -> behavior; title: python3.6 can not reopen .pyc file with Chinese path -> Python 3.6 cannot reopen .pyc file with non-ASCII path; components: + Interpreter Core, Unicode; nosy: + ezio.melotti, eryksun, vstinner; versions: + Python 3.7; messages: + msg308707; stage: -> test needed
2017-12-20 06:07:24 tianjg create