On Windows, don't encode filenames in the import machinery #55828
Comments
With bpo-3080, Python 3.3 now manipulates module paths and names as Unicode in the import machinery. But in three remaining places, it still encodes filenames (to the ANSI code page):
a) _PyImport_LoadDynamicModule(): it should pass the PyObject* directly (instead of a char*) to _PyImport_GetDynLoadFunc(), but only on Windows (we may change the function name for Windows). _PyImport_GetDynLoadFunc() in dynload_win.c has to be patched to use the Unicode API (e.g. LoadLibraryEx => LoadLibraryExW).
b) write_compiled_module(): the problem is implementing open_exclusive() for Windows using Unicode. open_exclusive() uses open() on Windows, but open() expects the filename as a byte string. We may use _Py_fopen() (_wfopen), but this function has no option to open the file in exclusive mode (O_EXCL flag). GNU has an extension, the "x" flag in the file mode, but Windows doesn't support it. The file is passed to marshal functions like PyMarshal_WriteLongToFile(), so the file has to be a FILE*.
c) parse_source_module() => covered by the issue bpo-10785. |
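To make part (b) concrete, here is a minimal Python sketch of the exclusive-create pattern that open_exclusive() relies on: open a file descriptor with O_EXCL, then wrap it in a file object (the C analogue is open() followed by fdopen(), which yields the FILE* that the marshal functions need). The helper name `open_exclusive_sketch`, the `demo` function and the sample filename are hypothetical. This is not the CPython implementation, and it does not show the wide-character call a Windows C fix would presumably need.

```python
import os
import tempfile


def open_exclusive_sketch(path, mode=0o666):
    """Create `path` for binary writing, failing if it already exists.

    Hypothetical helper for illustration only.
    """
    flags = os.O_EXCL | os.O_CREAT | os.O_WRONLY
    # O_BINARY exists only on Windows; use 0 where it is not defined.
    flags |= getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, mode)
    # Wrap the descriptor in a file object (analogous to fdopen() in C).
    return os.fdopen(fd, "wb")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Non-ASCII filename passed as str; the caller never encodes it.
        path = os.path.join(tmp, "m\u00f3dulo.pyc")
        with open_exclusive_sketch(path) as f:
            f.write(b"\x00\x01")
        try:
            # The file now exists, so an exclusive create must fail.
            open_exclusive_sketch(path)
        except FileExistsError:
            second_attempt_failed = True
        else:
            second_attempt_failed = False
        with open(path, "rb") as f:
            data = f.read()
        return data, second_attempt_failed
```

At the Python level, the built-in open() has accepted an "x" mode for exclusive creation since Python 3.3, which covers the same need without going through os.open().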
open_exclusive() was created by: changeset: 14708:89b2aee43e0b |
dynload_win.patch: Fix part (a), _PyImport_LoadDynamicModule(). |
New changeset 1b7f484bab6e by Victor Stinner in branch 'default': |
New changeset e4e92d68ba3a by Victor Stinner in branch 'default': |
Issue bpo-10785 didn't change parse_source_module(): it still encodes the filename. We need Unicode versions of PyParser_ASTFromFile() and PyAST_Compile(): new versions of these functions that accept the filename as a Unicode string. For PyParser_ASTFromFile(): bpo-10785 prepared the work. For PyAST_Compile(): struct compiler stores the filename as a byte string; the filename should be stored as Unicode. |
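As a rough illustration of the behaviour a Unicode-aware compiler should expose, the Python sketch below passes a non-ASCII filename to the built-in compile() and checks that it comes back unchanged, both on the code object and on a SyntaxError. The two helper names and the sample path are hypothetical. The sketch only observes the Python-level result and says nothing about how struct compiler stores the filename internally.

```python
def compiled_filename(source, filename):
    """Return the filename recorded on the compiled code object."""
    code = compile(source, filename, "exec")
    return code.co_filename


def syntax_error_filename(filename):
    """Return the filename reported by a SyntaxError raised during compile()."""
    try:
        # Deliberately invalid source, so compilation fails.
        compile("def broken(:\n", filename, "exec")
    except SyntaxError as exc:
        return exc.filename
    return None


# Hypothetical path containing a non-ASCII character; the file need not exist.
SAMPLE = "C:\\Users\\Ren\u00e9e\\mod.py"
```

Both helpers are expected to return SAMPLE as a str, with no encoding round trip visible to the caller.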
compile_filename.patch:
The patch prepares the work to pass the filename to the compiler directly as Unicode. |
Another huge patch to support Unicode filenames: parser_unicode.patch
Doc/c-api/exceptions.rst | 26 +++++++++++---
It creates new versions of the following functions, which are undocumented:
We might remove these functions, but they are part of the public API, even though they are undocumented. |
The patch is really huge for such a very rare use case, so I prefer to close the issue as "won't fix". Common cases with non-ASCII names are already handled correctly in Python 3.3. |
Is there a chance this will be fixed at least in Python 4? |
This may be a small use case, but it is a use case nonetheless. In my situation, I am distributing a frozen Python package that runs under the user's home directory. If the user's name contains international characters, this will fail. I expect we will have similar problems with our application, which embeds Python and also runs from within the user directory... |
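A minimal sketch for checking whether a given interpreter handles this situation: it creates a directory with a non-ASCII name, writes a tiny module into it, and imports the module from there. The function name, the directory name `usuário` and the module name `hello_mod` are all hypothetical, and the sketch assumes the filesystem can represent the chosen name. It does not cover frozen or embedded setups.

```python
import importlib
import os
import sys
import tempfile


def import_from_non_ascii_dir(dirname="usu\u00e1rio", module_name="hello_mod"):
    """Import a throwaway module from a directory with a non-ASCII name."""
    with tempfile.TemporaryDirectory() as tmp:
        module_dir = os.path.join(tmp, dirname)
        os.mkdir(module_dir)
        source_path = os.path.join(module_dir, module_name + ".py")
        with open(source_path, "w", encoding="utf-8") as f:
            f.write("VALUE = 42\n")
        sys.path.insert(0, module_dir)
        try:
            # Make sure the import system notices the new directory.
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            return module.VALUE, module.__file__
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(module_dir)
            sys.modules.pop(module_name, None)
```

If the import machinery encodes the path somewhere along the way with a code page that cannot represent it, the import_module() call is where the failure would be expected to surface.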
I am reopening the issue because some users are now requesting this feature. I updated parser_unicode.patch to the latest Python version. The new patch has just one minor nit: test_symtable crashes :-D I will investigate the crash later. |
Fixed in new patch: parser_unicode-3.patch |
New changeset df2fdd42b375 by Victor Stinner in branch 'default': |