classification
Title: Crash on Windows if Python runs from a directory with umlauts
Type: crash
Stage:
Components: Interpreter Core
Versions: Python 3.0

process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: amaury.forgeotdarc
Nosy List: amaury.forgeotdarc, christian.heimes, gvanrossum, loewis, nnorwitz, pitrou, rasky
Priority: critical
Keywords: patch

Created on 2007-10-27 03:40 by christian.heimes, last changed 2008-06-11 18:26 by amaury.forgeotdarc. This issue is now closed.

Files
File name | Uploaded | Description
py3k_more_win_fsencoding.patch | christian.heimes, 2007-10-27 14:03 |
py3k_win_nonascii.patch | christian.heimes, 2007-11-13 15:37 |
win_nonascii.patch | amaury.forgeotdarc, 2008-06-10 20:04 | decode paths with FileSystemDefaultEncoding
Messages (17)
msg56841 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-10-27 03:40
Python 3.0 doesn't run from a directory with umlauts and possibly other
non-ASCII chars.

I renamed my development folder from C:\dev\ to c:\test äöüß name\.
Python crashes after a few moments before it can reach its shell.

 	python30.dll!PyErr_SetObject(_object * exception=0x1e1b9888, _object * value=0x00a0b8a0)  Line 56 + 0xb bytes	C
 	python30.dll!PyErr_SetString(_object * exception=0x1e1b9888, const char * string=0x1e18c358)  Line 77 + 0xd bytes	C
 	python30.dll!find_module(char * fullname=0x0021fcc0, char * subname=0x00000000, _object * path=0x00000000, char * buf=0x0021fb70, unsigned int buflen=257, _iobuf * * p_fp=0x0021fb64, _object * * p_loader=0x0021fb68)  Line 1228 + 0x10 bytes	C
 	python30.dll!import_submodule(_object * mod=0x1e1c6a88, char * subname=0x0021fcc0, char * fullname=0x00000000)  Line 2313 + 0x27 bytes	C
 	python30.dll!load_next(_object * mod=0x1e1c6a88, _object * altmod=0x1e1c6a88, char * * p_name=0x00000000, char * buf=0x0021fcc0, int * p_buflen=0x0021fcbc)  Line 2127 + 0x15 bytes	C
 	python30.dll!import_module_level(char * name=0x00000000, _object * globals=0x00000000, _object * locals=0x1e069ec3, _object * fromlist=0x00000000, int level=0)  Line 1908 + 0x1a bytes	C
 	python30.dll!PyImport_ImportModuleLevel(char * name=0x1e184b04, _object * globals=0x00000000, _object * locals=0x00000000, _object * fromlist=0x00000000, int level=0)  Line 1979 + 0x18 bytes	C
 	python30.dll!_PyCodecRegistry_Init()  Line 841 + 0x12 bytes	C
 	python30.dll!PyCodec_LookupError(const char * name=0x00000000)  Line 436 + 0xc bytes	C
 	python30.dll!unicode_decode_call_errorhandler(const char * errors=0x00000000, _object * * errorHandler=0x00000009, const char * encoding=0x1e1979ec, const char * reason=0x00000000, const char * * input=0x0021fe80, const char * * inend=0x0021fe84, int * startinpos=0x0021fe6c, int * endinpos=0x0021fe68, _object * * exceptionObject=0x00000000, const char * * inptr=0x0021fe90, _object * * output=0x0021fe70, int * outpos=0x0021fe88, unsigned short * * outptr=0x0021fe74)  Line 1384 + 0xa bytes	C
 	python30.dll!PyUnicodeUCS2_DecodeUTF8Stateful(const char * s=0x1e1dd010, int size=48, const char * errors=0x00000000, int * consumed=0x00000000)  Line 1967 + 0x47 bytes	C
 	python30.dll!PyUnicodeUCS2_FromStringAndSize(const char * u=0x1e1dd008, int size=48)  Line 464 + 0xb bytes	C
 	python30.dll!PyUnicodeUCS2_FromString(const char * u=0x1e1dd008)  Line 482 + 0x7 bytes	C
 	python30.dll!_PySys_Init()  Line 1084 + 0xb bytes	C
 	python30.dll!Py_InitializeEx(int install_sigs=1)  Line 220	C
 	python30.dll!Py_Initialize()  Line 292 + 0x7 bytes	C
 	python30.dll!Py_Main(int argc=2, char * * argv=0x00000001)  Line 432	C
>	python.exe!mainCRTStartup()  Line 398 + 0xe bytes	C
 	kernel32.dll!7c816fd7()
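
The trace shows _PySys_Init() calling PyUnicode_FromString(), which decodes its input as UTF-8, while a path obtained through the ANSI Windows API is in the system code page. A minimal Python sketch of that mismatch follows. It assumes cp1252 as the ANSI code page (the report does not name the code page), and try_decode_utf8 is a hypothetical helper, not a CPython function.

```python
# Assumption: the ANSI code page is cp1252; the report does not state it.
ANSI_CODE_PAGE = "cp1252"


def try_decode_utf8(raw: bytes):
    """Hypothetical helper: return the decoded text, or None if raw is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# The directory name from the report, as bytes in the assumed ANSI code page.
raw_path = "c:\\test äöüß name\\".encode(ANSI_CODE_PAGE)

as_utf8 = try_decode_utf8(raw_path)          # None: the umlaut bytes are not valid UTF-8
as_ansi = raw_path.decode(ANSI_CODE_PAGE)    # round-trips to the original name
```

In the interpreter, per the trace, the failed decode goes on to look up a codec error handler, which triggers an import while sys is still being initialized.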
msg56852 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-10-27 14:03
The patch fixes parts of the problem. At least Python doesn't crash any
more when run from a directory with non-ASCII chars. It just fails with
an import error in initstdio().
msg56853 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-10-27 14:13
I've added a fprintf(stderr, "%s", path) to makepathobject(). I suspect
that PC/getpathp.c doesn't handle non-ASCII chars correctly. It's using
char instead of wchar_t all over the place. Could that be related to the
issue, Neal?

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

c:\testäöü\PCBuild8\win32release>set PYTHONPATH=c:\testäöü\Lib

c:\testäöü\PCBuild8\win32release>python
c:\testõ÷³\Lib;c:\testõ÷³\PCBuild8\win32release\python30.zip;c:\testõ÷³\DLLs;c:\
testõ÷³\lib;c:\testõ÷³\lib\plat-win;c:\testõ÷³\lib\lib-tk;c:\testõ÷³\PCBuild8\wi
n32releaseFatal Python error: Py_Initialize: can't initialize sys
standard strea
ms
object  : ImportError('No module named encodings.utf_8',)
type    : ImportError
refcount: 4
address : 00A43540
lost sys.stderr

This application has requested the Runtime to terminate it in an unusual
way.
Please contact the application's support team for more information.
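
The õ÷³ in the output above is consistent with path bytes written in the ANSI code page and rendered by a console that uses the OEM code page. A small sketch, assuming cp1252 and cp850 respectively (common on a Western European Windows XP, but not confirmed in this report):

```python
# Assumptions: ANSI code page cp1252, console (OEM) code page cp850.
typed = "äöü"

ansi_bytes = typed.encode("cp1252")    # the bytes fprintf() would write
shown = ansi_bytes.decode("cp850")     # how a cp850 console renders those bytes
```

Under these assumptions shown is "õ÷³", which would mean the path bytes themselves are intact and only the console display is off.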
msg56865 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-10-27 22:09
The bug is related to http://bugs.python.org/issue1262
msg56979 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-10-30 19:00
Hi Martin!

Thomas Wouters said on #python that you have the Windows Fu to fix the
problem. Parts of the Python API for file paths, sys.path and os.environ
have to be reimplemented using the wide char API.
msg57094 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-11-04 11:51
I've checked in part of the patch in r58837. It doesn't solve the
problem but at least it prevents Python from segfaulting on Windows.
msg57456 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-11-13 15:37
I'd like to move _PyExc_Init() before _PySys_Init() and set sys.prefix,
exec_prefix and executable with PyUnicode_DecodeFSDefault().

Without the changes Python segfaults on Windows when the path
contains non-ASCII chars. With the patch it fails with a fatal
error, which is a tiny bit nicer.
msg57466 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-11-13 18:37
If this doesn't cause any problems on other platforms, go for it.
msg57474 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2007-11-14 00:35
I'm setting the priority to normal. The issue isn't resolved but it's
not critical for the next alpha release. By the way, what's your ETA for
the next alpha, Guido?
msg57681 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2007-11-20 01:31
Assigning to myself.
Among the things to do: use Py_FileSystemDefaultEncoding (=mbcs on
Windows) to encode sys.path items; likewise in NullImporter_init and
other functions.
There are so many places to change that we need serious test cases.
msg57687 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2007-11-20 03:04
Please don't use the FileSystemEncoding on Windows for sys.path items.
Instead, it should use the wide API to perform all system calls. Py3k
shouldn't ever use the file system encoding for anything on Windows.
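
One likely motivation, offered here as an assumption rather than something spelled out in this message: the ANSI code page can represent only a subset of the names a Unicode file system allows, so a path forced through it may not survive. A sketch using cp1252 as a stand-in for mbcs (the mbcs codec itself is available only on Windows); fits_code_page is a hypothetical helper:

```python
def fits_code_page(path: str, code_page: str = "cp1252") -> bool:
    """Hypothetical helper: True if every character of path is encodable in code_page."""
    try:
        path.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


western = fits_code_page("c:\\test äöüß name")   # umlauts and ß exist in cp1252
cyrillic = fits_code_page("c:\\тест")            # Cyrillic letters do not
```

Under this assumption the first path is representable and the second is not, which is the kind of loss that calling the wide API directly avoids.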
msg57698 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2007-11-20 09:27
Agreed. I will try to stay with PyObject* values until a system call really needs something else.
msg66441 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2008-05-08 20:38
I'm increasing the severity of the bug. It's still a major
showstopper for non-English Windows users. For example, see #2780.
msg66460 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-05-09 08:55
There are some problems under Linux too:

$ pwd
/home/antoine/py3k/héhé
$ ./python
Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback (most recent call last):
  File "/home/antoine/py3k/pristine/Lib/encodings/__init__.py", line 32,
in <module>
TypeError: zipimporter() argument 1 must be string without null bytes,
not str
Aborted
msg66462 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-05-09 10:05
See #2798 for the non-Windows case, with a patch.
msg67915 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-06-10 20:04
Here is a quick fix that decodes filenames using
Py_FileSystemDefaultEncoding, to let the release pass.

I am still working on a version that keeps PyObject* values as long as
possible, but it will be a major change.
msg68006 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-06-11 18:26
Fixed in r64126, using Py_FileSystemDefaultEncoding.

I am closing this issue and have opened issue3080 to rewrite all functions in
import.c with full Unicode in mind.
History
Date | User | Action | Args
2008-06-11 18:26:49 | amaury.forgeotdarc | set | status: open -> closed; messages: + msg68006
2008-06-10 20:04:48 | amaury.forgeotdarc | set | priority: release blocker -> critical; keywords: + patch; messages: + msg67915; files: + win_nonascii.patch
2008-05-09 10:05:47 | pitrou | set | messages: + msg66462
2008-05-09 08:55:43 | pitrou | set | nosy: + pitrou; messages: + msg66460
2008-05-08 20:38:36 | christian.heimes | link | issue2780 superseder
2008-05-08 20:38:05 | christian.heimes | set | priority: normal -> release blocker; messages: + msg66441
2008-01-26 12:57:35 | rasky | set | nosy: + rasky
2008-01-06 22:29:45 | admin | set | keywords: - py3k; versions: Python 3.0
2007-11-25 00:27:50 | georg.brandl | set | keywords: - rfe
2007-11-20 09:27:52 | amaury.forgeotdarc | set | messages: + msg57698
2007-11-20 03:04:32 | loewis | set | messages: + msg57687
2007-11-20 01:31:56 | amaury.forgeotdarc | set | assignee: amaury.forgeotdarc; messages: + msg57681; nosy: + amaury.forgeotdarc
2007-11-14 00:35:32 | christian.heimes | set | priority: high -> normal; messages: + msg57474
2007-11-13 18:37:26 | gvanrossum | set | messages: + msg57466
2007-11-13 15:37:47 | christian.heimes | set | files: + py3k_win_nonascii.patch; messages: + msg57456
2007-11-04 11:51:54 | christian.heimes | set | priority: high; resolution: accepted; messages: + msg57094; keywords: + py3k, rfe
2007-11-04 11:20:14 | christian.heimes | link | issue1377 superseder
2007-10-30 19:00:57 | christian.heimes | set | nosy: + loewis; messages: + msg56979
2007-10-27 22:09:42 | christian.heimes | set | nosy: + gvanrossum; messages: + msg56865; severity: normal -> urgent
2007-10-27 14:13:22 | christian.heimes | set | nosy: + nnorwitz; messages: + msg56853
2007-10-27 14:03:17 | christian.heimes | set | files: + py3k_more_win_fsencoding.patch; messages: + msg56852
2007-10-27 03:40:35 | christian.heimes | create |