classification
Title: 3.0.1 crashes in unicode path
Type: crash
Stage:
Components: Interpreter Core, Unicode, Windows
Versions: Python 3.0, Python 3.1
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: miwa, ocean-city, pitrou
Priority: normal
Keywords: patch

Created on 2009-02-15 11:26 by miwa, last changed 2010-04-27 20:32 by loewis. This issue is now closed.

Files
File name Uploaded Description
fix_import.patch ocean-city, 2009-02-15 18:26 for py3k branch
Messages (9)
msg82150 - Author: Musashi Tamura (miwa) Date: 2009-02-15 11:26
Python 3.0.1 crashes when importing a compiled module from a path that
contains non-ASCII (Unicode) characters.
This does not happen on Python 3.0; it is new in 3.0.1.

Detailed Situation:
OS: win2000
The current pathname contains Japanese characters.
./a.py contains only a statement "import b".
./b.py is empty.
> python a.py
(nothing happens, but b.pyc is created)
> python a.py
Traceback (most recent call last):
  File "a.py", line 1, in <module>
    import b
UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 3:
unexpected code byte
msg82152 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-02-15 13:54
Quick observation: this bug was introduced in r68363.

import.c(994)
	newname = PyUnicode_FromString(pathname);

pathname is mbcs-encoded on Windows, but PyUnicode_FromString assumes it is UTF-8.
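
The mismatch can be reproduced outside the interpreter. The following is a minimal Python sketch, assuming a Japanese Windows system where mbcs corresponds to code page 932. The sample path is made up for illustration and is not the reporter's actual directory.

```python
# Hypothetical path with one Japanese character. On a Japanese Windows
# system "mbcs" corresponds to code page 932 (an assumption of this sketch).
path = "C:\\あ"
mbcs_bytes = path.encode("cp932")   # what the char* pathname would hold

# Decoding those bytes as UTF-8 fails, which is what happens when
# mbcs data is handed to a function that expects UTF-8.
try:
    mbcs_bytes.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the encoding that produced the bytes round-trips cleanly.
decoded = mbcs_bytes.decode("cp932")
```

Here the first byte of the Japanese character is not a valid UTF-8 start byte, so the UTF-8 decode is rejected, while the cp932 decode recovers the original string.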
msg82153 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-02-15 14:03
Here is a patch.
msg82154 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-02-15 14:21
Gasp. Sorry for the bug.
Should PyUnicode_CompareWithASCIIString() be replaced with something
else as well?
msg82159 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-02-15 15:36
I'm not sure. Even my patch might not be correct anyway.

In my VC6 debugger, pathname in
update_compiled_module(PyCodeObject *co, char *pathname)
is definitely mbcs.

But its caller, load_source_module, calls

	if (fstat(fileno(fp), &st) != 0) {
		PyErr_Format(PyExc_RuntimeError,
			     "unable to get file status from '%s'",
			     pathname);
		return NULL;
	}

I looked into the PyErr_Format code, and it seems %s assumes UTF-8. Anyway,
it's difficult to know whether a char* is UTF-8 or in the filesystem
encoding. :-(
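
One reason this kind of confusion is hard to spot: for pure-ASCII paths the two encodings produce identical bytes, so a wrong assumption stays hidden until a non-ASCII path appears. A small Python sketch, assuming cp932 as the filesystem encoding and using made-up paths (the helper name `encodings_agree` is hypothetical):

```python
def encodings_agree(path, fs_encoding="cp932"):
    """Return True if `path` has the same bytes in UTF-8 and fs_encoding."""
    return path.encode("utf-8") == path.encode(fs_encoding)

ascii_path = "C:\\work\\a.py"       # hypothetical ASCII-only path
japanese_path = "C:\\作業\\a.py"    # hypothetical path with Japanese characters

ascii_same = encodings_agree(ascii_path)        # identical bytes
japanese_same = encodings_agree(japanese_path)  # different bytes
```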
msg82160 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-02-15 16:29
I tracked it down and found that this mbcs path is set in find_module,
Python/import.c(1394).

	if (PyUnicode_Check(v)) {
		v = PyUnicode_AsEncodedString(v, 
		    Py_FileSystemDefaultEncoding, NULL);
		if (v == NULL)
			return NULL;
	}

This was introduced in r64126 to fix the segfault mentioned in
issue1342. I don't understand why the segfault happened, but I feel this
issue is part of a bigger problem (issue3080).
msg82164 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-02-15 18:26
>Should PyUnicode_CompareWithASCIIString() be replaced with something
>else as well?

I hope the revised patch will fix this too. There seems to be no function to
compare a unicode object with a filesystem-encoded string, so I moved the
unicode creation before the comparison. This might increase overhead a bit.

Issue3080 is a big issue, so this is a minimal solution for this one. I
confirmed that test_import.py passes.
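
The idea of creating the unicode object before comparing can be sketched in Python as follows. This only illustrates the approach described in this message and is not the contents of fix_import.patch; the function name `filename_needs_update`, the sample paths, and the cp932 default are assumptions.

```python
def filename_needs_update(co_filename, pathname, fs_encoding="cp932"):
    """Decide whether a code object's filename differs from the load path.

    co_filename: str, as stored in the code object.
    pathname:    bytes in the filesystem encoding, like the C char*.
    Returns (needs_update, newname).
    """
    # Decode first, using the encoding the bytes were produced with...
    newname = pathname.decode(fs_encoding)
    # ...then compare unicode with unicode instead of unicode with raw bytes.
    return co_filename != newname, newname

stored = "C:\\あ\\b.py"                       # hypothetical stored filename
same_path = stored.encode("cp932")            # module loaded from same place
moved_path = "C:\\い\\b.py".encode("cp932")   # module loaded from elsewhere

unchanged = filename_needs_update(stored, same_path)
changed = filename_needs_update(stored, moved_path)
```

The decode now happens on every call, even when the names turn out to be equal, which corresponds to the small overhead mentioned in this message.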
msg83110 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-03-03 23:53
I cannot say anything except that the patch looks ok. If it doesn't make
anything worse and solves the present problem, I guess you can commit it.
msg83113 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-03-04 01:58
Thanks, fixed in r70157 (py3k) and r70158 (release30-maint).
History
Date User Action Args
2010-04-27 20:32:23 loewis set priority: normal
2009-03-05 13:47:12 ocean-city link issue5422 superseder
2009-03-05 13:47:12 ocean-city unlink issue5422 dependencies
2009-03-05 13:46:30 ocean-city link issue5422 dependencies
2009-03-04 01:58:45 ocean-city set status: open -> closed
    priority: release blocker -> (no value)
    messages: + msg83113
    resolution: fixed
2009-03-03 23:53:04 pitrou set messages: + msg83110
2009-03-03 23:21:49 ocean-city set priority: release blocker
2009-02-15 18:26:53 ocean-city set files: - fix_import.patch
2009-02-15 18:26:41 ocean-city set files: + fix_import.patch
    dependencies: - Full unicode import system
    messages: + msg82164
2009-02-15 16:29:14 ocean-city set dependencies: + Full unicode import system
    messages: + msg82160
2009-02-15 15:36:07 ocean-city set messages: + msg82159
2009-02-15 14:21:10 pitrou set nosy: + pitrou
    messages: + msg82154
2009-02-15 14:03:59 ocean-city set files: + fix_import.patch
    keywords: + patch
    messages: + msg82153
    components: + Interpreter Core, Unicode
    versions: + Python 3.1
2009-02-15 13:54:17 ocean-city set nosy: + ocean-city
    messages: + msg82152
2009-02-15 11:26:12 miwa create