3.0.1 crashes in unicode path #49523

Closed
miwa mannequin opened this issue Feb 15, 2009 · 9 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) OS-windows topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@miwa
Mannequin

miwa mannequin commented Feb 15, 2009

BPO 5273
Nosy @pitrou
Files
  • fix_import.patch: for py3k branch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2009-03-04.01:58:45.260>
    created_at = <Date 2009-02-15.11:26:12.722>
    labels = ['interpreter-core', 'expert-unicode', 'OS-windows', 'type-crash']
    title = '3.0.1 crashes in unicode path'
    updated_at = <Date 2010-04-27.20:32:23.571>
    user = 'https://bugs.python.org/miwa'

    bugs.python.org fields:

    activity = <Date 2010-04-27.20:32:23.571>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2009-03-04.01:58:45.260>
    closer = 'ocean-city'
    components = ['Interpreter Core', 'Unicode', 'Windows']
    creation = <Date 2009-02-15.11:26:12.722>
    creator = 'miwa'
    dependencies = []
    files = ['13101']
    hgrepos = []
    issue_num = 5273
    keywords = ['patch']
    message_count = 9.0
    messages = ['82150', '82152', '82153', '82154', '82159', '82160', '82164', '83110', '83113']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'ocean-city', 'miwa']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue5273'
    versions = ['Python 3.0', 'Python 3.1']

    @miwa
    Mannequin Author

    miwa mannequin commented Feb 15, 2009

    Python 3.0.1 crashes when importing a compiled module from a path that contains non-ASCII characters.
    This does not happen on Python 3.0; it is new in 3.0.1.

    Detailed Situation:
    OS: win2000
    The current directory's pathname contains Japanese characters.
    ./a.py contains only the statement "import b".
    ./b.py is empty.
    > python a.py
    (nothing happens, but b.pyc is created)
    > python a.py
    Traceback (most recent call last):
      File "a.py", line 1, in <module>
        import b
    UnicodeDecodeError: 'utf8' codec can't decode byte 0x82 in position 3:
    unexpected code byte
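
The reproduction steps above can be sketched as a small script. This is only an illustration: the directory name `日本語` is a hypothetical stand-in for the reporter's actual path, which is not given, and `run_twice_in_unicode_dir` is a name invented here. On an interpreter that contains the fix, both runs should exit cleanly.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_twice_in_unicode_dir(dirname="日本語"):
    """Run a.py twice from a directory whose name is non-ASCII.

    Returns the exit codes of both runs. The directory name is a
    hypothetical stand-in for the path in the original report.
    """
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp) / dirname
        workdir.mkdir()
        (workdir / "a.py").write_text("import b\n", encoding="utf-8")
        (workdir / "b.py").write_text("", encoding="utf-8")
        codes = []
        for _ in range(2):
            # The first run compiles b.py; the second run can load the
            # cached bytecode, which is the step that failed in 3.0.1.
            result = subprocess.run([sys.executable, "a.py"], cwd=str(workdir))
            codes.append(result.returncode)
        return codes


if __name__ == "__main__":
    print(run_twice_in_unicode_dir())
```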

    @miwa miwa mannequin added OS-windows type-crash A hard crash of the interpreter, possibly with a core dump labels Feb 15, 2009
    @ocean-city
    Mannequin

    ocean-city mannequin commented Feb 15, 2009

    Quick observation. This bug was introduced in r68363.

    import.c(994)
    	newname = PyUnicode_FromString(pathname);

    pathname is mbcs-encoded on Windows, but PyUnicode_FromString assumes it is UTF-8.
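
The mismatch can be illustrated from Python itself. This is a minimal sketch, assuming the reporter's "mbcs" was code page 932 (the usual ANSI code page on Japanese Windows) and using a made-up path; neither detail is confirmed in the report. It happens to fail on the same byte and position as the traceback above, because hiragana in cp932 start with the byte 0x82.

```python
def decode_as_utf8(raw):
    """Try to decode bytes as UTF-8; return the error instead of raising."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


# Hypothetical path: "C:\" followed by a single hiragana character.
sample_path = "C:\\あ"

# What a cp932 ("mbcs" on Japanese Windows, by assumption) char* would hold.
mbcs_bytes = sample_path.encode("cp932")

# Interpreting those bytes as UTF-8 fails, as PyUnicode_FromString would.
result = decode_as_utf8(mbcs_bytes)

if __name__ == "__main__":
    print(mbcs_bytes)
    print(result)
```

Decoding the same bytes with the encoding they were produced in (`mbcs_bytes.decode("cp932")`) returns the original path.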

    @ocean-city
    Mannequin

    ocean-city mannequin commented Feb 15, 2009

    Here is a patch.

    @ocean-city ocean-city mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode labels Feb 15, 2009
    @pitrou
    Member

    pitrou commented Feb 15, 2009

    Gasp. Sorry for the bug.
    Should PyUnicode_CompareWithASCIIString() be replaced with something
    else as well?

    @ocean-city
    Mannequin

    ocean-city mannequin commented Feb 15, 2009

    I'm not sure. Even my patch might not be correct anyway.

    In my VC6 debugger, the pathname argument of
    update_compiled_module(PyCodeObject *co, char *pathname)
    is definitely mbcs.

    But its caller, load_source_module, calls

    	if (fstat(fileno(fp), &st) != 0) {
    		PyErr_Format(PyExc_RuntimeError,
    			     "unable to get file status from '%s'",
    			     pathname);
    		return NULL;
    	}

    I've looked into the PyErr_Format code, and it seems %s assumes UTF-8. Anyway,
    it's difficult to know whether a char* is UTF-8 or the filesystem encoding. :-(
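
A short sketch of why the encoding of a bare byte string cannot be recovered by inspection. The sample text and the choice of cp932 are assumptions made for illustration only. The same two bytes decode without error under both encodings but produce different text, so nothing in the bytes reveals which interpretation was intended.

```python
# Two bytes that are valid in both UTF-8 and cp932.
utf8_bytes = "é".encode("utf-8")  # b'\xc3\xa9'

# Decoded with the encoding they were written in.
as_utf8 = utf8_bytes.decode("utf-8")

# Decoded as cp932: no error is raised, but the text is different.
as_cp932 = utf8_bytes.decode("cp932")

if __name__ == "__main__":
    print(repr(as_utf8), repr(as_cp932))
```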

    @ocean-city
    Mannequin

    ocean-city mannequin commented Feb 15, 2009

    I tracked this down and found that the mbcs path is set in find_module,
    Python/import.c(1394).

    	if (PyUnicode_Check(v)) {
    		v = PyUnicode_AsEncodedString(v, 
    		    Py_FileSystemDefaultEncoding, NULL);
    		if (v == NULL)
    			return NULL;
    	}

    This was introduced in r64126 to fix the segfault mentioned in
    bpo-1342. I don't understand why the segfault happened, but I feel this
    issue is part of a bigger problem (bpo-3080).
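
The encoding step in find_module has a rough Python-level analogue, sketched below. The helper names `to_fs_bytes` and `from_fs_bytes` are invented for this illustration, and cp932 is passed explicitly as an assumed filesystem encoding so the example behaves the same on any platform. The point is that bytes produced this way only round-trip if they are decoded with the same encoding later.

```python
import sys


def to_fs_bytes(path, encoding=None):
    """Encode a str path, defaulting to the interpreter's filesystem encoding."""
    return path.encode(encoding or sys.getfilesystemencoding())


def from_fs_bytes(raw, encoding=None):
    """Decode path bytes with the same encoding that produced them."""
    return raw.decode(encoding or sys.getfilesystemencoding())


# Hypothetical path; cp932 is assumed to be the filesystem encoding.
path = "C:\\あ\\b.py"
encoded = to_fs_bytes(path, "cp932")
restored = from_fs_bytes(encoded, "cp932")

if __name__ == "__main__":
    print(sys.getfilesystemencoding(), encoded, restored)
```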

    @ocean-city
    Mannequin

    ocean-city mannequin commented Feb 15, 2009

    > Should PyUnicode_CompareWithASCIIString() be replaced with something
    > else as well?

    I hope the revised patch fixes this too. There seems to be no function to
    compare a unicode object with a filesystem-encoded string, so I moved the
    unicode object creation before the comparison. This might increase overhead a bit.

    bpo-3080 is a big issue, so this is a minimal solution for this one. I
    confirmed that test_import.py passes.
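
The idea of creating the unicode object first and comparing afterwards can be sketched in Python. This is not the patch itself, which is C code in import.c and is not reproduced in this thread; `same_module_path` is a hypothetical helper, and cp932 is an assumed filesystem encoding.

```python
def same_module_path(stored_name, pathname_bytes, fs_encoding):
    """Compare a str with filesystem-encoded bytes by decoding the bytes first.

    Decoding costs an extra object creation on every call, which is the
    small overhead mentioned in the comment above.
    """
    return stored_name == pathname_bytes.decode(fs_encoding)


# Hypothetical values standing in for a code object's filename and the
# char* pathname handed to the import machinery.
stored = "C:\\あ\\b.py"
raw_pathname = stored.encode("cp932")

if __name__ == "__main__":
    print(same_module_path(stored, raw_pathname, "cp932"))
```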

    @ocean-city ocean-city mannequin added the release-blocker label Mar 3, 2009
    @pitrou
    Member

    pitrou commented Mar 3, 2009

    I cannot say anything except that the patch looks ok. If it doesn't make
    anything worse and solves the present problem, I guess you can commit it.

    @ocean-city
    Mannequin

    ocean-city mannequin commented Mar 4, 2009

    Thanks, fixed in r70157 (py3k) and r70158 (release30-maint).

    @ocean-city ocean-city mannequin removed the release-blocker label Mar 4, 2009
    @ocean-city ocean-city mannequin closed this as completed Mar 4, 2009
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022