python -v crashes in nonencodable directory #69369

Closed
serhiy-storchaka opened this issue Sep 19, 2015 · 12 comments

@serhiy-storchaka
Member

BPO 25182
Nosy @brettcannon, @ncoghlan, @vstinner, @ericsnowcurrently, @serhiy-storchaka
Files
  • stdprinter_backslashreplace.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-09-30.12:54:21.837>
    created_at = <Date 2015-09-19.20:48:45.982>
    labels = ['interpreter-core', 'type-crash']
    title = 'python -v crashes in nonencodable directory'
    updated_at = <Date 2015-09-30.17:45:29.583>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2015-09-30.17:45:29.583>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-09-30.12:54:21.837>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2015-09-19.20:48:45.982>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['40625']
    hgrepos = []
    issue_num = 25182
    keywords = ['patch']
    message_count = 12.0
    messages = ['251115', '251122', '251123', '251918', '251921', '251924', '251925', '251926', '251927', '251928', '251940', '251957']
    nosy_count = 7.0
    nosy_names = ['brett.cannon', 'ncoghlan', 'vstinner', 'Arfrever', 'python-dev', 'eric.snow', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue25182'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    @serhiy-storchaka
    Member Author

    $ pwd
    /home/serhiy/py/cpy�thon-3.5
    $ ./python -v
    import _frozen_importlib # frozen
    import _imp # builtin
    import sys # builtin
    import '_warnings' # <class '_frozen_importlib.BuiltinImporter'>
    import '_thread' # <class '_frozen_importlib.BuiltinImporter'>
    import '_weakref' # <class '_frozen_importlib.BuiltinImporter'>
    import '_frozen_importlib_external' # <class '_frozen_importlib.FrozenImporter'>
    import '_io' # <class '_frozen_importlib.BuiltinImporter'>
    import 'marshal' # <class '_frozen_importlib.BuiltinImporter'>
    import 'posix' # <class '_frozen_importlib.BuiltinImporter'>
    import _thread # previously loaded ('_thread')
    import '_thread' # <class '_frozen_importlib.BuiltinImporter'>
    import _weakref # previously loaded ('_weakref')
    import '_weakref' # <class '_frozen_importlib.BuiltinImporter'>
    # installing zipimport hook
    import 'zipimport' # <class '_frozen_importlib.BuiltinImporter'>
    # installed zipimport hook
    Fatal Python error: Py_Initialize: Unable to get the locale encoding
    Traceback (most recent call last):
      File "<frozen importlib._bootstrap>", line 969, in _find_and_load
    # destroy io
      File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
    # destroy io
      File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
    # destroy io
      File "<frozen importlib._bootstrap_external>", line 658, in exec_module
    # destroy io
      File "<frozen importlib._bootstrap_external>", line 759, in get_code
    # destroy io
      File "<frozen importlib._bootstrap_external>", line 368, in _verbose_message
    # destroy io
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udcff' in position 21: surrogates not allowed
    # destroy encodings
    Aborted (core dumped)
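
A minimal sketch of what the traceback above reports, assuming the directory name contains a single raw 0xff byte (the path literal is reconstructed from the `pwd` output and is only illustrative). Decoding with `surrogateescape` turns that byte into the lone surrogate `'\udcff'`, which the strict UTF-8 encoder then rejects:

```python
# Assumed raw directory name: 0xff is not valid UTF-8.
raw_path = b"/home/serhiy/py/cpy\xffthon-3.5"

# Undecodable bytes become lone surrogates under surrogateescape.
decoded = raw_path.decode("utf-8", "surrogateescape")

# Verbose messages that are not "import ..." lines get a "# " prefix.
message = "# " + decoded

try:
    message.encode("utf-8")  # strict error handler by default
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# Encoding with surrogateescape restores the original bytes.
round_trip = decoded.encode("utf-8", "surrogateescape")
```

With the two-character `"# "` prefix, the surrogate sits at index 21 of the message, which matches the position given in the error.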

    @serhiy-storchaka serhiy-storchaka added the type-crash A hard crash of the interpreter, possibly with a core dump label Sep 19, 2015
    @brettcannon
    Member

    And what happens if you leave -v off? Since the failure is in Py_Initialize I want to know if that Py_FatalError trigger is avoided without -v.

    A possible fix to test is to simply modify importlib._bootstrap._verbose_message to catch UnicodeEncodeError, print a message saying that the string could not be encoded, and swallow the exception. I just don't know whether this is the right place to prevent Py_Initialize from erroring out.
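
The suggestion above could be sketched roughly as follows. `verbose_message` here is a hypothetical stand-in written for illustration, not the real `importlib._bootstrap._verbose_message`; the `stream` parameter and the fallback text are assumptions added so the sketch can be exercised in isolation:

```python
import sys


def verbose_message(message, *args, stream=None):
    """Hypothetical stand-in for a verbose-import logger (not importlib's code)."""
    stream = sys.stderr if stream is None else stream
    if not message.startswith(("#", "import ")):
        message = "# " + message
    text = message.format(*args)
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        # Swallow the error and report that the message could not be written.
        print("# <verbose message contained non-encodable characters>",
              file=stream)
```

This only helps if the stream raises `UnicodeEncodeError` from its `write()` method, so whether it would stop `Py_Initialize` from failing is exactly the open question raised above.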

    @serhiy-storchaka
    Member Author

    python without -v does not fail.

    If I wrap the message in _bootstrap_external._verbose_message with '!%a' % and in _bootstrap._verbose_message with '@%a' % (why is _verbose_message duplicated in _bootstrap and _bootstrap_external?), the output is:

    ...
    # installing zipimport hook
    @"import 'zipimport' # <class '_frozen_importlib.BuiltinImporter'>"
    # installed zipimport hook
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/__pycache__/__init__.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/__init__.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/encodings/__pycache__/__init__.cpython-35.pyc'"
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/__pycache__/codecs.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/codecs.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/__pycache__/codecs.cpython-35.pyc'"
    @"import '_codecs' # <class '_frozen_importlib.BuiltinImporter'>"
    @"import 'codecs' # <_frozen_importlib_external.SourceFileLoader object at 0xb70b9aac>"
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/__pycache__/aliases.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/aliases.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/encodings/__pycache__/aliases.cpython-35.pyc'"
    @"import 'encodings.aliases' # <_frozen_importlib_external.SourceFileLoader object at 0xb70c81ac>"
    @"import 'encodings' # <_frozen_importlib_external.SourceFileLoader object at 0xb70b96cc>"
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/__pycache__/utf_8.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/utf_8.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/encodings/__pycache__/utf_8.cpython-35.pyc'"
    @"import 'encodings.utf_8' # <_frozen_importlib_external.SourceFileLoader object at 0xb70ccd2c>"
    @"import '_signal' # <class '_frozen_importlib.BuiltinImporter'>"
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/__pycache__/latin_1.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/encodings/latin_1.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/encodings/__pycache__/latin_1.cpython-35.pyc'"
    @"import 'encodings.latin_1' # <_frozen_importlib_external.SourceFileLoader object at 0xb70cf54c>"
    !'# /home/serhiy/py/cpy\udcffthon-3.5/Lib/__pycache__/io.cpython-35.pyc matches /home/serhiy/py/cpy\udcffthon-3.5/Lib/io.py'
    !"# code object from '/home/serhiy/py/cpy\\udcffthon-3.5/Lib/__pycache__/io.cpython-35.pyc'"
    ...

    The verbose non-ASCII message is written before codecs is imported.
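
For reference, a small sketch of what the `%a` conversion used in that experiment does. It applies `ascii()` to the argument, so the lone surrogate is written as an escape sequence and the result can be encoded safely. The message text is shortened here for illustration:

```python
# Shortened, illustrative verbose message containing a lone surrogate.
message = "# /home/serhiy/py/cpy\udcffthon-3.5/Lib/codecs.py"

# "%a" formats with ascii(), escaping every non-ASCII character.
escaped = "!%a" % message

# The escaped form is pure ASCII, so strict UTF-8 encoding succeeds.
encoded = escaped.encode("utf-8")
```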

    @serhiy-storchaka
    Member Author

    Before the io module is imported, sys.stderr is stdprinter. It always encodes the written string to UTF-8. The proposed patch makes it use the backslashreplace error handler.

    In the future, perhaps we could implement stdprinter in Python.
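
A rough Python-level illustration of the effect of that change. This is not the patch itself, which modifies the C implementation of stdprinter; `encode_for_early_stderr` is a hypothetical helper name used only for this sketch:

```python
def encode_for_early_stderr(text):
    """Hypothetical helper: encode as UTF-8, escaping what cannot be encoded."""
    return text.encode("utf-8", "backslashreplace")


# A lone surrogate is replaced by its escape sequence instead of raising.
escaped = encode_for_early_stderr("cpy\udcffthon-3.5")

# Encodable text, including non-ASCII, is unaffected by the error handler.
unchanged = encode_for_early_stderr("caf\u00e9")
```

The error handler only applies to characters the codec rejects, so ordinary output stays byte-for-byte the same.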

    @serhiy-storchaka serhiy-storchaka added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 30, 2015
    @vstinner
    Member

    > Before the io module is imported, sys.stderr is stdprinter. It always encodes the written string to UTF-8. The proposed patch makes it use the backslashreplace error handler.

    I like this solution. stdprinter is supposed to be replaced quickly.

    But we may need something else to escape non-encodable characters in the filename when sys.stdout is a TextIOWrapper using the strict error handler.
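
The concern about a TextIOWrapper with the strict error handler can be reproduced with an in-memory stream. This is a sketch only: `try_write` is a hypothetical helper, and a `BytesIO` buffer stands in for the real stdout:

```python
import io


def try_write(errors):
    """Write a name with a lone surrogate; return the bytes, or None on failure."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", errors=errors)
    try:
        stream.write("cpy\udcffthon")
        stream.flush()
    except UnicodeEncodeError:
        return None
    return raw.getvalue()


strict_result = try_write("strict")
escaped_result = try_write("backslashreplace")
```

With `errors="strict"` the write fails, while `errors="backslashreplace"` produces the escaped bytes.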

    @python-dev
    Mannequin

    python-dev mannequin commented Sep 30, 2015

    New changeset 6347b154dd67 by Serhiy Storchaka in branch '3.4':
    Issue bpo-25182: The stdprinter (used as sys.stderr before the io module is
    https://hg.python.org/cpython/rev/6347b154dd67

    New changeset e8b6c6c433a4 by Serhiy Storchaka in branch '3.5':
    Issue bpo-25182: The stdprinter (used as sys.stderr before the io module is
    https://hg.python.org/cpython/rev/e8b6c6c433a4

    New changeset 0b0945c8de36 by Serhiy Storchaka in branch 'default':
    Issue bpo-25182: The stdprinter (used as sys.stderr before the io module is
    https://hg.python.org/cpython/rev/0b0945c8de36

    @serhiy-storchaka
    Member Author

    Thank you for the review, Victor.

    > But we may need something else to escape non-encodable characters in the filename when sys.stdout is a TextIOWrapper using the strict error handler.

    This is not related to this issue. sys.stderr uses backslashreplace.

    @vstinner
    Member

    > This is not related to this issue. sys.stderr uses backslashreplace.

    Ok, fine.

    ---
    +    _errno = errno;
         Py_END_ALLOW_THREADS
    +    Py_XDECREF(bytes);

         if (n < 0) {
    -        if (errno == EAGAIN)
    +        if (_errno == EAGAIN)
                 Py_RETURN_NONE;
             PyErr_SetFromErrno(PyExc_IOError);

    Hum, if you expect that errno can be modified by Py_XDECREF(bytes), you must restore the previous errno value before calling PyErr_SetFromErrno(). This strategy is used in Python/fileutils.c.

    @python-dev
    Mannequin

    python-dev mannequin commented Sep 30, 2015

    New changeset 2652c1798f7d by Victor Stinner in branch '3.4':
    Issue bpo-25182: Fix compilation on Windows
    https://hg.python.org/cpython/rev/2652c1798f7d

    New changeset 0eb26a4d5ffa by Victor Stinner in branch '3.5':
    (Merge 3.4) Issue bpo-25182: Fix compilation on Windows
    https://hg.python.org/cpython/rev/0eb26a4d5ffa

    New changeset d1090d733d39 by Victor Stinner in branch 'default':
    (Merge 3.5) Issue bpo-25182: Fix compilation on Windows
    https://hg.python.org/cpython/rev/d1090d733d39

    @vstinner
    Member

    Hum, the code didn't compile anymore on Windows. I took the opportunity to fix the errno issue that I saw.

    Note: In fact, Python/fileutils.c is a little bit different. Functions like _Py_write() save errno to restore it later because the caller expects errno to be set.

    @serhiy-storchaka
    Member Author

    > Hum, the code didn't compile anymore on Windows. I took the opportunity to
    > fix the errno issue that I saw.

    Thank you Victor.

    @serhiy-storchaka
    Member Author

    >> This is not related to this issue. sys.stderr uses backslashreplace.
    > Ok, fine.

    But it is related to bpo-25183.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022