Multiple type confusions in unicode error handlers #68290

Closed
pkt mannequin opened this issue May 1, 2015 · 9 comments
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@pkt
Mannequin

pkt mannequin commented May 1, 2015

BPO 24102
Nosy @malemburg, @doerwalter, @vstinner, @tiran, @ezio-melotti, @serhiy-storchaka
Files
  • poc_unicode_errors.py
  • codecs_error_handlers_issubclass.patch
  • codecs_error_handlers_issubclass_2.patch
  • codecs_error_handlers_issubclass_3.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-05-18.13:13:20.206>
    created_at = <Date 2015-05-01.14:14:06.688>
    labels = ['extension-modules', 'interpreter-core', 'expert-unicode', 'type-crash']
    title = 'Multiple type confusions in unicode error handlers'
    updated_at = <Date 2015-05-18.13:13:38.627>
    user = 'https://bugs.python.org/pkt'

    bugs.python.org fields:

    activity = <Date 2015-05-18.13:13:38.627>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-05-18.13:13:20.206>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules', 'Interpreter Core', 'Unicode']
    creation = <Date 2015-05-01.14:14:06.688>
    creator = 'pkt'
    dependencies = []
    files = ['39253', '39265', '39266', '39287']
    hgrepos = []
    issue_num = 24102
    keywords = ['patch']
    message_count = 9.0
    messages = ['242319', '242391', '242393', '242395', '242397', '242398', '242556', '243476', '243477']
    nosy_count = 9.0
    nosy_names = ['lemburg', 'doerwalter', 'vstinner', 'christian.heimes', 'ezio.melotti', 'Arfrever', 'python-dev', 'serhiy.storchaka', 'pkt']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue24102'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @pkt
    Mannequin Author

    pkt mannequin commented May 1, 2015

    # Breakpoint 1, PyUnicodeEncodeError_GetEnd (exc=<X at remote 0x405730e4>, end=0xbf9e8f7c) at Objects/exceptions.c:1643
    # 1643     PyObject *obj = get_unicode(((PyUnicodeErrorObject *)exc)->object,
    # (gdb) s
    # get_unicode (attr=<unknown at remote 0x8c6a120>, name=0x82765ea "object") at Objects/exceptions.c:1516
    # 1516     if (!attr) {
    # (gdb) print *attr
    # $4 = {_ob_next = 0xfefefefe, _ob_prev = 0xfefefefe, ob_refcnt = -16843010, ob_type = 0xfefefefe}
    # (gdb) c
    # Continuing.
    #
    # Program received signal SIGSEGV, Segmentation fault.
    # 0x080bc7d9 in get_unicode (attr=<unknown at remote 0x8cbe250>, name=0x82765ea "object") at Objects/exceptions.c:1521
    # 1521     if (!PyUnicode_Check(attr)) {

    # Type confusion. The IsInstance check is ineffective because of a custom
    # __getattribute__ method. The contents of the string instance are
    # interpreted as an exception object.

    @pkt pkt mannequin added the type-crash A hard crash of the interpreter, possibly with a core dump label May 1, 2015
    @tiran tiran added the extension-modules C modules in the Modules dir label May 1, 2015
    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-unicode labels May 2, 2015
    @serhiy-storchaka
    Member

    Here is a simpler reproducer:

    import codecs
    
    class X(str):
        __class__ = UnicodeEncodeError
    
    codecs.ignore_errors(X())

    The problem is that PyObject_IsInstance() is fooled by the custom __class__, so the builtin error handlers then treat the object as having the UnicodeEncodeError layout, which it does not have.

    Here is a patch that fixes the issue by using PyObject_IsSubclass() of exc->ob_type instead of PyObject_IsInstance().
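The difference between the two checks can be seen from pure Python. This is a minimal sketch, assuming that `isinstance()` behaves like PyObject_IsInstance() and that `issubclass(type(obj), ...)` is a fair analogue of checking the real type stored in the object header. The helper names are hypothetical and are not part of the patch. It does not call any codecs error handler, so it is safe to run on unpatched interpreters too.

```python
class X(str):
    # Class attribute that shadows the real type whenever __class__ is looked up.
    __class__ = UnicodeEncodeError


def looks_like_encode_error(obj):
    """Check that honors a spoofed __class__ (analogue of PyObject_IsInstance)."""
    return isinstance(obj, UnicodeEncodeError)


def is_real_encode_error(obj):
    """Check based only on the real type (analogue of testing the object's ob_type)."""
    return issubclass(type(obj), UnicodeEncodeError)


fake = X()
real = UnicodeEncodeError("ascii", "\xe9", 0, 1, "ordinal not in range(128)")

# The fake object passes the isinstance() check but not the real-type check.
print(looks_like_encode_error(fake), is_real_encode_error(fake))  # True False
# A genuine exception instance passes both.
print(looks_like_encode_error(real), is_real_encode_error(real))  # True True
```

Only the second check guarantees that the object actually has the memory layout the C handlers expect.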

    @serhiy-storchaka serhiy-storchaka self-assigned this May 2, 2015
    @doerwalter
    Contributor

    The patch does indeed fix the segmentation fault. However, the exception message looks confusing:

    TypeError: don't know how to handle UnicodeEncodeError in error callback

    @serhiy-storchaka
    Member

    Here is a patch that makes the error message consistent with the type checking.
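To illustrate why the earlier message was confusing, here is a small Python sketch. It assumes (as an inference from the message quoted in this thread, not from the patch itself) that the type name in the message was obtained through the `__class__` attribute, while the check was made against the real type. Both helper functions are hypothetical names used only for illustration.

```python
class X(str):
    # Same spoofed class as in the reproducer above.
    __class__ = UnicodeEncodeError


def name_via_class_attribute(obj):
    """Name obtained through the (spoofable) __class__ attribute."""
    return obj.__class__.__name__


def name_via_real_type(obj):
    """Name obtained from the real type, consistent with a real-type check."""
    return type(obj).__name__


obj = X()
print(name_via_class_attribute(obj))  # UnicodeEncodeError
print(name_via_real_type(obj))        # X
```

A message built from the second name would report `X`, the type that was actually rejected.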

    @doerwalter
    Contributor

    Looks much better. However, shouldn't:

    exc->ob_type->tp_name

    be:

    Py_TYPE(exc)->tp_name

    (although many spots in the source still use ob_type->tp_name)?

    @serhiy-storchaka
    Member

    Py_TYPE() is necessary when the argument is not of type PyObject* (e.g. PyUnicodeObject*).

    @serhiy-storchaka
    Member

    Also fixed the handling of errors from PyObject_IsSubclass() (bpo-24115) in the _codecs module.

    @python-dev
    Mannequin

    python-dev mannequin commented May 18, 2015

    New changeset 547bc11e3357 by Serhiy Storchaka in branch '2.7':
    Issue bpo-24102: Fixed exception type checking in standard error handlers.
    https://hg.python.org/cpython/rev/547bc11e3357

    New changeset 68eaa9409818 by Serhiy Storchaka in branch '3.4':
    Issue bpo-24102: Fixed exception type checking in standard error handlers.
    https://hg.python.org/cpython/rev/68eaa9409818

    New changeset 510819e5855e by Serhiy Storchaka in branch 'default':
    Issue bpo-24102: Fixed exception type checking in standard error handlers.
    https://hg.python.org/cpython/rev/510819e5855e

    @serhiy-storchaka
    Member

    Greg Ewing suggested using PyObject_TypeCheck (http://permalink.gmane.org/gmane.comp.python.devel/153216).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022