Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve unpickling errors handling #69947

Closed
serhiy-storchaka opened this issue Nov 29, 2015 · 7 comments
Closed

Improve unpickling errors handling #69947

serhiy-storchaka opened this issue Nov 29, 2015 · 7 comments
Assignees
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 25761
Nosy @pitrou, @avassalotti, @serhiy-storchaka
Files
  • unpickling_mark_errors.patch
  • unpickling_eof_errors.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-09-08.08:07:23.589>
    created_at = <Date 2015-11-29.10:59:25.708>
    labels = ['extension-modules', 'tests', 'type-feature', 'library']
    title = 'Improve unpickling errors handling'
    updated_at = <Date 2016-09-08.08:07:23.586>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-09-08.08:07:23.586>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-09-08.08:07:23.589>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules', 'Library (Lib)', 'Tests']
    creation = <Date 2015-11-29.10:59:25.708>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['41188', '41257']
    hgrepos = []
    issue_num = 25761
    keywords = ['patch']
    message_count = 7.0
    messages = ['255567', '255569', '255571', '256028', '256032', '256055', '274626']
    nosy_count = 4.0
    nosy_names = ['pitrou', 'alexandre.vassalotti', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue25761'
    versions = ['Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    This issue is for better detecting and reporting errors in broken pickle data.

    @serhiy-storchaka serhiy-storchaka self-assigned this Nov 29, 2015
    @serhiy-storchaka serhiy-storchaka added extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement labels Nov 29, 2015
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 29, 2015

    New changeset d47e5b162072 by Serhiy Storchaka in branch '3.4':
    Issue bpo-25761: Added more test cases for testing unpickling broken data.
    https://hg.python.org/cpython/rev/d47e5b162072

    New changeset c7e7d77ef8bf by Serhiy Storchaka in branch '2.7':
    Issue bpo-25761: Added more test cases for testing unpickling broken data.
    https://hg.python.org/cpython/rev/c7e7d77ef8bf

    New changeset 4897438543da by Serhiy Storchaka in branch '3.5':
    Issue bpo-25761: Added more test cases for testing unpickling broken data.
    https://hg.python.org/cpython/rev/4897438543da

    New changeset c852c7d8d681 by Serhiy Storchaka in branch 'default':
    Issue bpo-25761: Added more test cases for testing unpickling broken data.
    https://hg.python.org/cpython/rev/c852c7d8d681

    @serhiy-storchaka
    Copy link
    Member Author

    Proposed patch allows unpickler to detect errors related to reading a stack behind a mark.

    For now Python implementation just pops a sentinel used as a mark. This can cause TypeError, AttributeError or UnpicklingError besides IndexError:

    >>> pickle._loads(b'}(NNs.')
    Traceback (most recent call last):
      ...
    TypeError: 'object' object does not support item assignment
    >>> pickle._loads(b'](Na.')
    Traceback (most recent call last):
      ...
    AttributeError: 'object' object has no attribute 'append'

    Or can silently expose the mark object:

    >>> pickle._loads(b')(.')
    <object object at 0xb71084b0>
    >>> pickle._loads(b']](a.')
    [<object object at 0xb71084d8>]

    C implementation just ignores incorrect mark:

    >>> pickle.loads(b'}(NNs.')
    {None: None}
    >>> pickle.loads(b'](Na.')
    [None]
    >>> pickle.loads(b')(.')
    ()
    >>> pickle.loads(b']](a.')
    [[]]

    But in case of complex data this can cause errors later.

    With the patch C implementation always raises UnpicklingError with relevant message and Python implementation always raises IndexError.

    >>> pickle.loads(b'}(NNs.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'pickle' is not defined
    >>> import pickle, pickletools
    >>> pickle.loads(b'}(NNs.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: unexpected MARK found
    >>> pickle.loads(b'](Na.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: unexpected MARK found
    >>> pickle.loads(b')(.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: unexpected MARK found
    >>> pickle.loads(b']](a.')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: unexpected MARK found
    >>> pickle._loads(b'}(NNs.')
    Traceback (most recent call last):
      ...
    IndexError: list index out of range
    >>> pickle._loads(b'](Na.')
    Traceback (most recent call last):
      ...
    IndexError: list index out of range
    >>> pickle._loads(b')(.')
    Traceback (most recent call last):
      ...
    IndexError: pop from empty list
    >>> pickle._loads(b']](a.')
    Traceback (most recent call last):
      ...
    IndexError: pop from empty list

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 6, 2015

    New changeset 5c670af0100f by Serhiy Storchaka in branch 'default':
    Issue bpo-25761: Improved detecting errors in broken pickle data.
    https://hg.python.org/cpython/rev/5c670af0100f

    @serhiy-storchaka
    Copy link
    Member Author

    When pickle stream is unexpectedly ended, different exceptions can be raised. EOFError("Ran out of input") is raised when the stream is unexpectedly ended without the STOP opcode. But it can be raised also when the data for the opcode is incomplete. Other possible exceptions are UnpicklingError, AttributeError and ValueError.

    Examples:

    >>> pickle.loads(b'L')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: pickle data was truncated
    >>> pickle.loads(b'L10')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    EOFError: Ran out of input
    >>> pickle.loads(b'L10L')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: invalid literal for int() with base 10: '10L'
    >>> pickle.loads(b"S'abc'")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.UnpicklingError: the STRING opcode argument must be quoted
    >>> pickle.loads(b'(cbuiltins\nlist')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: Can't get attribute 'lis' on <module 'builtins' (built-in)>

    Following patch makes C implementation of unpickler always raise UnpicklingError("pickle data was truncated") if the data for the opcode is truncated (above examples). EOFError("Ran out of input") is raised when the stream is unexpectedly ended without the STOP opcode, as before.

    I'm not sure, may be always raise UnpicklingError or EOFError? Or change error message?

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 7, 2015

    New changeset 001514146c21 by Serhiy Storchaka in branch 'default':
    Issue bpo-25761: Fixed reference leak added in previous changeset (5c670af0100f).
    https://hg.python.org/cpython/rev/001514146c21

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 6, 2016

    New changeset 231f578dfd3d by Serhiy Storchaka in branch 'default':
    Issue bpo-25761: Improved error reporting about truncated pickle data in
    https://hg.python.org/cpython/rev/231f578dfd3d

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant