Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with BINUNICODE8 and BINBYTES8 opcodes in pickle #69449

Closed
serhiy-storchaka opened this issue Sep 29, 2015 · 5 comments
Closed

Issues with BINUNICODE8 and BINBYTES8 opcodes in pickle #69449

serhiy-storchaka opened this issue Sep 29, 2015 · 5 comments
Assignees
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 25262
Nosy @pitrou, @vstinner, @avassalotti, @serhiy-storchaka
Files
  • pickle_binbytes8.patch
  • pickle_binbytes8_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-09-29.19:15:57.766>
    created_at = <Date 2015-09-29.06:14:22.762>
    labels = ['extension-modules', 'tests', 'type-bug', 'library']
    title = 'Issues with BINUNICODE8 and BINBYTES8 opcodes in pickle'
    updated_at = <Date 2015-09-29.19:15:57.765>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2015-09-29.19:15:57.765>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-09-29.19:15:57.766>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules', 'Library (Lib)', 'Tests']
    creation = <Date 2015-09-29.06:14:22.762>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['40615', '40621']
    hgrepos = []
    issue_num = 25262
    keywords = ['patch']
    message_count = 5.0
    messages = ['251823', '251826', '251856', '251876', '251877']
    nosy_count = 5.0
    nosy_names = ['pitrou', 'vstinner', 'alexandre.vassalotti', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25262'
    versions = ['Python 3.4', 'Python 3.5', 'Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    There are issues with BINUNICODE8 and BINBYTES8 opcodes in pickle.

    1. Unpickling BINBYTES8 is not implemented in Python implementation.

    2. Highest 32 bits of 64-bit size are silently ignored in C implementation on 32-bit platforms.

    3. There are no tests for BINUNICODE8 and BINBYTES8.

    Proposed patch fixes these issues.

    @serhiy-storchaka serhiy-storchaka self-assigned this Sep 29, 2015
    @serhiy-storchaka serhiy-storchaka added extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error labels Sep 29, 2015
    @vstinner
    Copy link
    Member

    To access an array item, the type of the index variable must be size_t (or a type with the same size), otherwise the compiler produce less efficient machine code:
    http://www.viva64.com/en/a/0050/

    => please keep Py_ssize_t type for i

    (I didn't check for the specific case of Py_ssize_t: it's signed. Does GCC emit the most efficient machine code for it?)

    diff -r 16c8278c03f6 Modules/_pickle.c
    --- a/Modules/_pickle.c
    +++ b/Modules/_pickle.c
    @@ -4606,10 +4606,17 @@ static Py_ssize_t
     calc_binsize(char *bytes, int nbytes)
     {
         unsigned char *s = (unsigned char *)bytes;
    -    Py_ssize_t i;
    +    int i;
         size_t x = 0;
     
    -    for (i = 0; i < nbytes && (size_t)i < sizeof(size_t); i++) {
    +    if (nbytes > (int)sizeof(size_t)) {
    +        for (i = (int)sizeof(size_t); i < nbytes; i++) {
    +            if (s[i])
    +                return -1;
    +        }
    +        nbytes = (int)sizeof(size_t);
    +    }

    Please add a comment here to explain that the first loop check for integer overflow, it's not obvious at the first read.

    Does the Python implementation of pickle produce BINBYTES8? If not: why not?

    Note: the patch is probably based on a private Mercurial revision, so it didn't get the [Review] button.

    @serhiy-storchaka
    Copy link
    Member Author

    To access an array item, the type of the index variable must be size_t (or a
    type with the same size), otherwise the compiler produce less efficient
    machine code: http://www.viva64.com/en/a/0050/

    => please keep Py_ssize_t type for i

    i is small integer between 0 and 8. I don't think that this article is
    related. Py_ssize_t is signed, and i is compared with int in a loop. And isn't
    Viva64 work only with Visual C++? At least I found statements that can be not
    correct on platforms not supported by Microsoft.

    Please add a comment here to explain that the first loop check for integer
    overflow, it's not obvious at the first read.

    Done.

    Does the Python implementation of pickle produce BINBYTES8? If not: why not?

    Yes, it produces BINBYTES8 on 64-bit platform.

    Note: the patch is probably based on a private Mercurial revision, so it
    didn't get the [Review] button.

    Yes, I had made some refactoring for unpickling tests and forgot to push they.
    Here is rebased patch.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 29, 2015

    New changeset d4f8316d0860 by Serhiy Storchaka in branch '3.4':
    Issue bpo-25262. Added support for BINBYTES8 opcode in Python implementation of
    https://hg.python.org/cpython/rev/d4f8316d0860

    New changeset da9ad20dd470 by Serhiy Storchaka in branch '3.5':
    Issue bpo-25262. Added support for BINBYTES8 opcode in Python implementation of
    https://hg.python.org/cpython/rev/da9ad20dd470

    New changeset 8de1967edfdb by Serhiy Storchaka in branch 'default':
    Issue bpo-25262. Added support for BINBYTES8 opcode in Python implementation of
    https://hg.python.org/cpython/rev/8de1967edfdb

    @serhiy-storchaka
    Copy link
    Member Author

    Thank you Victor and Antoine for your reviews.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants