Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idle: disable pickleability of user code objects #66185

Closed
pppery mannequin opened this issue Jul 16, 2014 · 20 comments
Closed

Idle: disable pickleability of user code objects #66185

pppery mannequin opened this issue Jul 16, 2014 · 20 comments
Assignees
Labels
topic-IDLE type-bug An unexpected behavior, bug, or error

Comments

@pppery
Copy link
Mannequin

pppery mannequin commented Jul 16, 2014

BPO 21986
Nosy @loewis, @terryjreedy, @kbkaiser, @pitrou, @serwy, @PCManticore, @serhiy-storchaka, @pppery
Files
  • idlelib.patch
  • issue21986.patch
  • test.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/terryjreedy'
    closed_at = <Date 2014-10-10.23:39:54.202>
    created_at = <Date 2014-07-16.00:13:43.806>
    labels = ['expert-IDLE', 'type-bug']
    title = 'Idle: disable pickleability of user code objects'
    updated_at = <Date 2014-10-13.07:21:34.837>
    user = 'https://github.com/pppery'

    bugs.python.org fields:

    activity = <Date 2014-10-13.07:21:34.837>
    actor = 'serhiy.storchaka'
    assignee = 'terry.reedy'
    closed = True
    closed_date = <Date 2014-10-10.23:39:54.202>
    closer = 'terry.reedy'
    components = ['IDLE']
    creation = <Date 2014-07-16.00:13:43.806>
    creator = 'ppperry'
    dependencies = []
    files = ['35968', '35969', '36855']
    hgrepos = []
    issue_num = 21986
    keywords = ['patch']
    message_count = 20.0
    messages = ['223161', '223173', '223176', '223193', '223201', '223205', '223208', '223243', '223244', '223255', '228887', '228906', '228912', '228917', '228932', '229046', '229048', '229139', '229214', '229225']
    nosy_count = 9.0
    nosy_names = ['loewis', 'terry.reedy', 'kbk', 'pitrou', 'roger.serwy', 'Claudiu.Popa', 'python-dev', 'serhiy.storchaka', 'ppperry']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue21986'
    versions = ['Python 3.4', 'Python 3.5']

    @pppery
    Copy link
    Mannequin Author

    pppery mannequin commented Jul 16, 2014

    In IDLE:

    >>> code = compile("dummy_code", "<test>", "exec")
    >>> pickle.dumps(code)
    "cidlelib.rpc\nunpickle_code\np0\n(S'c\\x00\\x00\\x00\\x00\\x00\\x00 \\x00\\x00\\x01\\x00\\x00\\x00@\\x00\\x00\\x00s\\x08\\x00\\x00\\x00e\\x00\\x00\\x01d\\x00\\x00S(\\x01\\x00\\x00\\x00N(\\x01\\x00\\x00\\x00t\\n\\x00\\x00\\x00dummy_code(\\x00\\x00\\x00\\x00(\\x00\\x00\\x00\\x00(\\x00\\x00\\x00\\x00s\\x06\\x00\\x00\\x00<test>t\\x08\\x00\\x00\\x00<module>\\x01\\x00\\x00\\x00s\\x00\\x00\\x00\\x00'\np1\ntp2\nRp3\n."
    
    
    Outside of IDLE:
    >>> code = compile("dummy_code", "<test>", "exec")
    >>> pickle.dumps(code)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Python27\lib\pickle.py", line 1374, in dumps
        Pickler(file, protocol).dump(obj)
      File "C:\Python27\lib\pickle.py", line 224, in dump
        self.save(obj)
      File "C:\Python27\lib\pickle.py", line 306, in save
        rv = reduce(self.proto)
      File "C:\Python27\lib\copy_reg.py", line 70, in _reduce_ex
        raise TypeError, "can't pickle %s objects" % base.__name__
    TypeError: can't pickle code objects

    Also, the error probably should be a pickle.PicklingError, not a TypeError.

    @pppery pppery mannequin added topic-IDLE type-bug An unexpected behavior, bug, or error labels Jul 16, 2014
    @terryjreedy
    Copy link
    Member

    Code is really a code object, so the compile does not seem to be the problem. In 2.7, on Win7, I get exactly the same output after removing the possibly spurious space in the string posted. (ppperry), what system (OS) are you using. (In the future, please report both system and Python version).

    3.4 gives a mild variation
    b'\x80\x03cidlelib.rpc\nunpickle_code\nq\x00CZ\xe3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00@\x00\x00\x00s\x08\x00\x00\x00e\x00\x00\x01d\x00\x00S)\x01N)\x01Z\ndummy_code\xa9\x00r\x01\x00\x00\x00r\x01\x00\x00\x00z\x06<test>\xda\x08<module>\x01\x00\x00\x00s\x00\x00\x00\x00q\x01\x85q\x02Rq\x03.'

    I ran both test_pickle and test_pickletools from both the command line

    python -m test -v test_pickletools
    and from an Idle editor with F5, with no errors and apparently identical output.

    import pickle
    code = compile("dummy_code", "this is a file name", "exec")
    print(type(code))
    print('pickle: ', pickle.dumps(code))
    >>> produces
    <class 'code'>
    pickle:  b'\x80\x03cidlelib.rpc ... \x13this is a file name\...'

    which shows that the bytes come from picke, not from a garbled traceback.

    Since the bytes look like an actual pickle, I tried unpickling and it works:

    import pickle
    code = compile("'a string'", "", "eval")
    pick = pickle.dumps(code)
    code2 = pickle.loads(pick)
    print(eval(code), eval(code2))
    >>>
    a string a string
    
    import pickle
    code1 = compile("print('a string')", "", "exec")
    pick = pickle.dumps(code1)
    code2 = pickle.loads(pick)
    exec(code1)
    exec(code2)
    >>> 
    a string
    a string

    I do not see any Idle bug here, so this might be closed unless you want to make this a pickle enhancement issue to work on code objects in the standard interpreter. The 3.4 interpreter traceback gives more info on why the pickle fails there.

    >>> code0 = compile("'abc'", '', 'eval')
    >>> pick = pickle.dumps(code0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    _pickle.PicklingError: Can't pickle <class 'code'>: attribute lookup code on builtins failed

    In Idle, __builtins__.code fails, just as in the interpreter, so pickle is doing something different, and more successful, in the slightly altered Idle user-process execution environment.

    @terryjreedy terryjreedy added stdlib Python modules in the Lib dir and removed topic-IDLE labels Jul 16, 2014
    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Jul 16, 2014

    It works in IDLE because it registers a custom pickling for code objects, in idlelib.rpc:

      copyreg.pickle(types.CodeType, pickle_code, unpickle_code)

    where pickle_code / unpickle_code calls marshal.dumps/loads.

    Although, I admit that this is weird. If idlelib.rpc is using this for transferring data between RPC instances, that's okay, but leaking the behaviour in the IDLE's interactive interpreter is not that okay, because leads to different results and expectancies between IDLE and Python's interactive interpreter.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Jul 16, 2014

    I agree with Claudiu. IDLE should pickle with a private dispatch_table.

    @pppery pppery mannequin added the OS-windows label Jul 16, 2014
    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Jul 16, 2014

    Maybe something like the attached patch. It doesn't have tests, though, I didn't find any tests for the idlelib.rpc anyway.

    @serhiy-storchaka
    Copy link
    Member

    Instead of copying dispatch_table, use ChainMap.

    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Jul 16, 2014

    Thanks, Serhiy.

    @terryjreedy
    Copy link
    Member

    There is no unittest module for rpc yet.

    Should the pickle test include a test that pickling a code object fails, and with the proper exception? Is ppperry correct about PicklingError?

    Chainmap is new in 3.3 so 2.7 would need the copy version of the patch.

    @terryjreedy terryjreedy added topic-IDLE and removed stdlib Python modules in the Lib dir OS-windows labels Jul 16, 2014
    @terryjreedy
    Copy link
    Member

    ppperry: Component Windows (or Macintosh) means Windows (or Mac) specific. The rpc code is general to all systems.

    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Jul 16, 2014

    TypeError is raised only in Python 2, in Python 3 it's PicklingError.

    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Oct 9, 2014

    Terry, can I do something to move this issue forward?

    @terryjreedy
    Copy link
    Member

    Here are the issues for me.

    1. Except for your interest, this is a lower priority for me that most all of the other 125 Idle issues.

    2. I am not familiar with pickling and wonder about the following in the patch. It deletes a statememt
      copyreg.pickle(types.CodeType, pickle_code, unpickle_code)
      that registers two functions. But the new code only registers pickle_code. Will
      message = pickle.loads(packet)
      then work? Maybe, since pickle_code has this.
      return unpickle_code, (ms,)
      Is this normal? Is registring unpickle_code not needed? Would it make any sense to wrap code objects in a CodePickler class with __getstate__ and __setstate__ methods?

    3. The interprocess communication implemented in rpc.py is central to Idle's execution of user code. So I would want to be confident that a patch does not break it. What I would like is a test script (executed by hand is ok) that works now, fails with the deletion of the copyreg statement, and works again with the rest of the patch.

    4. I have no idea if or from where a code object is sent. One way to search would be to grep idlelib for rpc calls. Another would be to add a print statement to pickle_code, start Idle from the console with 'python -m idlelib', and run through all the menu options that involve the user process while watching the console for messages.

    So you could try one of the searches methods to find a test. Or you can wait for someone who currently understands pickle well enough to review and apply the patches without a test.

    @terryjreedy terryjreedy changed the title Pickleability of code objects is inconsistent Idle: disable pickleability of user code objects Oct 9, 2014
    @serhiy-storchaka
    Copy link
    Member

    1. It is normal. The third argument of copyreg.pickle() is not used now.

    The patch LGTM.

    @serhiy-storchaka serhiy-storchaka changed the title Idle: disable pickleability of user code objects Pickleability of code objects is inconsistent Oct 9, 2014
    @serhiy-storchaka serhiy-storchaka changed the title Pickleability of code objects is inconsistent Idle: disable pickleability of user code objects Oct 9, 2014
    @PCManticore
    Copy link
    Mannequin

    PCManticore mannequin commented Oct 9, 2014

    Thank you for your feedback, Terry.

    1. IDLE is behaving differently than the builtin interpreter. It should be higher priority, because it
      leads beginners into believing that code objects are picklable.

    2. No, wrapping code objects in a CodePickler with __getstate__ et al will not work. What is important in my implementation of CodePickler is that the dispatch_table is private, so it will not leak private reduction function, as it happens right now.

    3. That doesn't make sense. What works now is pickling of code objects, removing the copyreg call, without my patch, will break the rpc as well. To test this, comment the copyreg.pickle call in idlelib.rpc and just define a function in IDLE, you'll get a PicklingError. I added a small test file, which can be used to test what happens now and what happens after my patch.

    4. Better, here's a traceback which can explain from where codes are sent. The traceback is from inside pickle_code, using code.InteractiveInterpreter.

    File "C:\Python34\lib\idlelib\PyShell.py", line 1602, in main
    root.mainloop()
    File "C:\Python34\lib\tkinter\init.py", line 1069, in mainloop
    self.tk.mainloop(n)
    File "C:\Python34\lib\tkinter\init.py", line 1487, in __call__
    return self.func(*args)
    File "C:\Python34\lib\idlelib\MultiCall.py", line 179, in handler
    r = li
    File "C:\Python34\lib\idlelib\PyShell.py", line 1188, in enter_callback
    self.runit()
    File "C:\Python34\lib\idlelib\PyShell.py", line 1229, in runit
    more = self.interp.runsource(line)
    File "C:\Python34\lib\idlelib\PyShell.py", line 671, in runsource
    return InteractiveInterpreter.runsource(self, source, filename)
    File "C:\Python34\lib\code.py", line 74, in runsource
    self.runcode(code)
    File "C:\Python34\lib\idlelib\PyShell.py", line 762, in runcode
    (code,), {})
    File "C:\Python34\lib\idlelib\rpc.py", line 256, in asyncqueue
    self.putmessage((seq, request))
    File "C:\Python34\lib\idlelib\rpc.py", line 348, in putmessage
    s = pickle.dumps(message)

    @terryjreedy
    Copy link
    Member

    Wanting to make sure that a patch does not break Idle makes perfect sense to me. As it turns out, running *any* code from the editor (even empty) works for these patches. Idle compiles the code to a code object *in the Idle process* and if no compile errors, ships the code object to the user process via rpc to be executed in the user process. I comfirmed this with a print inside pickle_code. With this clear, I have downloaded both patches to test on 2.7 and 3.4.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 10, 2014

    New changeset 90c62e1f3658 by Terry Jan Reedy in branch '3.4':
    Issue bpo-21986: Idle now matches interpreter in not pickling user code objects.
    https://hg.python.org/cpython/rev/90c62e1f3658

    New changeset cb94764bf8be by Terry Jan Reedy in branch 'default':
    Merge with 3.4: bpo-21986, don't pickle user code objects.
    https://hg.python.org/cpython/rev/cb94764bf8be

    @terryjreedy
    Copy link
    Member

    Code objects get wrapped in 3 tuple layers before being 'pickled', so a private dispatch is the easiet solution. Since copyreg.dispatch_table has only two items to copy, I pushed a version of the first patch. Since private dispatch tables are new in 3.3 and not in 2.7, I withdraw the idea of patching 2.7.

    @terryjreedy terryjreedy self-assigned this Oct 10, 2014
    @serhiy-storchaka
    Copy link
    Member

    It would be good to add a test of rpc.dumps().

    Current code copies copyreg.dispatch_table at the moment of rpc import. But it
    can be changed later and these change will not affect private copy. For example
    the re module adds pickleability of compiled regex patterns. I think the use
    of ChainMap is more safe solution.

    @terryjreedy
    Copy link
    Member

    I agree that a test for dumps would be a good idea. Ditto for hundreds of other untested functions. I don't see this one as a priority.

    PyShell imports non-idlelib modules, including re, before idlelib modules, such as rpc. So the re addition is there, though I strongly doubt that compiled regexs are every sent to the user process. Since about 10 different messages of different types are sent as part of startup, the fact that Idle does start is strong evidence that dumps and much of the rest of rpc is ok. I would start a test of rpc by documenting the protocol and listing the messages types so at least one of each could be tested.

    @serhiy-storchaka
    Copy link
    Member

    PyShell imports non-idlelib modules, including re, before idlelib modules,
    such as rpc. So the re addition is there, though I strongly doubt that
    compiled regexs are every sent to the user process.

    But we can't guarantee that this alway will be so. Other stdlib modules can
    register picklers, and IDLE can import them in future. This code is not
    protected against future changes in other places of the code.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-IDLE type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants