classification
Title: segfault with pickle if 4th or 5th item of tuple returned by __reduce__ is not an iterator
Type: crash Stage:
Components: Extension Modules Versions: Python 3.0, Python 3.1, Python 2.7, Python 2.6, Python 2.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: alexandre.vassalotti Nosy List: alexandre.vassalotti, amaury.forgeotdarc, ocean-city
Priority: normal Keywords: patch

Created on 2008-10-22 18:17 by ocean-city, last changed 2008-10-31 21:05 by alexandre.vassalotti. This issue is now closed.

Files
File name Uploaded Description Edit
save_reduce.patch amaury.forgeotdarc, 2008-10-22 22:34
fix_pickle_consistency_on_py3k.patch ocean-city, 2008-10-31 06:48 py3k
fix_pickle_consistency_on_trunk.patch ocean-city, 2008-10-31 08:00 trunk
Messages (12)
msg75094 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-10-22 18:17
Following code crashes. (See issue4170)

trunk/Modules/cPickle.c (save() or save_reduce()) needs more checks of
returned value from __reduce__.

class C(object):
    def __reduce__(self):
        return C, (), None, None, [] # 5th item is not an iterator
class D(object):
    def __reduce__(self):
        return D, (), None, [], None # 4th item is not an iterator

import sys
if sys.version_info[0] == 3:
    import pickle
else:
    import cPickle as pickle
pickle.dumps(C()) # crash
pickle.dumps(D()) # crash
msg75096 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-10-22 18:25
save() and save_reduce() seems to check same thing.
- tuple size
- 2nd item is tuple or not

Is this duplicated or intended? If we adds check for 4th/5th item,
should them be in both functions?
msg75118 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-22 22:34
The attached patch adds a test and corrects the crash.
msg75129 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-10-23 00:31
I think the patch is fine.
msg75388 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-30 22:25
Committed r67048 (trunk), r67056 (2.5), r67053 (2.6), r67059 (py3k).
msg75401 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008-10-31 01:04
Thank you, Amaury, for fixing this. However, you forgot to also patch
the Python implementation of pickle, which makes the following test fail:

======================================================================
FAIL: test_reduce_bad_iterator (__main__.PyPicklerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/alex/src/python.org/py3k/Lib/test/pickletester.py", line
892, in test_reduce_bad_iterator
    self.assertRaises(pickle.PickleError, self.dumps, C(), proto)
AssertionError: PickleError not raised by dumps


Also, I am not sure if moving the type and length check of the reduce
value into save_reduce() was a good idea. In pickle.py, the check must 
be done before calling save_reduce(), since the method is called with
the star-syntax "self.save_reduce(obj=obj, *rv)". In _pickle, there's no
such requirement; however I don't like the idea of adding needless
differences.
msg75407 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-10-31 06:48
>the Python implementation of pickle
This was out of my mind. I hope fix_pickle_consistency_on_py3k.patch can
fix the buildbot error. ;-)
msg75412 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008-10-31 12:13
Iterators only need to provide a __next__() method. So, you don't have
to check __iter__.

Other than that, the patch looks good.
msg75413 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-10-31 12:52
Thank you for review.

>Iterators only need to provide a __next__() method. So, you don't have
>to check __iter__.

Hmm, but python document says,
(http://docs.python.org/library/stdtypes.html#typeiter)

>The iterator objects themselves are required to support the
>following two methods, which together form the iterator protocol:

Is this false information?
msg75425 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-10-31 17:55
I corrected the test in the py3k branch.
I don't know why I did not make it similar to the one in trunk; I guess 
I ran the test suite on the wrong directory... Sorry for the 
inconvenience.

Regarding the consistency between python and C, I don't think it is 
necessary to be so strict. If the python version can allow a weaker 
version of the "iterator" concept, let be it.
OTOH, in 3.0 the C version was rewritten to be quasi identical to the 
Python one, so I do understand your desire to have similar code.
msg75428 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008-10-31 20:47
Hirokazu Yamamoto wrote
> Hmm, but python document says,
> (http://docs.python.org/library/stdtypes.html#typeiter)
>
> >The iterator objects themselves are required to support the
> >following two methods, which together form the iterator protocol:
>
> Is this false information?

Oh, you are right. I got confused by looking at the implementation of
PyIter_Check(), which only verifies that tp_next is not NULL. Anyway, 
_batch_appends() and _batch_setitems() in pickle.py calls iter() on the
iterators; so, the iterators must provide __iter__ too.

Personally, I wouldn't bother about checking for __iter__ to avoid
adding a new method to Pickler and also to save a few hasattr() calls. 
But I have to admit, it is really just nitpicking at this point.
msg75430 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2008-10-31 21:05
Amaury Forgeot d'Arc wrote:
> Regarding the consistency between python and C, I don't think it is 
> necessary to be so strict. If the python version can allow a weaker 
> version of the "iterator" concept, let be it.

I guess you're right. However, my only fear is that people will write
code for pickle.py and then, when they run it on _pickle, their code
will fail due to the stricter requirements. In addition, I think having
the same requirements make it easier to write tests for both
implementations.
History
Date User Action Args
2008-10-31 21:05:53alexandre.vassalottisetmessages: + msg75430
2008-10-31 20:47:31alexandre.vassalottisetmessages: + msg75428
2008-10-31 17:55:47amaury.forgeotdarcsetmessages: + msg75425
2008-10-31 12:52:17ocean-citysetmessages: + msg75413
2008-10-31 12:14:00alexandre.vassalottisetmessages: + msg75412
2008-10-31 08:00:23ocean-citysetfiles: + fix_pickle_consistency_on_trunk.patch
2008-10-31 06:48:59ocean-citysetkeywords: - needs review
files: + fix_pickle_consistency_on_py3k.patch
messages: + msg75407
2008-10-31 01:04:49alexandre.vassalottisetmessages: + msg75401
2008-10-30 22:25:56amaury.forgeotdarcsetstatus: open -> closed
resolution: fixed
messages: + msg75388
2008-10-23 00:31:55ocean-citysetmessages: + msg75129
2008-10-22 22:34:05amaury.forgeotdarcsetkeywords: + needs review, patch
files: + save_reduce.patch
messages: + msg75118
nosy: + amaury.forgeotdarc
2008-10-22 21:02:02benjamin.petersonsetassignee: alexandre.vassalotti
nosy: + alexandre.vassalotti
2008-10-22 18:25:47ocean-citysetmessages: + msg75096
2008-10-22 18:17:37ocean-citycreate