|
msg68262 - (view) |
Author: Aaron Gallagher (habnabit) |
Date: 2008-06-16 04:06 |
Currently, pickle.py in the stdlib is limited by the python call stack.
For deeply recursive data structures, the default recursion limit of 1000
is not enough. The patch attached modifies pickle.py to instead use a
deque object as a call stack. Pickler.save and other methods that increase
the recursion depth are now generators which may yield either another
generator or None, where yielding a generator adds it to the call stack.
|
|
msg69007 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-06-30 14:46 |
Interesting. Why do you use a deque? You never seem to prepend or pop
from the beginning of it, so using a plain list should be as fast.
Also, your patch should add one or several unit tests to enforce the new
behaviour.
|
|
msg69118 - (view) |
Author: Aaron Gallagher (habnabit) |
Date: 2008-07-02 20:43 |
Ah, I didn't know that a list would be as fast for appending and popping.
I knew that lists were optimized for .append() and .pop(), but I didn't
know that a list would be just as fast as a deque if it was just used as a
stack.
And I'll be happy to write unit tests if it can be pointed out to me how
exactly they can be written. Should it just test to make sure pickling a
deeply nested object hierarchy can be pickled without raising a
RuntimeError? I tried to make this as transparent as possible of a change.
|
|
msg69126 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2008-07-02 21:10 |
Hi,
Le mercredi 02 juillet 2008 à 20:43 +0000, Aaron Gallagher a écrit :
> Aaron Gallagher <habnabit@gmail.com> added the comment:
>
> Ah, I didn't know that a list would be as fast for appending and popping.
> I knew that lists were optimized for .append() and .pop(), but I didn't
> know that a list would be just as fast as a deque if it was just used as a
> stack.
If you have a few minutes to spare, the best would be to measure the
speed of both possibilities, and opt for the fastest.
> And I'll be happy to write unit tests if it can be pointed out to me how
> exactly they can be written. Should it just test to make sure pickling a
> deeply nested object hierarchy can be pickled without raising a
> RuntimeError?
Yes, exactly. Just create a very deeply nested object (a list should be
sufficient I suppose), pickle it, unpickle it, and check that the result
is equal to the original. No need to do more. Don't bother catching the
eventual RuntimeError, if it is raised the test will be failed anyway.
Since your patch only regards the pickle module (not cPickle) I suppose
the test could go directly into Lib/test/test_pickle.py.
(oh and don't forget to check the test case fails without your
patch :-))
Thanks !
|
|
msg69620 - (view) |
Author: Alexandre Vassalotti (alexandre.vassalotti) *  |
Date: 2008-07-13 20:26 |
On some benchmark of my own, I get a good 2x slowdown when this patch is
applied. The benchmark consists of loading ~1.4 millions objects (mostly
dict, list, str and int) from a pickle string. It is basically a torture
test for the inner function calls overhead.
Anyway, the slowdown doesn't bother me much. I think a "stackless"
pickle module would be well appreciated by many Python users.
There is a few things that stops me from applying this patch. First, the
patch will break subclasses of Pickler that relies on the save() method
(Although the method is an implementation detail that shouldn't used).
Therefore, any patches that modifies the API of save() is straight out
for 2.x. And for Python 3.0, your patch should also remove recursion
from the _pickle module (the transparent C accelerator for pickle).
|
|
msg71510 - (view) |
Author: Aaron Gallagher (habnabit) |
Date: 2008-08-20 04:50 |
Alright, sorry this took so long. Hopefully this can still be included in
3.0.
Included is a patch that no longer uses collections.deque and also adds a
test case to test/test_pickle.py. The test catches RuntimeError and fails
the unittest. I didn't see anything that would behave like the opposite of
assertRaises except letting the exception propagate.
|
|
msg85761 - (view) |
Author: Tennessee Leeuwenburg (tleeuwenburg@gmail.com) |
Date: 2009-04-08 01:09 |
Aaron,
Could you please upload another patch against the current trunk, then
the issue could be flagged as requiring a patch review?
Thanks,
-Tennessee
|
|
msg85786 - (view) |
Author: Aaron Gallagher (habnabit) |
Date: 2009-04-08 22:41 |
Okay, here's a new version for the py3k trunk. I'm assuming that this is
not going to make it into 2.x at all, because of the API changes. This
patch only touches the python version of the code and adds a unit test
for testing whether pickle works with arbitrary nesting depth in
test.pickletest. The test is disabled for the C pickle tests currently,
since it causes python to run out of stack space and crash.
So, does _pickle.Pickler's API need to be changed as well? Right now,
pickle._Pickler.dump expects save (and save expects the other save_*
methods) to either return None or an iterable, where the iterable yields
either None or further iterables. I'm sure it wouldn't be hard to make
_pickle.Pickler.dump work the same way, but since C doesn't have
generators, I don't know how exactly the API would translate.
|
|
msg101451 - (view) |
Author: Sean Reifschneider (jafo) *  |
Date: 2010-03-21 20:16 |
Sorry for the delay in getting to this patch.
I've reviewed this patch and it seems fine to me. The only thing I see outstanding is the recommendation made by Alexandre about changing the C module to also implement this.
Aaron: You may want to take your question on the implementation to the python-dev mailing list: Whether it is necessary to implement it in the C module, and if so suggestions on how or help on doing it. Is that something you can do, Aaron?
|
|
msg101452 - (view) |
Author: Sean Reifschneider (jafo) *  |
Date: 2010-03-21 20:19 |
Ugh, I forgot to check the output of my test run before submitting the last reply. With the patch applied, "make test" fails with:
test_pickle
Fatal Python error: Cannot recover from stack overflow.
Fatal Python error: Cannot recover from stack overflow.
make: *** [test] Aborted (core dumped)
This is on Python 3 trunk, with the pickle3.patch applied (which applied cleanly).
For the Misc/NEWS I propose (in the Library section):
- Issue #3119: pickle.py can now handle deeply-nested data-structures
without reaching the Python call stack limit. NOTE: the pickle save()
method is now a generator, though sub-classes of Pickler shouldn't
override save().
|
|
msg110642 - (view) |
Author: Mark Lawrence (BreamoreBoy) |
Date: 2010-07-18 12:14 |
Sean has stated that the patch seems fine, what needs to be done to take this forward?
|
|
msg110834 - (view) |
Author: Aaron Gallagher (habnabit) |
Date: 2010-07-19 23:12 |
Here's a patch that fixes the unit tests. A new test was added that tried to test the C implementation of this though none exists.
|
|
| Date |
User |
Action |
Args |
| 2011-02-12 16:06:10 | eric.araujo | set | nosy:
jafo, jcea, pitrou, alexandre.vassalotti, schmir, habnabit, tleeuwenburg@gmail.com, BreamoreBoy superseder: eliminate recursion in pickling -> |
| 2010-07-19 23:12:43 | habnabit | set | files:
+ pickle4.patch keywords:
+ patch messages:
+ msg110834
|
| 2010-07-18 12:14:36 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages:
+ msg110642
|
| 2010-03-21 20:19:21 | jafo | set | messages:
+ msg101452 |
| 2010-03-21 20:16:39 | jafo | set | priority: normal nosy:
+ jafo messages:
+ msg101451
|
| 2009-06-29 00:19:33 | alexandre.vassalotti | link | issue2480 superseder |
| 2009-06-29 00:19:33 | alexandre.vassalotti | unlink | issue2480 dependencies |
| 2009-04-17 02:57:47 | tleeuwenburg@gmail.com | set | keywords:
+ needs review, - patch stage: patch review |
| 2009-04-08 22:41:22 | habnabit | set | files:
+ pickle3.patch
messages:
+ msg85786 versions:
+ Python 3.1, - Python 2.6, Python 3.0 |
| 2009-04-08 01:09:44 | tleeuwenburg@gmail.com | set | nosy:
+ tleeuwenburg@gmail.com messages:
+ msg85761
|
| 2008-08-20 04:50:19 | habnabit | set | files:
+ pickle2.patch messages:
+ msg71510 |
| 2008-07-13 20:26:25 | alexandre.vassalotti | set | nosy:
+ alexandre.vassalotti messages:
+ msg69620 |
| 2008-07-07 05:16:58 | schmir | set | nosy:
+ schmir |
| 2008-07-07 04:48:05 | gregory.p.smith | link | issue2480 dependencies |
| 2008-07-07 04:47:48 | gregory.p.smith | set | superseder: eliminate recursion in pickling |
| 2008-07-02 21:10:04 | pitrou | set | messages:
+ msg69126 |
| 2008-07-02 20:43:52 | habnabit | set | messages:
+ msg69118 |
| 2008-06-30 14:46:04 | pitrou | set | nosy:
+ pitrou messages:
+ msg69007 |
| 2008-06-30 11:23:43 | jcea | set | nosy:
+ jcea |
| 2008-06-16 04:06:24 | habnabit | create | |