classification
Title: test_cpickle crash on AMD64 Windows build
Type: crash Stage:
Components: Library (Lib), Tests Versions: Python 2.5.3, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: amaury.forgeotdarc Nosy List: alexandre.vassalotti, amaury.forgeotdarc, jcea, loewis, mhammond, pitrou, taleinat
Priority: release blocker Keywords: 64bit, patch

Created on 2008-08-21 23:21 by mhammond, last changed 2008-11-17 22:57 by loewis. This issue is now closed.

Files
File name Uploaded Description Edit
cpickle.patch amaury.forgeotdarc, 2008-08-22 17:39
find_recursion_limit.patch pitrou, 2008-08-22 18:13
cpickle-3.0.patch amaury.forgeotdarc, 2008-08-23 12:08
Messages (16)
msg71697 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-08-21 23:21
[from python-dev]
I've found a recursion related crash in test_cpickle on 64bit builds. 
Specifically, the 'cPickleDeepRecursive' is causing a stack overflow,
and the debugger tells me the actual recursion depth was 629 at the crash.

The reason the 64bit build doesn't see more crashes is apparently due to
another regression in 2.6 - http://bugs.python.org/issue3373.  It
appears that in some cases, the recursion counter is actually
incremented *twice* for each entry, thereby causing the "maximum
recursion depth exceeded" exception to appear at a true recursion limit
of 500.  However, test_cpickle takes a different path and doesn't see
this doubling of the count - therefore dieing at the depth of 629 that I
can see.

My solution to this was to simply double the stack size for the
executables in 64bit builds, from 2MB to 4MB (2.1 and 4.2 for debug
builds.)  Is this an appropriate fix?
msg71732 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-22 09:18
In my opinion this is a bug in cPickle. It is completely wrong to
consume more than 3KB of stack space for each recursion level, and it
should be fixed.
msg71758 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-22 16:56
The problem comes from this local variable in batch_list() and batch_dict:
    PyObject *slice[BATCHSIZE];
and BATCHSIZE=1000...
msg71760 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-22 17:39
Attached patch removes the big array, items are saved when they are
fetched from the container.
I preserved the the special case when there is only one item.

The patch does not merge cleanly into py3k, but the functions batch_list
and batch_dict look very similar and suffer the same problem.
msg71764 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-22 18:13
Here is an additional patch which adds a cpickle test to
Misc/find_recursion_limit.py. It helps show that Amaury's patch does fix
the issue. +1 for applying it.
msg71799 - (view) Author: Mark Hammond (mhammond) * (Python committer) Date: 2008-08-23 01:26
Sorry for the initial noise - your analysis is correct, mine was flawed
:)  Simple recursion to a depth of 1000 does work fine on a 64bit build.

cpickle.patch does make test_cpickle pass for me.  FWIW,
find_recursionlimit.py now causes a segfault on 32 and 64bit Windows
(the comments imply a MemoryError is expected there), and the 32bit
version dies at a depth of 5900, while the 64bit version dies at 3800
(both doing an 'add') - so it does seem there is still a discrepancy -
but not one we need to care about now.

I'm +1 in general on this, but I'm not sure that I'm +1 on this patch
for 1.6 without more careful review than I am able to offer at the
moment; it seems fairly unlikely to be hit "in the wild", but I'll
obviously defer that to others.  OTOH, having the test suite segfault
isn't a good look, so we probably need to do *something* for 1.6
msg71804 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-23 09:18
Le samedi 23 août 2008 à 01:26 +0000, Mark Hammond a écrit :
> cpickle.patch does make test_cpickle pass for me.  FWIW,
> find_recursionlimit.py now causes a segfault on 32 and 64bit Windows
> (the comments imply a MemoryError is expected there), and the 32bit
> version dies at a depth of 5900, while the 64bit version dies at 3800
> (both doing an 'add') - so it does seem there is still a discrepancy -
> but not one we need to care about now.

Well, the discrepancy is expected, since pointers are bigger in 64-bit
mode than in 32-bit mode, and most function calls inside the interpreter
involve pushing a bunch of PyObject pointers on the stack.
msg71805 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-08-23 12:08
Adapted the patch to python 3.0
msg72990 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-10 22:20
Can someone have a look at this patch?
It is not 64bit nor Windows specific; its goal is to reduce the size of 
local variables in batch_list(), which is easy to verify.

The point is to check the correctness of the code, specially around 
errors checks and reference counting.
msg73008 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-09-11 07:19
The patch is fine, please apply. My only question is why the 3.0 patch
integrates the find_recursionlimit change, whereas the 2.6 patch does not.
msg73014 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-11 11:05
- cpickle.patch and find_recursion_limit.patch are for 2.6
- cpickle.patch-3.0.patch groups the two previous patches, adapted to 3.0.
msg73015 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-09-11 12:10
Ah, ok. It's all fine, then.
msg73024 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-11 13:36
By the way, this isn't caused by the present issue, but you'll notice
that in the latest trunk, some tests in find_recursion_limit.py fail
with an AttributeError instead of RuntimeError. This is likely caused by
a PyDict_GetItem() call discarding the RuntimeError somewhere and
returning NULL, which in turn raises an AttributeError (as I explained
on the mailing-list).

A quick way to fix it would be to use "except (AttributeError,
RuntimeError)" in the testing func of find_recursion_limit.py.
msg73058 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-09-11 21:05
Committed as r66390 and r66391.
msg73132 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-09-12 20:07
I've opened #3850 for the find_recursion_limit problem.
msg75974 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2008-11-17 16:55
Will this be back-ported to 2.5.3? (please say yes...)
History
Date User Action Args
2009-05-16 20:35:06ajaksu2linkissue3338 dependencies
2008-11-17 22:57:49loewissetversions: + Python 2.5.3
2008-11-17 16:55:21taleinatsetnosy: + taleinat
messages: + msg75974
2008-09-12 20:07:45pitrousetmessages: + msg73132
2008-09-11 21:05:17amaury.forgeotdarcsetstatus: open -> closed
resolution: accepted -> fixed
messages: + msg73058
2008-09-11 13:36:44pitrousetmessages: + msg73024
2008-09-11 12:10:13loewissetkeywords: - needs review
messages: + msg73015
2008-09-11 11:05:15amaury.forgeotdarcsetmessages: + msg73014
2008-09-11 07:19:01loewissetassignee: mhammond -> amaury.forgeotdarc
resolution: accepted
messages: + msg73008
nosy: + loewis
2008-09-10 22:32:42benjamin.petersonsetnosy: + alexandre.vassalotti
2008-09-10 22:20:27amaury.forgeotdarcsetmessages: + msg72990
2008-09-06 04:08:40jceasetnosy: + jcea
2008-08-23 12:08:26amaury.forgeotdarcsetfiles: + cpickle-3.0.patch
messages: + msg71805
2008-08-23 09:18:47pitrousetmessages: + msg71804
2008-08-23 01:26:54mhammondsetmessages: + msg71799
2008-08-22 18:13:26pitrousetfiles: + find_recursion_limit.patch
messages: + msg71764
2008-08-22 17:40:00amaury.forgeotdarcsetkeywords: + needs review, patch
files: + cpickle.patch
messages: + msg71760
2008-08-22 16:56:36amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg71758
2008-08-22 09:18:46pitrousetpriority: release blocker
messages: + msg71732
components: + Library (Lib)
2008-08-21 23:22:00mhammondsetkeywords: + 64bit
assignee: mhammond
2008-08-21 23:21:42mhammondcreate