Author rhettinger
Recipients bob.ippolito, pitrou, rhettinger, swalker
Date 2009-08-05.19:35:42
Message-id <1249500945.78.0.078222141291.issue6594@psf.upfronthosting.co.za>
Content
Are you sure that recursion depth is the issue?  Have you tried the same
number and kind of objects listed serially (unnested)?  This would help
rule out memory allocation issues and would instead confirm that it has
something to do with the C stack.
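
A rough sketch of that comparison, assuming a modern Python 3 and toy
data (the helper names make_nested, make_flat and time_dumps are made up
for illustration, and the two shapes only approximately match in object
count):

```python
import json
import time

def make_nested(depth):
    """Build a list nested `depth` levels deep: [[[...[0]...]]]."""
    obj = [0]
    for _ in range(depth - 1):
        obj = [obj]
    return obj

def make_flat(count):
    """Build `count` one-element lists side by side: [[0], [0], ...]."""
    return [[0] for _ in range(count)]

def time_dumps(obj, repeat=5):
    """Return the best wall-clock time of json.dumps(obj) over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        json.dumps(obj)
        best = min(best, time.perf_counter() - start)
    return best

DEPTH = 200  # assumption: kept well below the default recursion limit
nested_time = time_dumps(make_nested(DEPTH))
flat_time = time_dumps(make_flat(DEPTH))
print(f"nested: {nested_time:.6f}s  flat: {flat_time:.6f}s")
```

If the nested case is much slower than the flat one at similar object
counts, that points toward nesting depth rather than allocation.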

It would be helpful if you uploaded your test data strings and timing
suite.  Are you able to run a C profile so we can tell where the hotspot
is?  Can you run PyYAML over the same data to see if it is similarly
afflicted (YAML is a superset of JSON)?

Also, try timing a repr() serialization of the same data,
x = repr(rootobj).  The repr code also uses recursion and it has to build
a big string in memory.  It has to visit every node, so it will reveal
whether memory cache misses are the culprit.
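
Something along these lines would do, as a sketch only (the rootobj
built here is a hypothetical stand-in for your real data, and the
repeat counts are arbitrary):

```python
import json
import timeit

# Hypothetical stand-in for the reporter's data; substitute the real object.
rootobj = {
    "items": [
        {"id": i, "name": "item%d" % i, "tags": ["a", "b"]}
        for i in range(1000)
    ]
}

# Best of three batches of ten calls each, for both serializers.
repr_time = min(timeit.repeat(lambda: repr(rootobj), number=10, repeat=3))
json_time = min(timeit.repeat(lambda: json.dumps(rootobj), number=10, repeat=3))

print("repr:       %.6fs" % repr_time)
print("json.dumps: %.6fs" % json_time)
```

If repr() slows down on the deeply nested data in the same way, the
problem is probably not specific to the json module.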

Try your timings with GC turned off so that we can rule that out.
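
A minimal sketch of timing with and without the collector, assuming
toy data (timed_dumps is a made-up helper, not part of any library):

```python
import gc
import json
import time

def timed_dumps(obj, collect):
    """Time one json.dumps(obj) call with the cyclic GC on or off."""
    was_enabled = gc.isenabled()
    if collect:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        json.dumps(obj)
        return time.perf_counter() - start
    finally:
        # Restore whatever GC state the caller had.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()

# Hypothetical test data; substitute the real object.
data = [{"id": i, "children": [i, i + 1]} for i in range(10000)]

gc_on_time = timed_dumps(data, collect=True)
gc_off_time = timed_dumps(data, collect=False)
print("GC on: %.6fs  GC off: %.6fs" % (gc_on_time, gc_off_time))
```

Note that the timeit module already disables GC during timing by
default, so hand-rolled timers are the ones that need this.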

Do you have some option to compile with an alternate memory allocator
(such as dlmalloc)?  A crummy memory allocator may be the issue since
serialization entails creating many small strings, then joining and
resizing them.

Also, try serializing to /dev/null so that we can exclude file I/O issues
(buffering and whatnot).
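
For instance, roughly like this (a sketch with made-up data; time_dump
is a hypothetical helper, and os.devnull is used so it also works where
/dev/null does not exist):

```python
import json
import os
import tempfile
import time

def time_dump(obj, path):
    """Time json.dump(obj) into the file at `path`, including the flush."""
    with open(path, "w") as fp:
        start = time.perf_counter()
        json.dump(obj, fp)
        fp.flush()
        return time.perf_counter() - start

# Hypothetical test data; substitute the real object.
data = [{"id": i, "value": str(i)} for i in range(10000)]

# Same code path both times; only the destination differs.
devnull_time = time_dump(data, os.devnull)

fd, tmp_path = tempfile.mkstemp(suffix=".json")
os.close(fd)
try:
    file_time = time_dump(data, tmp_path)
    file_size = os.path.getsize(tmp_path)
finally:
    os.remove(tmp_path)

print("devnull: %.6fs  real file: %.6fs" % (devnull_time, file_time))
```

Both runs go through json.dump so that the comparison isolates the
destination rather than mixing in differences between dump and dumps.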

Sorry for all the requests, but there are many possible culprits and I
think it unlikely that recursion is the cause (much of the code in
Python works recursively -- everything from repr to gc -- so if that
were the problem, everything would run slower, not just json serialization).