
Author vkuznet
Recipients bob.ippolito, pitrou, rhettinger, swalker, vkuznet
Date 2009-11-20.02:22:58
Message-id <1258683780.63.0.931565526551.issue6594@psf.upfronthosting.co.za>
Content
Hi,
I just found this bug and would like to add my experience with the
performance of large JSON docs. I have a few JSON docs, each about
180MB in size, which I read from data services. I use python2.6 on a
64-bit Linux node with 16GB of RAM and an 8-core Intel Xeon CPU at
2.33GHz. I used both the json and cjson modules to parse my documents.
My observation is that the amount of RAM used to parse such a doc is
about 2GB, which is way too much. The total time spent is about 30
seconds (using cjson). The content of my docs is very mixed: lists,
strings, nested dicts. I can provide them if required, but they are
about 200MB :)
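
For reference, here is roughly how I take these measurements (a
minimal sketch; the file name "big.json" and the use of ru_maxrss,
which the kernel reports in kilobytes on Linux, are my assumptions):

    # Parse a large JSON file, report wall-clock time and peak RSS.
    import json
    import resource
    import time

    def parse_and_measure(path):
        start = time.time()
        with open(path) as f:
            doc = json.load(f)   # the whole document is built in memory
        elapsed = time.time() - start
        peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print "parsed %s in %.1fs, peak RSS %.1f MB" % (
            path, elapsed, peak_kb / 1024.0)
        return doc

    parse_and_measure("big.json")  # hypothetical ~180MB document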

For comparison, I fetched the same data in XML; using
cElementTree.iterparse I stay within 300MB of RAM per doc, which is
really reasonable to me.
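
The iterparse pattern I use is essentially the usual clear-as-you-go
one: drop each subtree as soon as it has been processed, so the tree
never grows (a sketch; the file name and the "record" tag are
placeholders for my actual data):

    # Stream the XML and clear processed subtrees to keep RAM flat.
    import xml.etree.cElementTree as ET

    def iter_records(path, tag):
        context = iter(ET.iterparse(path, events=("start", "end")))
        event, root = context.next()    # grab the root element first
        for event, elem in context:
            if event == "end" and elem.tag == tag:
                yield elem              # caller reads fields here
                root.clear()            # free everything parsed so far

    for rec in iter_records("data.xml", "record"):
        pass  # process one record at a time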

I can provide some benchmarks and run such tests if required.