Author swalker
Recipients bob.ippolito, pitrou, rhettinger, swalker
Date 2009-08-06.18:46:47
Message-id <1249584410.51.0.0369251240652.issue6594@psf.upfronthosting.co.za>
In-reply-to
Content
First, I want to apologise for not providing more detail initially.
One thing you should be aware of is that I'm using Python 2.4.4 with
the latest version of simplejson.  My timings and assumptions here
rest on the fact that simplejson was adopted as the 'json' module in
Python, and I filed the bug here because this appeared to be where
bugs for the json module are tracked.

To answer your questions though, no, I can't say with certainty that
recursion depth is the issue.  That's just a theory proposed by a
developer intimately familiar with SPARC architecture, who said register
windows on SPARC tend to cause recursive call structures to execute
poorly.  That theory also seemed to be borne out empirically in the
testing I performed, where any reduction in the depth of the structure
shaved seconds off the write times on the SPARC systems I tested.
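
For anyone who wants to check the depth theory on their own hardware, here is a minimal sketch of the comparison described above.  It assumes a modern Python 3 with the stdlib json module rather than Python 2.4.4 with simplejson, and the helper names (make_nested, make_flat, time_dumps) and the sizes are made up for illustration; they are not taken from my real data.

```python
import json
import timeit


def make_nested(depth, width):
    """Build a dict nested `depth` levels deep with `width` int leaves per level."""
    root = {}
    node = root
    for _ in range(depth):
        for i in range(width):
            node["k%d" % i] = i
        child = {}
        node["child"] = child
        node = child
    return root


def make_flat(count):
    """Build a single-level dict holding `count` int leaves."""
    return dict(("k%d" % i, i) for i in range(count))


def time_dumps(obj, repeat=5):
    """Return the best wall-clock time (seconds) of one json.dumps call."""
    return min(timeit.repeat(lambda: json.dumps(obj), number=1, repeat=repeat))


# Hypothetical sizes: both structures carry 10000 leaf values.
nested = make_nested(depth=200, width=50)
flat = make_flat(200 * 50)

print("nested: %.4fs" % time_dumps(nested))
print("flat:   %.4fs" % time_dumps(flat))
```

The two structures hold the same number of leaf values, so any consistent gap between the two timings would point at nesting rather than data volume.  The numbers will of course differ by platform and interpreter version.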

I'm also willing to try many of the other things you listed, but I will
have to get back to you on that as I have a project coming due soon.

With that said, I can provide sample data soon, and will do so.  I'll
attach the resulting gzip'd JSON file to make it easy to read and dump.
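
Reading and re-dumping a gzip'd JSON file could look roughly like the sketch below.  It assumes Python 3 and the stdlib gzip and json modules; the file name sample.json.gz and the tiny sample dict are placeholders, not the actual attachment.

```python
import gzip
import json
import os
import tempfile


def load_gzipped_json(path):
    """Decompress and parse a gzip'd JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def dump_gzipped_json(obj, path):
    """Serialise obj as JSON and write it gzip-compressed to path."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(obj, fh)


# Round trip with placeholder data in a temporary directory.
sample = {"catalog": {"pkg": {"version": "1.0", "deps": ["a", "b"]}}}
path = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
dump_gzipped_json(sample, path)
restored = load_gzipped_json(path)
```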

I would also note that:

* I have tried serialising using cStringIO, which made no significant
difference in performance.

* I have tried different memory allocators, which only seemed to make
things slower, or made little difference.

* Writing roughly the same amount of data (in terms of megabytes), but
in a flatter structure, also increased the performance of the serializer.

* In my testing, it seemed dict serialisation in particular was
problematic from a performance standpoint.

* If I recall correctly from the profile I did, iterencode_dict was
where most of the time was eaten, but I can redo the profile for a more
accurate analysis.
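
For reference, redoing the profile could be sketched as follows.  This assumes Python 3 with the stdlib cProfile, pstats and json modules (cProfile is not available on Python 2.4, so this is not what I originally ran), and the sample data is invented.  It iterates JSONEncoder.iterencode directly, which in the stdlib uses the pure-Python encoder, so the per-function breakdown stays visible instead of disappearing into the C accelerator.

```python
import cProfile
import io
import json
import pstats


def profile_encode(obj, top=10):
    """Profile pure-Python JSON encoding of obj and return the report text."""
    encoder = json.JSONEncoder()
    profiler = cProfile.Profile()
    profiler.enable()
    # Consume the generator chunk by chunk; the output itself is discarded.
    for _chunk in encoder.iterencode(obj):
        pass
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


# Invented sample: a dict of 1000 small dicts.
sample = {
    "packages": dict(
        ("pkg%d" % i, {"version": i, "deps": list(range(5))})
        for i in range(1000)
    )
}
report = profile_encode(sample)
print(report)
```

Sorting by cumulative time makes it easy to see how much of the total is spent under the dict-encoding helper compared with the list and scalar paths.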

As for Antoine's comments:

I'd like to believe Python is very useful software, and that for any
platform it runs on, the respective market capitalization of that
platform is irrelevant; better-performing Python is always good.