Message67401
First and foremost: do not use XML for bulk data transport. It is
HORRIBLY inefficient.
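A rough sketch of where part of that inefficiency comes from, before any parsing happens: XML-RPC carries binary data as base64 text wrapped in markup, so the payload on the wire is larger than the raw bytes. This assumes Python 3, where xmlrpclib is named xmlrpc.client. The helper name xmlrpc_wire_size and the 1 MB blob size are made up for illustration.

```python
import os
import xmlrpc.client  # Python 3 name for the xmlrpclib module discussed here


def xmlrpc_wire_size(blob):
    """Return the size in bytes of an XML-RPC params payload carrying blob.

    Hypothetical helper, not part of the standard library.
    """
    # Binary() marks the value for base64 encoding inside <base64> tags.
    payload = xmlrpc.client.dumps((xmlrpc.client.Binary(blob),))
    return len(payload.encode("utf-8"))


blob = os.urandom(1024 * 1024)  # 1 MB of arbitrary binary data (assumed size)
wire = xmlrpc_wire_size(blob)
print("raw bytes: %d, on the wire: %d, ratio: %.2f"
      % (len(blob), wire, wire / len(blob)))
```

Base64 alone costs 4 bytes of text per 3 bytes of data, and that is before the receiver allocates anything to parse it.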
I've been playing with this on Linux and OS X with various Pythons
(trunk 2.6, release25-maint and 2.5.2):
I was never able to reproduce the malloc failures on my systems, testing
with data sizes up to 100 MB. It likely takes a specific set of
conditions to reproduce exactly that problem, but I do understand how it
could happen.
Anyway, one likely source of such problems was the long-lived,
over-allocated and realloc'ed strings in the socket module's
_fileobject.recv() code. This was "fixed" in release25-maint [to become
2.5.3] (actually it caused a perf regression in other code), and that
fix was itself fixed in trunk to solve the perf regression and will be
backported... Too much history to sum up here. See
http://bugs.python.org/issue2632 and the older issues it links to for
details.
I cannot claim that the above solves this problem because the bulk of
the actual memory used is the XML parser's fault:
Instrumenting the SimpleXMLRPCServer do_POST code I see the following:
The majority of the memory bloat to handle a request (bloat appears to
be 5-10x the size of the Binary data blob in question!) comes from the
XML parser called by xmlrpclib.loads() from SimpleXMLRPCServer's
_marshaled_dispatch() method.
Why? It's XML. On top of that, it is not being parsed and decoded as a
stream.
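A minimal sketch of how this kind of measurement could be repeated today. It is not the instrumentation used above: it assumes Python 3 (xmlrpc.client instead of xmlrpclib), uses tracemalloc rather than patching do_POST, and calls loads() directly instead of going through SimpleXMLRPCServer. The function name peak_memory_of_loads, the method name "upload" and the 256 KB blob size are invented for the example.

```python
import os
import tracemalloc
import xmlrpc.client  # Python 3 name for xmlrpclib


def peak_memory_of_loads(blob):
    """Decode one XML-RPC request carrying blob; return (peak_bytes, decoded).

    Hypothetical helper: peak_bytes is the peak traced allocation while
    xmlrpc.client.loads() parses the whole request in one go.
    """
    # Build the request outside the traced region so only decoding is counted.
    payload = xmlrpc.client.dumps((xmlrpc.client.Binary(blob),),
                                  methodname="upload")  # made-up method name
    tracemalloc.start()
    try:
        params, _method = xmlrpc.client.loads(payload)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak, params[0].data


blob = os.urandom(256 * 1024)  # assumed test size, far below the 100 MB above
peak, decoded = peak_memory_of_loads(blob)
print("blob: %d bytes, peak while decoding: %d bytes (%.1fx)"
      % (len(blob), peak, peak / len(blob)))
```

The ratio printed on a modern interpreter will not necessarily match the 5-10x seen on 2.5/2.6, since tracemalloc only counts allocations made through Python's allocators and the parser code has changed since then.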
Date | User | Action | Args
2008-05-27 00:10:06 | gregory.p.smith | set | recipients: + gregory.p.smith, hwaara
2008-05-27 00:10:06 | gregory.p.smith | set | messageid: <1211847006.25.0.0588132551762.issue2901@psf.upfronthosting.co.za>
2008-05-27 00:10:05 | gregory.p.smith | link | issue2901 messages
2008-05-27 00:10:04 | gregory.p.smith | create |