This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Poor performance when reading large xmlrpc data
Type: performance Stage: patch review
Components: XML Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Oscar Andres Alvarez Montero, ced, loewis, pokoli, resteve, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2016-01-08 13:36 by pokoli, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
python27.patch pokoli, 2016-01-08 14:32 review
default.patch pokoli, 2016-01-08 16:05 review
default.patch ced, 2016-01-09 10:26 review
client.py ced, 2016-02-28 09:45
server.py ced, 2016-02-28 09:46
Messages (16)
msg257756 - (view) Author: Sergi Almacellas Abellana (pokoli) * Date: 2016-01-08 13:36
By default, the Python xmlrpclib parser reads data in chunks of 1024 bytes [1], which leads to a lot of string concatenation when reading large payloads, and that is very slow in Python.

Increasing the chunk size from 1024 bytes to a higher value improves performance.

On the same machine, we have tested with 20MB of data and we've got the following results:

Chunks of 1024: 1min 48.933491sec
Chunks of 10 * 1024 * 1024: 0.245282sec

We have chosen 10 * 1024 * 1024, as it is the same value used in issue792570.

We have done our tests on python2.7, but the same code exists on the default branch [2] (and on the 3.x branches too [3][4][5][6]), so I believe all versions are affected.

I can work on a patch if you think this change is acceptable.

IMHO it's logical to allow the developer to customize the chunk size instead of using a hard-coded one.

[1] https://hg.python.org/cpython/file/2.7/Lib/xmlrpclib.py#l1479
[2] https://hg.python.org/cpython/file/default/Lib/xmlrpc/client.py#l1310
[3] https://hg.python.org/cpython/file/3.5/Lib/xmlrpc/client.py#l1310
[4] https://hg.python.org/cpython/file/3.4/Lib/xmlrpc/client.py#l1324
[5] https://hg.python.org/cpython/file/3.3/Lib/xmlrpc/client.py#l1316
[6] https://hg.python.org/cpython/file/3.2/Lib/xmlrpc/client.py#l1301
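For reference, the fixed-chunk loop being discussed can be sketched as follows (a minimal stand-in: the feed callback takes the place of the real XML parser, and the helper name is illustrative):

```python
import io

def parse_stream(stream, feed, chunk_size=1024):
    # Read the response in fixed-size chunks and feed each one to the
    # parser, mirroring the read loop in xmlrpclib's parse_response().
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        feed(data)

# Toy usage: collect the fed chunks instead of parsing XML.
chunks = []
parse_stream(io.BytesIO(b"x" * 5000), chunks.append)
print(len(chunks))                   # 5000 bytes -> 5 reads of <= 1024
print(sum(len(c) for c in chunks))   # 5000
```

With a 20 MB payload this loop performs roughly 20,000 reads, which is where the overhead reported above comes from.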
msg257757 - (view) Author: Cédric Krier (ced) * Date: 2016-01-08 13:52
I don't think it is necessary to allow customizing the chunk size. Python should provide a good default value that works on all platforms.
msg257759 - (view) Author: Sergi Almacellas Abellana (pokoli) * Date: 2016-01-08 14:32
I attach a patch that fixes this on the default branch.
msg257760 - (view) Author: Sergi Almacellas Abellana (pokoli) * Date: 2016-01-08 14:32
And also another patch for the 2.7 branch.
msg257764 - (view) Author: Sergi Almacellas Abellana (pokoli) * Date: 2016-01-08 16:05
I added a new patch that addresses the review comments.
msg257774 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2016-01-08 19:14
I believe performance enhancements are, by default, limited to 'default', but this one might be a good candidate for backport.
msg257811 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-01-09 09:38
Reading with a 10 MB limit allocates a 10 MB buffer. It may be better to start with a 1024-byte limit and double it on every iteration.
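A minimal sketch of that doubling strategy (the helper name and the 10 MB cap are illustrative assumptions, not part of the attached patch):

```python
import io

def parse_stream_growing(stream, feed, start=1024, cap=10 * 1024 * 1024):
    # Start with a small read size and double it on each iteration:
    # small responses stay cheap, while large responses need only
    # O(log n) reads instead of O(n / 1024).
    size = start
    while True:
        data = stream.read(size)
        if not data:
            break
        feed(data)
        size = min(size * 2, cap)

# Toy usage: a 100 KB payload takes 7 reads instead of 98.
chunks = []
parse_stream_growing(io.BytesIO(b"x" * 100000), chunks.append)
print([len(c) for c in chunks])
```

The geometric growth means memory is never much larger than the data actually received, which addresses the up-front 10 MB allocation concern.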
msg257815 - (view) Author: Cédric Krier (ced) * Date: 2016-01-09 10:00
Wouldn't it be better to just call stream.read() without any argument?
Because HTTPResponse will call _safe_read with just the length from the header.
msg257816 - (view) Author: Cédric Krier (ced) * Date: 2016-01-09 10:26
Answering my own question: it is better to read in chunks so the parser can also be fed in chunks.
So here is a version of the patch which doubles the chunk size on every loop.
msg260444 - (view) Author: Cédric Krier (ced) * Date: 2016-02-18 09:33
ping
msg260767 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-24 08:24
Could you please provide a microbenchmark that exposes the benefit of this optimization?
msg260768 - (view) Author: Cédric Krier (ced) * Date: 2016-02-24 08:39
Is there an infrastructure already in place for such microbenchmark?
msg260791 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-24 11:13
There is no special infrastructure. You can just provide two files: one script should implement a simple server that produces large output; the other should implement a client that makes a number of requests and measures the time. Of course you can implement this in one file using multiprocessing if you prefer.
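A single-file microbenchmark along those lines could look like this (a sketch, not the client.py/server.py attached below; the payload size and method name are illustrative, and a thread is used instead of multiprocessing):

```python
import threading
import time
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Server returning a large string payload, bound to an ephemeral port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda: "x" * (1024 * 1024), "big")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client: time one request for the large payload.
host, port = server.server_address
proxy = xmlrpc.client.ServerProxy("http://%s:%s/" % (host, port))
start = time.perf_counter()
payload = proxy.big()
elapsed = time.perf_counter() - start
server.shutdown()
print("%d bytes in %.3fs" % (len(payload), elapsed))
```

Looping the timed call and comparing totals before and after the patch would expose any difference in the read loop.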
msg260970 - (view) Author: Cédric Krier (ced) * Date: 2016-02-28 09:46
Here are the client/server scripts.
I don't measure a big performance improvement with them.
I think the improvement measured in msg257756 is linked to the way xmlrpclib is overridden in Tryton.
msg260976 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-02-28 13:36
Thanks Cédric.

There is no improvement for the first request. But with repeated requests there is a small (about 5%) improvement. This is too small.
msg260978 - (view) Author: Cédric Krier (ced) * Date: 2016-02-28 13:52
One advantage I see is when xmlrpclib is overridden to use another marshaller which cannot be fed chunk by chunk. In that case, reducing the number of calls to feed will have a bigger impact.
But I don't know if this is enough for Python.
History
Date User Action Args
2022-04-11 14:58:25 admin set github: 70237
2016-05-18 14:01:38 Oscar Andres Alvarez Montero set nosy: + Oscar Andres Alvarez Montero
2016-02-28 13:52:38 ced set messages: + msg260978
2016-02-28 13:36:43 serhiy.storchaka set messages: + msg260976
2016-02-28 09:46:30 ced set files: + server.py; messages: + msg260970
2016-02-28 09:45:02 ced set files: + client.py
2016-02-24 11:13:19 serhiy.storchaka set messages: + msg260791
2016-02-24 08:39:21 ced set messages: + msg260768
2016-02-24 08:24:07 serhiy.storchaka set messages: + msg260767
2016-02-18 09:33:44 ced set messages: + msg260444
2016-01-22 09:33:07 resteve set nosy: + resteve
2016-01-16 18:46:38 serhiy.storchaka set stage: patch review
2016-01-09 10:26:45 ced set files: + default.patch; messages: + msg257816
2016-01-09 10:00:35 ced set messages: + msg257815
2016-01-09 09:38:04 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg257811
2016-01-08 19:14:55 terry.reedy set nosy: + loewis, terry.reedy; messages: + msg257774; versions: + Python 3.6
2016-01-08 16:05:49 pokoli set files: - default.patch
2016-01-08 16:05:39 pokoli set files: + default.patch; messages: + msg257764
2016-01-08 14:32:54 pokoli set files: + python27.patch; messages: + msg257760
2016-01-08 14:32:35 pokoli set files: + default.patch; keywords: + patch; messages: + msg257759
2016-01-08 13:52:28 ced set type: performance; messages: + msg257757; nosy: + ced
2016-01-08 13:36:38 pokoli create