Author: bmerry
Recipients: bmerry
Date: 2019-02-20 12:55:43
Message-id: <1550667343.87.0.589252473583.issue36050@roundup.psfhosted.org>
In-reply-to:
Content:
While investigating poor HTTP read performance I discovered that reading all the data from a response with a content-length goes via _safe_read, which in turn reads in chunks of at most MAXAMOUNT (1 MB) before stitching them together with b"".join (the pattern is sketched below). This can really hurt performance for responses larger than MAXAMOUNT, because
(a) the data has to be copied an additional time; and
(b) the join operation doesn't drop the GIL, so this limits multi-threaded scaling.
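
For reference, here is a simplified standalone sketch of that chunk-and-join pattern (my paraphrase of _safe_read, not a verbatim copy; the names safe_read and fp are just illustrative):

    from http.client import IncompleteRead

    MAXAMOUNT = 1048576  # the 1 MB cap http.client uses per read

    def safe_read(fp, amt):
        # Read exactly amt bytes from fp, at most MAXAMOUNT per iteration,
        # collecting the pieces and joining them at the end.
        chunks = []
        while amt > 0:
            chunk = fp.read(min(amt, MAXAMOUNT))
            if not chunk:
                raise IncompleteRead(b"".join(chunks), amt)
            chunks.append(chunk)
            amt -= len(chunk)
        # The join allocates and copies the whole body a second time,
        # and CPython holds the GIL for the duration of that copy.
        return b"".join(chunks)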

I'm struggling to see any advantage in doing this chunking - it's not saving memory either (in fact it wastes memory, since the 1 MB pieces and the joined copy are all alive at the moment the join completes).
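
What I would have expected instead is something closer to the following (a hypothetical sketch only, assuming fp is the buffered socket file object, whose read() already loops internally until it has the requested amount or hits EOF):

    from http.client import IncompleteRead

    def safe_read_simple(fp, amt):
        # One read call, no intermediate chunk list, no b"".join copy.
        data = fp.read(amt)
        if len(data) < amt:
            raise IncompleteRead(data, amt - len(data))
        return data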

To give an idea of the performance impact: changing MAXAMOUNT to a very large value took a multithreaded test of mine from 800 MB/s to 2.5 GB/s (at which point the network, not the copying, was the bottleneck).
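
That test is network-bound and multithreaded; to isolate just the extra copy locally, a rough single-threaded approximation (not my actual test, and the sizes are arbitrary) would be:

    import time

    MAXAMOUNT = 1024 * 1024    # 1 MB, as in http.client
    SIZE = 128 * 1024 * 1024   # assumed 128 MB body, purely for illustration

    payload = bytes(SIZE)

    def chunk_and_join(data):
        # Stand-in for _safe_read: slice into 1 MB pieces (mimicking the
        # per-read chunks), then join them back into one bytes object.
        pieces = [data[i:i + MAXAMOUNT] for i in range(0, len(data), MAXAMOUNT)]
        return b"".join(pieces)

    t0 = time.perf_counter()
    chunk_and_join(payload)
    print("chunk + join:", time.perf_counter() - t0, "s")

    t0 = time.perf_counter()
    bytes(memoryview(payload))  # one straight copy of the body, for comparison
    print("single copy: ", time.perf_counter() - t0, "s")
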
History
Date                 User    Action  Args
2019-02-20 12:55:43  bmerry  set     recipients: + bmerry
2019-02-20 12:55:43  bmerry  set     messageid: <1550667343.87.0.589252473583.issue36050@roundup.psfhosted.org>
2019-02-20 12:55:43  bmerry  link    issue36050 messages
2019-02-20 12:55:43  bmerry  create