Classification
Title: Faster marshalling
Type: performance
Stage: resolved
Components: Interpreter Core
Versions: Python 3.5
Process
Status: closed
Resolution: fixed
Dependencies: 23392
Superseder:
Assigned To: serhiy.storchaka
Nosy List: kristjan.jonsson, python-dev, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2015-01-28 22:22 by serhiy.storchaka, last changed 2015-02-11 15:51 by serhiy.storchaka. This issue is now closed.

Files
File name                   Uploaded
marshal_faster_write.patch  serhiy.storchaka, 2015-01-28 22:22
Messages (4)
msg234920 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-01-28 22:22
Currently, writing marshalled data to a buffer is not very efficient. Data is written byte by byte, and the conditions p->fp != NULL and p->ptr != p->end are tested for every byte. The proposed patch makes writing to a buffer faster.
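
To illustrate why the per-byte checks matter, here is a toy Python model of a growable write buffer. It is only an analogy and not the code from the patch: the class BufferWriter and its methods are hypothetical names, and the real change is in the C implementation of marshal. The sketch counts how many capacity checks each strategy performs for the same payload.

```python
class BufferWriter:
    """Toy model of a growable write buffer (hypothetical, not CPython's C code)."""

    def __init__(self, size=16):
        self.buf = bytearray(size)  # pre-allocated storage
        self.ptr = 0                # next write position
        self.checks = 0             # number of capacity checks performed

    def _reserve(self, needed):
        # One capacity check; grow the buffer if `needed` bytes do not fit.
        self.checks += 1
        if len(self.buf) - self.ptr < needed:
            new_size = max(2 * len(self.buf), self.ptr + needed)
            self.buf.extend(bytes(new_size - len(self.buf)))

    def write_bytewise(self, data):
        # Per-byte path: every byte pays for its own capacity check.
        for b in data:
            self._reserve(1)
            self.buf[self.ptr] = b
            self.ptr += 1

    def write_bulk(self, data):
        # Bulk path: one capacity check, then a single block copy.
        self._reserve(len(data))
        self.buf[self.ptr:self.ptr + len(data)] = data
        self.ptr += len(data)

    def getvalue(self):
        return bytes(self.buf[:self.ptr])


payload = b"x" * 1000
slow, fast = BufferWriter(), BufferWriter()
slow.write_bytewise(payload)
fast.write_bulk(payload)
print("bytewise checks:", slow.checks)
print("bulk checks:", fast.checks)
```

Both writers end up with identical contents. The difference is only in how often the "is there room?" test runs, which is the kind of overhead the issue describes.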

Benchmark results:

$ ./python -m timeit -s "import marshal; d = compile(open('Lib/_pydecimal.py').read(), '_pydecimal.py', 'exec')" -- "marshal.dumps(d)"
Unpatched: 100 loops, best of 3: 4.64 msec per loop
Patched: 100 loops, best of 3: 3.39 msec per loop

$ ./python -m timeit -s "import marshal; a = ['%010x' % i for i in range(10**4)]" -- "marshal.dumps(a)"
Unpatched: 1000 loops, best of 3: 1.96 msec per loop
Patched: 1000 loops, best of 3: 1.32 msec per loop

$ ./python -m timeit -s "import marshal; a = ['%0100x' % i for i in range(10**4)]" -- "marshal.dumps(a)"
Unpatched: 100 loops, best of 3: 10.3 msec per loop
Patched: 100 loops, best of 3: 3.39 msec per loop
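
The two list benchmarks above can also be run as a script instead of from the command line. This is a minimal sketch: the helper best_ms and its loop counts are assumptions for illustration, and it only measures whichever interpreter runs it, so comparing patched and unpatched builds still requires running it under each build.

```python
import marshal
import timeit


def best_ms(obj, number=20, repeat=3):
    """Return the best per-call time of marshal.dumps(obj) in milliseconds."""
    timings = timeit.repeat(lambda: marshal.dumps(obj), number=number, repeat=repeat)
    return min(timings) / number * 1000.0


# Same data as in the timeit commands above.
short_strings = ['%010x' % i for i in range(10**4)]
long_strings = ['%0100x' % i for i in range(10**4)]

results = {
    "10-char strings": best_ms(short_strings),
    "100-char strings": best_ms(long_strings),
}
for label, ms in results.items():
    print("%s: %.3f msec per loop" % (label, ms))
```

Absolute numbers will differ from the figures reported here, since they depend on the machine and the Python version.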
msg235347 - Author: Kristján Valur Jónsson (kristjan.jonsson) (Python committer) Date: 2015-02-03 15:37
Looks good to me, although it has been pointed out that marshal _write_ speed is less critical than read speed :)
msg235747 - Author: Roundup Robot (python-dev) Date: 2015-02-11 13:55
New changeset bb05f845e7dc by Serhiy Storchaka in branch 'default':
Issue #23344: marshal.dumps() is now 20-25% faster on average.
https://hg.python.org/cpython/rev/bb05f845e7dc
msg235752 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-02-11 15:51
Thank you for your review, Kristján. Together with issue20416, this nearly doubles dumping speed for typical module data and makes it up to 5x faster for some data.
History
Date                 User              Action  Args
2015-02-11 15:51:27  serhiy.storchaka  set     status: open -> closed
                                               resolution: fixed
                                               messages: + msg235752
                                               stage: patch review -> resolved
2015-02-11 13:55:58  python-dev        set     nosy: + python-dev
                                               messages: + msg235747
2015-02-04 07:34:24  serhiy.storchaka  set     assignee: serhiy.storchaka
                                               dependencies: + Add tests for marshal FILE* API
2015-02-03 15:37:13  kristjan.jonsson  set     messages: + msg235347
2015-01-29 16:43:41  serhiy.storchaka  set     nosy: + kristjan.jonsson
2015-01-28 22:22:06  serhiy.storchaka  create