
classification
Title: Rewrite plistlib with functional style
Type: performance
Stage: patch review
Components: Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ronaldoussoren, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2017-12-01 19:28 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin.

Pull Requests
URL Status Linked Edit
PR 4671 open serhiy.storchaka, 2017-12-01 21:18
Messages (3)
msg307404 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-01 19:28
The proposed PR rewrites the plistlib module using a functional style. This speeds up loading and saving plist files by roughly 10% or more. Loading plist files in XML format is almost twice as fast (see the `plistlib.loads` timings for FMT_XML below).


$ ./python -m timeit -s 'import plistlib; a = list(range(100))' -- 'plistlib.dumps(a, fmt=plistlib.FMT_XML)'
Unpatched:  1000 loops, best of 5: 228 usec per loop
Patched:    1000 loops, best of 5: 204 usec per loop

$ ./python -m timeit -s 'import plistlib; a = list(range(100))' -- 'plistlib.dumps(a, fmt=plistlib.FMT_BINARY)'
Unpatched:  1000 loops, best of 5: 234 usec per loop
Patched:    1000 loops, best of 5: 203 usec per loop

$ ./python -m timeit -s 'import plistlib; a = list(range(100)); p = plistlib.dumps(a, fmt=plistlib.FMT_XML)' -- 'plistlib.loads(p)'
Unpatched:  1000 loops, best of 5: 308 usec per loop
Patched:    2000 loops, best of 5: 155 usec per loop

$ ./python -m timeit -s 'import plistlib; a = list(range(100)); p = plistlib.dumps(a, fmt=plistlib.FMT_BINARY)' -- 'plistlib.loads(p)'
Unpatched:  2000 loops, best of 5: 116 usec per loop
Patched:    5000 loops, best of 5: 94.6 usec per loop


$ ./python -m timeit -s 'import plistlib; a = {"a%d" % i: i for i in range(100)}' -- 'plistlib.dumps(a, fmt=plistlib.FMT_XML)'
Unpatched:  500 loops, best of 5: 433 usec per loop
Patched:    1000 loops, best of 5: 384 usec per loop

$ ./python -m timeit -s 'import plistlib; a = {"a%d" % i: i for i in range(100)}' -- 'plistlib.dumps(a, fmt=plistlib.FMT_BINARY)'
Unpatched:  500 loops, best of 5: 616 usec per loop
Patched:    500 loops, best of 5: 560 usec per loop

$ ./python -m timeit -s 'import plistlib; a = {"a%d" % i: i for i in range(100)}; p = plistlib.dumps(a, fmt=plistlib.FMT_XML)' -- 'plistlib.loads(p)'
Unpatched:  500 loops, best of 5: 578 usec per loop
Patched:    1000 loops, best of 5: 308 usec per loop

$ ./python -m timeit -s 'import plistlib; a = {"a%d" % i: i for i in range(100)}; p = plistlib.dumps(a, fmt=plistlib.FMT_BINARY)' -- 'plistlib.loads(p)'
Unpatched:  1000 loops, best of 5: 257 usec per loop
Patched:    1000 loops, best of 5: 208 usec per loop
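
The shell commands above can also be driven from one script. The following is a minimal sketch, not part of the PR: the helper names `best_usec` and `bench` and the loop counts are assumptions chosen for illustration, and only `plistlib.dumps`, `plistlib.loads`, and `timeit.repeat` come from the standard library. Run it once against an unpatched build and once against a patched build to compare.

```python
import plistlib
import timeit


def best_usec(func, repeat=5, number=200):
    """Return the best per-call time in microseconds (hypothetical helper)."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1e6


def bench(data, repeat=5, number=200):
    """Time dumps/loads for both plist formats on the given data."""
    results = {}
    for name, fmt in (("xml", plistlib.FMT_XML), ("binary", plistlib.FMT_BINARY)):
        payload = plistlib.dumps(data, fmt=fmt)
        # The lambdas are called before the loop variables change,
        # so late binding is not a problem here.
        results["dumps", name] = best_usec(
            lambda: plistlib.dumps(data, fmt=fmt), repeat, number)
        results["loads", name] = best_usec(
            lambda: plistlib.loads(payload), repeat, number)
    return results


if __name__ == "__main__":
    # Same two data sets as the timeit commands above.
    cases = {
        "list": list(range(100)),
        "dict": {"a%d" % i: i for i in range(100)},
    }
    for case_name, data in cases.items():
        for (op, fmt_name), usec in bench(data).items():
            print("%-4s %-5s %-6s %8.1f usec per call"
                  % (case_name, op, fmt_name, usec))
```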
msg307508 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2017-12-03 11:44
I don't have time to perform a review right now. I'm trying to get PEP 447 through review, and that takes most of my available time at the moment.

I'm not convinced that the speedup of plistlib is relevant for real-world code. Plist files are intended as simple configuration files; they tend to contain little data and should be read or written only sporadically.

That said, some people appear to abuse plistlib to process other files, which are probably NSKeyedArchiver archives, and those can be a lot larger. But I'm opposed to explicitly supporting that use case, because the format of NSKeyedArchiver files is completely undocumented.
msg309058 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-12-26 10:47
I made this PR because a functional style looks more suitable to me for this kind of task. For every serialization or deserialization we have a distinct set of functions with common state. The state can be passed between functions as attributes of a one-time object or as non-local variables. The latter looks syntactically cleaner to me and, as a side effect, is faster.
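
The two ways of sharing state can be illustrated with a toy writer. This is only a sketch under stated assumptions: the names `_ObjectWriter`, `dumps_with_object`, and `dumps_with_closure` are hypothetical, the output format is a simplified stand-in, and none of this is the actual plistlib code or the code in the PR.

```python
class _ObjectWriter:
    """One-time object: shared state lives in instance attributes."""

    def __init__(self):
        self.parts = []
        self.depth = 0

    def write_line(self, text):
        self.parts.append("  " * self.depth + text)

    def write_value(self, value):
        if isinstance(value, list):
            self.write_line("<array>")
            self.depth += 1
            for item in value:
                self.write_value(item)
            self.depth -= 1
            self.write_line("</array>")
        else:
            self.write_line("<integer>%d</integer>" % value)


def dumps_with_object(value):
    """Create the writer object, use it once, and discard it."""
    writer = _ObjectWriter()
    writer.write_value(value)
    return "\n".join(writer.parts)


def dumps_with_closure(value):
    """Functional style: shared state lives in non-local variables."""
    parts = []
    depth = 0

    def write_line(text):
        # Reads `parts` and `depth` from the enclosing scope, no `self`.
        parts.append("  " * depth + text)

    def write_value(value):
        nonlocal depth  # needed because `depth` is rebound below
        if isinstance(value, list):
            write_line("<array>")
            depth += 1
            for item in value:
                write_value(item)
            depth -= 1
            write_line("</array>")
        else:
            write_line("<integer>%d</integer>" % value)

    write_value(value)
    return "\n".join(parts)
```

Both functions produce the same text. The closure version replaces every `self.` attribute lookup with a direct reference to a variable in the enclosing function, which is the syntactic difference described above.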
History
Date User Action Args
2022-04-11 14:58:55 admin set github: 76377
2017-12-26 10:47:13 serhiy.storchaka set messages: + msg309058
2017-12-03 11:44:05 ronaldoussoren set messages: + msg307508
2017-12-01 21:18:51 serhiy.storchaka set keywords: + patch
    stage: patch review
    pull_requests: + pull_request4579
2017-12-01 19:28:36 serhiy.storchaka create