This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: pickle: constant propagation in _Unpickler_Read()
Type: performance Stage: commit review
Components: Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: vstinner Nosy List: python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016-05-19 12:02 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
unpickle_read.patch vstinner, 2016-05-19 12:02 review
unpickle_read-2.patch vstinner, 2016-05-20 09:26 review
Messages (9)
msg265852 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-05-19 12:02
According to Linux perf, the unpickle_list benchmark (of the CPython benchmark suite) heavily depends on the performance of load() and _Unpickler_Read() functions. While running benchmarks with PGO+LTO compilation, I noticed a difference around 5% because one build uses constant propation on _Unpickler_Read() (fast), wheras another build doesn't (slow).

Depending on the result of the PGO, depending on the OS and the compiler, you may or may not get this nice optimization. I propose to implement it manually to not depend on the compiler.

Attached short patch implements manually the optimization. It looks to have a big impact on unpickle_list, but no impact (benchmark is not significant) on fastunpickle and pickle_list:
---
$ taskset -c 3 python3 perf.py ../ref_default/rel/python ../default/rel/python -b unpickle_list,fastunpickle,pickle_list --rigorous -v 
(...)
Report on Linux smithers 4.4.9-300.fc23.x86_64 #1 SMP Wed May 4 23:56:27 UTC 2016 x86_64 x86_64
Total CPU cores: 4

### fastunpickle ###
Avg: 0.527359 +/- 0.005932 -> 0.518548 +/- 0.00953: 1.02x faster
Not significant


### pickle_list ###
Avg: 0.269307 +/- 0.017465 -> 0.266015 +/- 0.00423: 1.01x faster
Not significant


### unpickle_list ###
Avg: 0.255805 +/- 0.006942 -> 0.231444 +/- 0.00394: 1.11x faster
Significant (t=21.58)
---

It would be interesting to also evaluate the computed goto optimization for the load() function. (And also try computed goto for the re/_sre module, but that's a different issue.)

I tuned my system and patched perf.py (of the CPython benchmark suite) to get stable benchmark results.
msg265858 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-19 13:38
+1. Similar optimization is used in marshal.

Added comments on Rietveld.
msg265920 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-05-20 09:26
Updated patch 2 to address Serhiy's comments.
msg265922 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-20 09:37
I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM.

Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch.
msg265925 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-05-20 09:44
New changeset f9b85b47f9c8 by Victor Stinner in branch 'default':
Optimize pickle.load() and pickle.loads()
https://hg.python.org/cpython/rev/f9b85b47f9c8
msg265926 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-05-20 09:45
Serhiy Storchaka: "I forgot to note that the comment before _Unpickler_Read() becomes not correct, but you already addressed this in the second patch. unpickle_read-2.patch LGTM."

Cool, pushed.

"Few months ago I tried to optimize reading in pickle, but didn't see significant benefit. After committing your patch I'm going to revive my patch."

Cool too, keep me in touch.
msg265946 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-20 14:07
I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it.
msg265952 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-05-20 19:17
New changeset 3d7b7aa89437 by Victor Stinner in branch 'default':
Issue #27056: Fix _Unpickler_Read() to avoid integer overflow
https://hg.python.org/cpython/rev/3d7b7aa89437
msg265953 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-05-20 19:19
Serhiy Storchaka:
> I think that integer overflow in _Unpickler_Read() is possible. n is read from file and can be arbitrary (up to PY_SSIZE_T_MAX). This likely cause raising an exception later, but integer overflow itself causes undefined behavior, and we should avoid it.

Hum, I understood that it's ok since numbers should be signed, but in
fact I'm not confident that n is always signed. You are right, it's
better to use your code to avoid the integer overflow. I pushed a fix.
History
Date User Action Args
2022-04-11 14:58:31adminsetgithub: 71243
2016-05-20 19:19:20vstinnersetmessages: + msg265953
2016-05-20 19:17:50python-devsetmessages: + msg265952
2016-05-20 14:07:49serhiy.storchakasetmessages: + msg265946
2016-05-20 09:45:57vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg265926
2016-05-20 09:44:03python-devsetnosy: + python-dev
messages: + msg265925
2016-05-20 09:37:27serhiy.storchakasetassignee: vstinner
messages: + msg265922
stage: commit review
2016-05-20 09:26:27vstinnersetfiles: + unpickle_read-2.patch

messages: + msg265920
2016-05-19 13:38:10serhiy.storchakasetmessages: + msg265858
2016-05-19 12:02:29vstinnercreate