Title: [WIP] PEP 510: Specialize functions with guards
Type: Stage:
Components: Versions: Python 3.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jim Fasarakis-Hilliard, haypo
Priority: normal Keywords: patch

Created on 2016-01-13 12:52 by haypo, last changed 2017-06-28 01:12 by haypo.

File name Uploaded Description Edit
specialize.patch haypo, 2016-01-13 12:52 review
specialize-2.patch haypo, 2016-01-13 14:12 review
specialize-3.patch haypo, 2016-01-19 12:51 review
specialize-4.patch haypo, 2016-01-23 13:02 review
specialize-5.patch haypo, 2016-01-27 10:54 review
specialize-6.patch haypo, 2016-02-03 00:01 review
specialize-7.patch haypo, 2016-02-05 10:25 review
specialize-8.patch haypo, 2016-02-05 10:29 review
Pull Requests
URL Status Linked Edit
PR 2354 open haypo, 2017-06-23 11:56
Messages (12)
msg258141 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-13 12:52
The attached patch implements PEP 510, "Specialize functions with guards".

Changes on the C API are described in the PEP:

Additions of the patch:

* Add func_specialize() and func_get_specialized() to _testcapi
* Add _testcapi.PyGuard: Python wrapper to the Guard C API
* Add Lib/test/
msg258143 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-13 14:12
Patch version 2 fixes some bugs and adds more tests.

More notes about the patch:

* RuntimeError is raised if guard check() result is greater than 2
* RuntimeError is raised if guard init() result is greater than 1
* (hum, maybe the 'res < 0' check must be replaced with 'res == -1', but I'm not sure that it's worth it.)
* If PyFunction_Specialize() is called with a code object or a Python code, it creates a new code object and copies the code name and first line number in the new code object to ease debugging
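
To illustrate the result checks listed above, here is a minimal pure-Python model of the guard protocol. It is only a sketch, not the C code of the patch: the names add_specialized(), select_code(), ConstGuard and GuardError are hypothetical, and the meaning given to each result value (0 = success, 1 = guard failed, 2 = guard will always fail, negative = error) is my reading of the PEP and should be treated as an assumption.

```python
class GuardError(Exception):
    """Stands in for an exception set by a guard (negative result in C)."""


class ConstGuard:
    """Hypothetical guard returning fixed results; only used for the demo."""

    def __init__(self, init_result=0, check_result=0):
        self.init_result = init_result
        self.check_result = check_result

    def init(self, func):
        return self.init_result

    def check(self, args, kwargs):
        return self.check_result


def add_specialized(specialized, code, guards, func):
    """Register code protected by guards; return True if it was kept."""
    for guard in guards:
        res = guard.init(func)
        if res < 0:
            raise GuardError("guard init() failed")
        if res > 1:
            raise RuntimeError("guard init() result is greater than 1")
        if res == 1:
            # Assumption: 1 means the guard will always fail,
            # so the specialized code is ignored.
            return False
    specialized.append((code, guards))
    return True


def select_code(specialized, default_code, args, kwargs):
    """Return the first specialized code whose guards all succeed."""
    for entry in list(specialized):
        code, guards = entry
        status = 0
        for guard in guards:
            status = guard.check(args, kwargs)
            if status < 0:
                raise GuardError("guard check() failed")
            if status > 2:
                raise RuntimeError("guard check() result is greater than 2")
            if status != 0:
                break
        if status == 0:
            return code
        if status == 2:
            # Assumption: 2 means the guard will always fail,
            # so the specialized code is removed for good.
            specialized.remove(entry)
    return default_code
```

For example, select_code([("fast", [ConstGuard()])], "slow", (), {}) picks "fast", while a guard whose check() returns 1 makes the call fall back to "slow" without removing the specialized code.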

TODO: keywords are currently not supported in PyGuard.__call__().
msg258147 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-13 15:24
A unit test is needed on pickle serialization to ensure that the specialized code and guards are ignored.
msg258591 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-19 12:51
Patch version 3:

* guards are now tracked by the garbage collector. That's a very important requirement to avoid changing the Python semantics at exit, when a guard keeps a strong reference to the global namespace:

* add more tests: call specialize() with invalid types, set __code__
msg258868 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-23 13:02
Patch version 4:

* Keywords are now supported everywhere and tested by unit tests
* Inline specode_check() into PyFunction_GetSpecializedCode()
msg259012 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-27 10:54
Patch version 5: implement PyFunction_RemoveSpecialized() and PyFunction_RemoveAllSpecialized() functions (with unit tests!).

I'm not sure that PyFunction_RemoveSpecialized() must return 0 (success) if the removed specialized code doesn't exist (invalid index).
msg259068 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-27 22:01
FIXME: sys.getsizeof(func) doesn't include specialized code and guards.
msg259447 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-02-03 00:01
Patch version 6: I inlined PyFunction_GetSpecializedCode() into fast_function() of Python/ceval.c. It reduces *a little bit* the overhead of the patch when specialization is not used, and it also avoids exposing this function. I don't think that it's worth exposing PyFunction_GetSpecializedCode(): it was only used in ceval.c. For example, I don't use it for unit tests. I prefer to write tests calling the function and checking the results (see

*Raw* overhead of specialize-6.patch on calling "def f(): pass": 1.7 nanoseconds. I computed the overhead using timeit:

./python -m timeit -s 'def f(): pass' 'f()'

* Original: 71.7 ns
* specialize-6.patch: 73.4 ns (+1.7 ns, +2.4%)
* specialize-5.patch: 74.3 ns (+2.6 ns, +3.6%)
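
The deltas and percentages above can be re-derived from the raw timings. The following is a small sketch: the helper names overhead() and measure_call_ns() are hypothetical, and measure_call_ns() only approximates what the timeit command line does (it takes the best of a few repeats), so absolute numbers will differ per machine.

```python
import timeit


def overhead(baseline_ns, patched_ns):
    """Return (absolute overhead in ns, relative overhead in %), rounded."""
    delta = patched_ns - baseline_ns
    return round(delta, 1), round(100.0 * delta / baseline_ns, 1)


def measure_call_ns(number=100000, repeat=3):
    """Time an empty function call, in nanoseconds per call (best run)."""
    timings = timeit.repeat("f()", setup="def f(): pass",
                            repeat=repeat, number=number)
    return min(timings) / number * 1e9


# Timings quoted in this message, in nanoseconds.
ORIGINAL_NS = 71.7
for name, timing in (("specialize-6.patch", 73.4),
                     ("specialize-5.patch", 74.3)):
    delta, percent = overhead(ORIGINAL_NS, timing)
    print(f"{name}: +{delta} ns, +{percent}%")
```

Running measure_call_ns() on an unpatched and on a patched build, then passing both values to overhead(), gives the same kind of comparison as the list above.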

I will run to see the overhead on a macro benchmark.
msg259450 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-02-03 00:53
Results of the "The Grand Unified Python Benchmark Suite" on specialize-6.patch.

I'm skeptical, I don't understand how my patch can make a benchmark faster :-) The result of regex_v8 is bad :-/

$ python3 -u --rigorous ../default/python.orig ../default/python
Report on Linux smithers 4.3.3-300.fc23.x86_64 #1 SMP Tue Jan 5 23:31:01 UTC 2016 x86_64 x86_64
Total CPU cores: 8

### chameleon_v2 ###
Min: 5.558607 -> 5.831682: 1.05x slower
Avg: 5.613403 -> 5.902949: 1.05x slower
Significant (t=-27.95)
Stddev: 0.06994 -> 0.07640: 1.0924x larger

### django_v3 ###
Min: 0.582356 -> 0.573327: 1.02x faster
Avg: 0.604402 -> 0.582197: 1.04x faster
Significant (t=3.43)
Stddev: 0.05618 -> 0.03215: 1.7474x smaller

### regex_v8 ###
Min: 0.043784 -> 0.049854: 1.14x slower
Avg: 0.044270 -> 0.050521: 1.14x slower
Significant (t=-19.87)
Stddev: 0.00200 -> 0.00243: 1.2105x larger

The following not significant results are hidden, use -v to show them:
2to3, fastpickle, fastunpickle, json_dump_v2, json_load, nbody, tornado_http.
msg259653 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-02-05 10:25
Patch version 7:

* Fix a random crash related to _testcapi.PyGuard: implement tp_traverse on PyFuncGuard and "inherit" tp_traverse on PyGuard
* Fix a typo in Include/funcobject.h
* (rebase the patch)
msg259654 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-02-05 10:29
Oh, I missed comments on the code review. Fixed on patch version 8.
msg296703 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-06-23 11:59
Recently, some people asked me for an update on my FAT Python project. So I rebased this change, which I wrote a year and a half ago, and adapted it to the new code base:

* I renamed to
* I removed the useless PyFunction_Check() macro
* I changed the guard check prototype to use the new FASTCALL calling convention: (PyObject **args, Py_ssize_t nargs, PyObject *kwnames: tuple)
* I patched _PyFunction_FastCallDict() *and* PyFunction_FastCallKeywords() to check guards and call the specialized code if the guards succeed
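
As an illustration of the FASTCALL layout mentioned above (args, nargs, kwnames), here is a pure-Python sketch of how a guard check could recover positional and keyword arguments. The helper name unpack_fastcall() is hypothetical; it models the convention where the args array holds the positional arguments followed by the keyword values, and kwnames is a tuple of keyword names (or nothing when there are no keywords).

```python
def unpack_fastcall(args, nargs, kwnames):
    """Split a flat FASTCALL-style argument array.

    args: sequence of positional arguments followed by keyword values
    nargs: number of positional arguments
    kwnames: tuple of keyword names, or None if there are no keywords
    """
    names = kwnames or ()
    positional = tuple(args[:nargs])
    values = args[nargs:nargs + len(names)]
    if len(values) != len(names):
        raise ValueError("not enough values for the keyword names")
    return positional, dict(zip(names, values))


# f(1, 2, z=3) would be passed as args=[1, 2, 3], nargs=2, kwnames=("z",)
positional, keywords = unpack_fastcall([1, 2, 3], 2, ("z",))
print(positional, keywords)
```

The point of this layout is that no temporary tuple or dict has to be created for the call; a guard can inspect the array directly.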

PEP 510 has not been accepted, so the implementation is still a work-in-progress (WIP) and must not be merged.
Date User Action Args
2017-06-28 01:12:18 haypo set title: PEP 510: Specialize functions with guards -> [WIP] PEP 510: Specialize functions with guards
2017-06-23 11:59:33 haypo set messages: + msg296703
2017-06-23 11:56:24 haypo set pull_requests: + pull_request2400
2017-05-29 15:25:18 Jim Fasarakis-Hilliard set nosy: + Jim Fasarakis-Hilliard
2016-02-05 10:29:42 haypo set files: + specialize-8.patch; messages: + msg259654
2016-02-05 10:25:19 haypo set files: + specialize-7.patch; messages: + msg259653
2016-02-03 00:53:37 haypo set messages: + msg259450
2016-02-03 00:01:50 haypo set files: + specialize-6.patch; messages: + msg259447
2016-01-27 22:01:58 haypo set messages: + msg259068
2016-01-27 10:54:44 haypo set files: + specialize-5.patch; messages: + msg259012
2016-01-23 13:02:15 haypo set files: + specialize-4.patch; messages: + msg258868
2016-01-19 12:51:09 haypo set files: + specialize-3.patch; messages: + msg258591
2016-01-13 15:24:08 haypo set messages: + msg258147
2016-01-13 14:12:39 haypo set files: + specialize-2.patch; messages: + msg258143
2016-01-13 12:52:48 haypo create