New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specialize FASTCALL for functions with positional-only parameters #73650
Comments
Proposed patch renames METH_FASTCALL to METH_FASTCALL|METH_KEYWORDS and makes bare METH_FASTCALL be used for functions with positional-only parameters. This eliminates small cost that these functions pay for handling empty keywords: calling _PyStack_UnpackDict() and _PyArg_NoStackKeywords(), passing kwnames. This also can slightly reduce stack consumption. |
While I tried to keep everything related to FASTCALL private, it seems like Cython uses some FASTCALL features. I don't know which ones exactly. Well, if only one project in the world uses FASTCALL, we can help them to support such backward incompatible change ;-)
My idea when I designed FASTCALL was to move code to parse arguments in the function body rather than in _PyCFunction_FastCallKeywords(), and to have a single calling function METH_FASTCALL, rather than two (METH_FASTCALL and METH_FASTCALL|METH_KEYWORDS). The long term plan is also to support passing arguments by keyword in more functions. IMHO many functions don't accept keywords for technical reasons, but once we converted a module, function or type to Argument Clinic, it becomes trivial to accept keywords. If most functions accept keywords, I'm not sure that having a special case for positional-only is still worth it. But this plan was before I had discussions on supporting keywords in unicode methods. In fact, it's deliberate to not accept keywords in many functions or methods. Well, when I see your patch, I see that it removes a lot of code. So it's likely to be a good idea :-)
You mean the removal of the "PyObject *kwnames" for METH_FASTCALL (positional arguments only)? Do you have an idea of the stack usage? Try maybe testcapi_stacksize.patch of the issue bpo-28870? It would help to take a decision on this change. |
We can avoid breaking backward compatibility and introduce new call method METH_FASTCALL_NO_KEYWORDS. But combining existing flags looks better to me. FASTCALL is not a part of stable ABI. I still didn't do any benchmarking or stack usage measurements. |
I'm running benchmarks on the speed-python server. |
Adding Stefan Behnel, perhaps Cython doesn't need backwards compatibility. |
I measured the stack consumption, it's not better. But I created the issue bpo-29465 "Add _PyObject_FastCall() to reduce stack consumption" which would allow to reduce the stack consumption with this patch. |
If we decide that having two FASTCALL calling convention, I prefer what you proposed: METH_FASTCALL (pos only) and METH_FASTCALL|METH_KEYWORDS (pos+kw). As you wrote, I like reusing the existing METH_KEYWORDS flag, it reduces the surprises if someone ports existing code using METH_VARARGS and METH_VARARGS|METH_KEYWORDS. |
Hum, benchmark results don't seem good. There is probably a performance bug somewhere. I should investigate further to analyze these results. M aybe combined with the issue bpo-29465, results will be better. haypo@speed-python$ python3 -m perf compare_to ~/benchmarks/2017-02-06_07-15-default-e06af4027546.json fastcall-no-keywords_ref_e06af4027546.json -G --min-speed=5 Slower (19):
Faster (2):
Benchmark hidden because not significant (43): (...) |
Thanks for asking. Cython doesn't use METH_FASTCALL yet, so this doesn't introduce any problems. Generally speaking, if Cython generated user code stops working with a new CPython version, we expect people to regenerate their code with the newest Cython version to fix it, so this kind of internal incompatibility is usually reparable by adapting Cython appropriately. |
There was a bug in previous patch. The signature of generated by Argument Clinic functions was not changed. Updated patch fixes this bug and also fixes the use of fast call in deque methods. |
Ok, so I looked again at your change: fastcall-no-keywords-2.patch LGTM, Since your patch is huge, I expect that it will be a pain to rebase it. I According to the number of modified functions and files, not accepting keyword At least my assumption that almost no function will accept only positional I checked quickly fastcall-no-keywords-2.patch stack usage and performance. For stack consumption and performances, I got good results on my issue bpo-29465 @Stefan Behnel: Thank you for you reply. So the backward compatibility was a |
The major part of the patch is generated by Argument Clinic. It is not a pain to rebase it. I prefer to delay pushing the patch until we prove its usefulness, because the cost of this change is not zero. Is it a stopper for bpo-29465? |
Serhiy Storchaka: "I prefer to delay pushing the patch until we prove its usefulness, because the cost of this change is not zero. Is it a stopper for bpo-29465?" Ok. No, it's not a blocker for my issue bpo-29465. About usefulness, I'm curious of the performance impact on this patch on top of the issue bpo-29465. I tried to run a benchmark, but my tooling only works well with a single patch, not with two patches, and one based on the other. Moreover, our patches are in conflict. So it seems like the issue bpo-29465 makes Python faster and uses less stack memory, I suggest to first focus on that one, and then rebase your patch on top of that. What do you think? |
+1 |
I looked up this change again and was surprised that it still wasn't applied. It feels to me that it makes sense already for reasons of consistency. Any time frame for changing it? I'd like to use METH_FASTCALL in Cython in a future-proof way. |
+1 for all the reasons listed. These are very reasonable specializations. The empty keyword checks are really irritating for fine-grained functions. |
Also +1. Can we consider the API frozen after this issue or do we have to wait for bpo-29465 (or others)? |
Actually these arguments are pretty weak. The sole argument of consistency is weak itself. _PyArg_NoStackKeywords() is pretty cheap since implementing it as a macro. This change just moves the check from the implementation of functions to the calling place. Virtually all affected functions (320 vs 7) are generated by Argument Clinic, so this doesn't simplify the maintenance much. The only runtime effect is possible saving a register or few bytes on the stack and few CPU tacts for passing the kwnames argument (always NULL). But this depends on the compiler. I don't consider this API frozen still. I have other idea for passing keyword arguments and want to experiment with it. |
Victor, could you please rerun benchmarks and check whether this change cause performance degradation? |
For third party projects who want to use FASTCALL the functions are not generated by AC. I didn't understand the reasoning why the consistency argument is weak. |
Here are benchmark results. Sorry, but I'm not really convinced that this specialization is worth it. The change adds yet another calling convention where we already have METH_NOARG, METH_VARARGS, METH_O, METH_NOARG | METH_KEYWORDS, METH_FASTCALL... I'm ok to add a new calling convention but only if it's faster on more benchmarks or if it uses much less memory. It doesn't seem to be the case with the current change. Differences of at least 5%: haypo@speed-python$ python3 -m perf compare_to /home/haypo/json/2017-06-09_08-18-master-ef8320cf6f09.json.gz ~/json/patch/2017-06-09_08-18-master-ef8320cf6f09-patch-1955.json.gz -G --min-speed=5 --table +-------------------------+--------------------------------------+-------------------------------------------------+ Differences of at least 2%: haypo@speed-python$ python3 -m perf compare_to /home/haypo/json/2017-06-09_08-18-master-ef8320cf6f09.json.gz ~/json/patch/2017-06-09_08-18-master-ef8320cf6f09-patch-1955.json.gz -G --min-speed=2 --table +-------------------------+--------------------------------------+-------------------------------------------------+ |
I do not see this as a matter of performance but as a matter of usability. Basically, CPython could do just fine with just a single catch-all calling convention that packs all pos/kw arguments into C arguments and passes them over, leaving it entirely to the callee to handle them. Why does it not do that? Because it's cumbersome for the callee to analyse what kind of arguments it received and how to handle them. A surprisingly large number of functions take only a single argument, that's why METH_O exists. Many functions take no keyword arguments, that's why VARARGS and KEYWORDS are separate options. The same should apply to FASTCALL. Also with that, many implementors will not want to care about keyword arguments and would thus appreciate it if they didn't have to. Forcing them to test for keyword arguments and raising the correct error for it (with the correct and consistent message) seems an entirely unnecessary imposition. |
Ah, I took the habit of using Argument Clinic, so I don't have to both to thing anymore :-) But yes, writing manually the PyArg_XXX() code is boring and error-prone, I agree. |
Can the PR be applied then? It looks good to me. |
Go for it Serhiy. Even if it isn't faster today, we might find more optimizations later, at least in term of C stack consumption. Moreover, I understand perfectly the cost of having to parse arguments. Checking for keywords in the caller seems reasonable. This change breaks the backward compatibility with Python 3.6, right? I mean, code using METH_FASTCALL. I guess that only Cython extensions use it in the wild, and I expect that Cython extensions don't use the stable ABI but are recompiled for each 3.x release, so it should be fine in practice. |
Oh, it seems like this change should help to use FASTCALL in the _decimal module: issue bpo-29301. Stefan doesn't want to use Argument Clinic. |
Josay noticed that _PyArg_NoStackKeywords() is not used anymore except for one place (_hashopenssl.c). Seems this is my oversign. Generated constructors in _hashopenssl.c use the METH_FASTCALL | METH_KEYWORDS calling method, but check that no keyword arguments are passed. They can now just use the METH_FASTCALL calling method. And _PyArg_NoStackKeywords() can be excluded from public header. |
Let's remove this function which became useless. |
Removed _PyArg_NoStackKeywords() (7e60192). |
Hi, I observed an error while trying to install numpy after 6969eaf. gcc: numpy/random/mtrand/mtrand.c Is this a bug in Python or does it need to be fixed in numpy? |
I think you need to update Cython to a version that supports new FASTCALL call convention in 3.7. If the support of FASTCALL in Cython is optional try to disable it. |
Thanks, I'm not sure what that means exactly but I added the note to numpy/numpy#9391. Perhaps a note in the Python release notes is warranted? |
For future reference, this change is supported by Cython 0.26 (which is currently close to release). |
Great! Are all uses of internal CPython details optional? I would disable them by default for alpha versions of CPython. |
Well, what classifies as a "CPython detail" sometimes just becomes clear when other implementations don't have it. ;-) But yes, the C code that Cython generates selects alternative implementations based on some more or less generic C defines, e.g. CYTHON_COMPILING_IN_CPYTHON or CYTHON_USE_PYLONG_INTERNALS or CYTHON_ASSUME_SAFE_MACROS. We enable them based on the Python implementation and even users can disable them at need. Well, and then there are obviously tons of PY_VERSION_HEX specific special cases, such as this one.
I don't see why. We track the CPython development via travis-CI and alpha versions make us aware of changes that we need to incorporate (or sometimes fight against ;-) ). These internals very rarely change across CPython releases (e.g. the PyLong type didn't really change since 3.0/2.7.0), and this one is really an exception. Excluding alpha versions would probably reduce our response time to these changes. |
When I worked on FASTCALL, I wasn't sure that the whole project was worth Stefan, Serhiy: would it be a good idea to make _PyObject_FastCall() and |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: