New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convert print() to METH_FASTCALL #73482
Comments
Median +- std dev: [default] 24.2 ms +- 0.4 ms -> [patched] 19.2 ms +- 0.4 ms: 1.26x faster (-21%) |
The performance of print() is not critical. It usually involves slow formatting and IO. _PyArg_ParseStackAndKeywords() is a part of unstable, fast-evolved API. I would beware of using it in common code. |
I see. AC should support this. |
Hi, Serhiy Storchaka: "The performance of print() is not critical. It usually involves slow formatting and IO." Oh, please, I want a print() using METH_FASTCALL! Not for the performance, but just to help me in my development! To identify code not using FASTCALL yet, I'm putting a breakpoint on _PyStack_AsTuple() and _PyStack_AsDict(). It's common to stop on print() when testing the performance benchmark suite. FYI currnently, I'm adding the following Python code to python-gdb.py: class MyBreakpoint (gdb.Breakpoint):
def stop(self):
# frame of the caller
frame = gdb.selected_frame()
frame = frame.older()
name = frame.name()
if name == '_PyCFunction_FastCallKeywords':
pass
else:
# callable->ob_type
obj = frame.read_var('callable')
obj = obj.referenced_value()['ob_type']
obj2 = gdb.lookup_global_symbol('PyType_Type').value()
if obj == obj2.address:
return False
print("don't skip", obj, obj2.address)
gdb.execute("up")
return True
MyBreakpoint("_PyStack_AsTuple")
MyBreakpoint("_PyStack_AsDict") These macros make gdb 40x slower :-/ The code is used to skip type_call(): this function is known to not use FASTCALL yet. I plan to add tp_fastnew and tp_fastinit to optimize type_call(), but it's already hard enough to implement tp_fastcall, so I will do it later.
Ah. I have a different opinion on that. Since _PyArg_ParseStackAndKeywords() and print() are part of the Python core, we have free to use private fast-moving APIs. By the way, _PyArg_ParseStackAndKeywords() already uses the new efficient _PyArg_Parser object. Do you plan other enhancements? INADA Naoki: "I see. AC should support this." Right, but IMHO it can be done later. I tried to contribute to AC but the code is super complex. I understood that AC is not complex, but Python is complex :-D AC is full of corner cases. It's a very long list of special cases :-) |
I reviewed print-fastcall.patch: LGTM, but I proposed a minor change. Serhiy Storchaka: "The performance of print() is not critical. It usually involves slow formatting and IO." I also had the same understanding of print(), but I just analyzed performances of the bm_telco benchmark, and it seems just like handling function parameters of print() take 20% of the runtime!? http://bugs.python.org/issue29259#msg285667 bm_telco reference (unpatched) => with issue bpo-29259 tp_fastcall-2.patch and print-fastcall.patch: 20.9 ms +- 0.5 ms => 16.7 ms +- 0.2 ms print-fastcall.patch makes bm_telco 20% faster! Just to make sure, I ran again bm_telco only with tp_fastcall-2.patch: telco: Median +- std dev: 21.4 ms +- 0.8 ms Maybe we should optimize _PyStack_AsDict(), but that's a different topic ;-) |
Please wait until the support of var-positional arguments be added in Argument Clinic. After that print() could be just converted to Argument Clinic. |
Serhiy Storchaka: "Please wait until the support of var-positional arguments be added in Argument Clinic. After that print() could be just converted to Argument Clinic." Why can't we start with METH_FASTCALL and convert to AC later? It seems like print-fastcall.patch will be a major performance enhancement (at least for the very specific bm_telco benchmark), once tp_fastcall field will be added. |
New changeset 0327171f05dd by INADA Naoki in branch 'default': |
I pushed print-fastcall.patch since it's simple and has a significant impact on bm_telco benchmark (especially when tp_fastcall slot will be added). I added a reminder (msg285779) in bpo-20291 to convert print() once AC will support **kwargs. Can we close the issue, or do you prefer to keep it open? |
It increases the maintenance burden. When the fastcall protocol be changed, this would require changing more handwritten code. |
Serhiy Storchaka:
I don't plan to modify the fastcall protocol. I'm not sure that it's I tried to make fastcall as private as possible, but recently I About the maintenance burden, I'm ok to handle it :-) My final goal is |
I don't care. |
Since the title of the issue is "convert print() to METH_FASTCALL", I consider that the issue is now done. As I wrote, as soon as AC will support **kwargs, we will be able to print() to use AC. Thanks Naoki for the change. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: