Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert print() to METH_FASTCALL #73482

Closed
methane opened this issue Jan 17, 2017 · 13 comments
Closed

convert print() to METH_FASTCALL #73482

methane opened this issue Jan 17, 2017 · 13 comments
Labels
3.7 (EOL) end of life performance Performance or resource usage

Comments

@methane
Copy link
Member

methane commented Jan 17, 2017

BPO 29296
Nosy @vstinner, @methane, @serhiy-storchaka
Files
  • print-fastcall.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-01-20.07:51:29.279>
    created_at = <Date 2017-01-17.11:34:31.660>
    labels = ['3.7', 'performance']
    title = 'convert print() to METH_FASTCALL'
    updated_at = <Date 2017-01-20.07:51:29.277>
    user = 'https://github.com/methane'

    bugs.python.org fields:

    activity = <Date 2017-01-20.07:51:29.277>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-01-20.07:51:29.279>
    closer = 'vstinner'
    components = []
    creation = <Date 2017-01-17.11:34:31.660>
    creator = 'methane'
    dependencies = []
    files = ['46314']
    hgrepos = []
    issue_num = 29296
    keywords = ['patch']
    message_count = 13.0
    messages = ['285632', '285634', '285637', '285638', '285668', '285672', '285692', '285780', '285781', '285784', '285847', '285886', '285890']
    nosy_count = 4.0
    nosy_names = ['vstinner', 'methane', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'later'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue29296'
    versions = ['Python 3.7']

    @methane
    Copy link
    Member Author

    methane commented Jan 17, 2017

    Median +- std dev: [default] 24.2 ms +- 0.4 ms -> [patched] 19.2 ms +- 0.4 ms: 1.26x faster (-21%)

    @methane methane added 3.7 (EOL) end of life performance Performance or resource usage labels Jan 17, 2017
    @serhiy-storchaka
    Copy link
    Member

    The performance of print() is not critical. It usually involves slow formatting and IO.

    _PyArg_ParseStackAndKeywords() is a part of unstable, fast-evolved API. I would beware of using it in common code.

    @methane
    Copy link
    Member Author

    methane commented Jan 17, 2017

    I see. AC should support this.

    @vstinner
    Copy link
    Member

    Hi,

    Serhiy Storchaka: "The performance of print() is not critical. It usually involves slow formatting and IO."

    Oh, please, I want a print() using METH_FASTCALL! Not for the performance, but just to help me in my development!

    To identify code not using FASTCALL yet, I'm putting a breakpoint on _PyStack_AsTuple() and _PyStack_AsDict(). It's common to stop on print() when testing the performance benchmark suite.

    FYI currnently, I'm adding the following Python code to python-gdb.py:
    ---

    class MyBreakpoint (gdb.Breakpoint):
        def stop(self):
            # frame of the caller
            frame = gdb.selected_frame()
            frame = frame.older()
    
            name = frame.name()
            if name == '_PyCFunction_FastCallKeywords':
                pass
            else:
                # callable->ob_type
                obj = frame.read_var('callable')
                obj = obj.referenced_value()['ob_type']
    
                obj2 = gdb.lookup_global_symbol('PyType_Type').value()
    
                if obj == obj2.address:
                    return False
    
                print("don't skip", obj, obj2.address)
    
            gdb.execute("up")
            return True
    
    
    MyBreakpoint("_PyStack_AsTuple")
    MyBreakpoint("_PyStack_AsDict")

    These macros make gdb 40x slower :-/

    The code is used to skip type_call(): this function is known to not use FASTCALL yet. I plan to add tp_fastnew and tp_fastinit to optimize type_call(), but it's already hard enough to implement tp_fastcall, so I will do it later.

    _PyArg_ParseStackAndKeywords() is a part of unstable, fast-evolved API. I would beware of using it in common code.

    Ah. I have a different opinion on that. Since _PyArg_ParseStackAndKeywords() and print() are part of the Python core, we have free to use private fast-moving APIs.

    By the way, _PyArg_ParseStackAndKeywords() already uses the new efficient _PyArg_Parser object. Do you plan other enhancements?

    INADA Naoki: "I see. AC should support this."

    Right, but IMHO it can be done later.

    I tried to contribute to AC but the code is super complex. I understood that AC is not complex, but Python is complex :-D AC is full of corner cases. It's a very long list of special cases :-)

    @vstinner
    Copy link
    Member

    I reviewed print-fastcall.patch: LGTM, but I proposed a minor change.

    Serhiy Storchaka: "The performance of print() is not critical. It usually involves slow formatting and IO."

    I also had the same understanding of print(), but I just analyzed performances of the bm_telco benchmark, and it seems just like handling function parameters of print() take 20% of the runtime!?

    http://bugs.python.org/issue29259#msg285667

    bm_telco reference (unpatched) => with issue bpo-29259 tp_fastcall-2.patch and print-fastcall.patch:

    20.9 ms +- 0.5 ms => 16.7 ms +- 0.2 ms

    print-fastcall.patch makes bm_telco 20% faster! Just to make sure, I ran again bm_telco only with tp_fastcall-2.patch:

    telco: Median +- std dev: 21.4 ms +- 0.8 ms

    Maybe we should optimize _PyStack_AsDict(), but that's a different topic ;-)

    @serhiy-storchaka
    Copy link
    Member

    Please wait until the support of var-positional arguments be added in Argument Clinic. After that print() could be just converted to Argument Clinic.

    @vstinner
    Copy link
    Member

    Serhiy Storchaka: "Please wait until the support of var-positional arguments be added in Argument Clinic. After that print() could be just converted to Argument Clinic."

    Why can't we start with METH_FASTCALL and convert to AC later? It seems like print-fastcall.patch will be a major performance enhancement (at least for the very specific bm_telco benchmark), once tp_fastcall field will be added.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 19, 2017

    New changeset 0327171f05dd by INADA Naoki in branch 'default':
    Issue bpo-29296: convert print() to METH_FASTCALL
    https://hg.python.org/cpython/rev/0327171f05dd

    @vstinner
    Copy link
    Member

    I pushed print-fastcall.patch since it's simple and has a significant impact on bm_telco benchmark (especially when tp_fastcall slot will be added).

    I added a reminder (msg285779) in bpo-20291 to convert print() once AC will support **kwargs.

    Can we close the issue, or do you prefer to keep it open?

    @serhiy-storchaka
    Copy link
    Member

    Why can't we start with METH_FASTCALL and convert to AC later?

    It increases the maintenance burden. When the fastcall protocol be changed, this would require changing more handwritten code.

    @vstinner
    Copy link
    Member

    Serhiy Storchaka:

    When the fastcall protocol be changed, this would require changing more handwritten code.

    I don't plan to modify the fastcall protocol. I'm not sure that it's
    really possible to modify it anymore. It would probably be simpler to
    add a new protocol. But it may become a little bit annoying to have to
    support so many calling convention :-) So someone should come up with
    a serious rationale to add another one :-)

    I tried to make fastcall as private as possible, but recently I
    noticed that METH_FASTCALL is now public in Python 3.6 API (but
    hopefully excluded from the stable ABI). I know that it's already used
    by Cython since 0.25:
    https://mail.python.org/pipermail/cython-devel/2016-October/004959.html

    About the maintenance burden, I'm ok to handle it :-) My final goal is
    to use Argument Clinic everywhere. While converting existing code to
    AC is a slow process, I consider (from my recent experiences with AC)
    that the new code is easier to maintain! It's a slow process beause AC
    still has limitaitons, but also because it's used an opportunity to
    review (and enhance!) the existing docstrings.

    @methane
    Copy link
    Member Author

    methane commented Jan 20, 2017

    Can we close the issue, or do you prefer to keep it open?

    I don't care.

    @vstinner
    Copy link
    Member

    Since the title of the issue is "convert print() to METH_FASTCALL", I consider that the issue is now done. As I wrote, as soon as AC will support **kwargs, we will be able to print() to use AC.

    Thanks Naoki for the change.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants