Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PyObject_CallOneArg() #81664

Closed
jdemeyer opened this issue Jul 2, 2019 · 32 comments
Closed

Add PyObject_CallOneArg() #81664

jdemeyer opened this issue Jul 2, 2019 · 32 comments
Labels
3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@jdemeyer
Copy link
Contributor

jdemeyer commented Jul 2, 2019

BPO 37483
Nosy @vstinner, @methane, @skrah, @jdemeyer, @corona10, @miss-islington
PRs
  • bpo-37483: add _PyObject_CallOneArg() function #14558
  • bpo-37493: use _PyObject_CallNoArg in more places #14575
  • bpo-37483: fix refleak #14599
  • bpo-37483: fix reference leak in _PyCodec_Lookup #14600
  • bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 #23062
  • [3.9] bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 (GH-23062) #23074
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-11-01.13:28:37.261>
    created_at = <Date 2019-07-02.12:14:43.792>
    labels = ['interpreter-core', '3.9', 'performance']
    title = 'Add PyObject_CallOneArg()'
    updated_at = <Date 2020-11-01.13:28:37.261>
    user = 'https://github.com/jdemeyer'

    bugs.python.org fields:

    activity = <Date 2020-11-01.13:28:37.261>
    actor = 'corona10'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-11-01.13:28:37.261>
    closer = 'corona10'
    components = ['Interpreter Core']
    creation = <Date 2019-07-02.12:14:43.792>
    creator = 'jdemeyer'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37483
    keywords = ['patch']
    message_count = 32.0
    messages = ['347138', '347172', '347181', '347196', '347197', '347198', '347200', '347202', '347203', '347205', '347206', '347208', '347209', '347210', '347211', '347213', '347214', '347216', '347217', '347220', '347222', '347224', '347231', '347267', '347318', '347330', '347332', '347333', '347335', '370685', '380132', '380133']
    nosy_count = 6.0
    nosy_names = ['vstinner', 'methane', 'skrah', 'jdemeyer', 'corona10', 'miss-islington']
    pr_nums = ['14558', '14575', '14599', '14600', '23062', '23074']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue37483'
    versions = ['Python 3.9']

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 2, 2019

    As discussed in https://mail.python.org/archives/list/capi-sig@python.org/thread/X6HJOEX6RRFLNKAQSQR6HLD7K4QZ4OTZ/ it would be convenient to have a function PyObject_CallOneArg() for making calls with exactly 1 positional argument and no keyword arguments. Such calls are very common. This would be analogous to PyObject_CallNoArgs().

    @jdemeyer jdemeyer added 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Jul 2, 2019
    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    What do you think about macro like this?

      #define _PyObject_CALL_WITH_ARGS(func, ...) \
          _PyObject_Vectorcall(func, (PyObject*[]){NULL, __VA_ARGS__} + 1, \
                  sizeof((PyObject*[]){__VA_ARGS__})/sizeof(PyObject*) | PY_VECTORCALL_ARGUMENTS_OFFSET, \
                  NULL)

    Pros: it can be used for 2 or 3 arguments too.
    Cons: readability...

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Variadic macros are not part of C89, so that would require changing PEP-7.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Cons: readability...

    It's a single relatively short macro. I'm not worried about that. I'm more about compiler support for such macros. I'll make a PR with this idea and see what CI says.

    @vstinner
    Copy link
    Member

    vstinner commented Jul 3, 2019

    Variadic macros are not part of C89, so that would require changing PEP-7.

    PEP-7 uses C99 since Python 3.6:
    https://www.python.org/dev/peps/pep-0007/#c-dialect

    What do you think about macro like this?

    Macros cannot be used in all C extensions. For the PEP-587, I was asked to replace macros with functions.

    I would prefer a PyObject_CallOneArg() function than yet another ugly and dangerous macro. Sometimes, a regular function is better for stack consumption: it doesn't increase the stack consumption in the call site, stack memory is "released" once PyObject_CallOneArg() exit. In the past, I had to disable inlining in some functions to reduce the stack consumption. (I'm not sure if this optimization is still around.)

    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    AFAIK, gcc, clang, and MSVC support it.

    Another cons is, general pitfall of macro:

    _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

    If two or more arguments are not common, I prefer _PyObject_CallOneArg to macro.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

    That's pretty bad and in my opinion a good reason to reject this idea.

    If two or more arguments are not common, I prefer _PyObject_CallOneArg to macro.

    I posted some counts at https://mail.python.org/archives/list/capi-sig@python.org/message/3I76R27YMFSKB5JQIM3E4NA3BECVTZHP/

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

    Actually, it's not a problem: sizeof() is special, it only looks at the type of its argument, it doesn't execute the code.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    PEP-7 uses C99 since Python 3.6: https://www.python.org/dev/peps/pep-0007/#c-dialect

    That's not what the PEP says: "Python versions greater than or equal to 3.6 use C89 with several select C99 features"

    "several select C99 features" is not the same of using C99.

    Plus, I'm also worried about C++. We also support compiling C++ extensions with the same header files, so it must be C++ compatible also.

    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    Macros cannot be used in all C extensions. For the PEP-587, I was asked to replace macros with functions.

    This is just an helper inline function / macro to ease calling _PyObject_Vectorcall.
    Extensions can port this helper inline function / macro to ease calling _PyObject_Vectorcall.

    I don't want to make this non-inline function.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 3, 2019

    The motivation for this PR is "it would be convenient to have this function".

    This is probably true, but generally I would not change 47 files at once.

    Most of the locations are probably not performance sensitive.

    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    This change doesn't make caller code complicated. It makes caller little simpler.

    Choosing performance sensitive code is much hard than replace all occurrences.
    So I'm OK to change all callers except code owner opposed at once.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Exactly. I see no reason to prefer PyObject_CallFunctionObjArgs(func, arg, NULL) over _PyObject_CallOneArg(func, arg)

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 3, 2019

    It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.

    Has there been any performance measurement at all?

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.

    If you're worried about the underscore, I will make a separate PR to add a non-underscored version, similar to PyObject_CallNoArgs()

    @vstinner
    Copy link
    Member

    vstinner commented Jul 3, 2019

    Exactly. I see no reason to prefer PyObject_CallFunctionObjArgs(func, arg, NULL) over _PyObject_CallOneArg(func, arg)

    In this case, maybe the whole idea of _PyObject_CallOneArg() is not worth it?

    Is there any benchmark showing if it's faster or use less stack?

    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.

    I don't get what you mean.

    Do you care about *adding* API with underscore? If so, it doesn't make caller code complex. It can not be a reason to reject changing many files.

    Or do you care about *using* API with underscore? If so, I'm OK to stop changing some callers which are not tightly coupled with Python. (e.g. sqlite)

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Jul 3, 2019

    Or do you care about *using* API with underscore? If so, I'm OK to stop changing some callers which are not tightly coupled with Python.

    I care about this one. Indeed I think underscore functions should be used in strategic places inside the core interpreter and not sprinkled across the whole code base by default.

    But in general I also care about not having too many new functions to avoid churn.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Is there any benchmark showing if it's faster

    Here is one example:

    class D(dict):
        def __missing__(self, key):
            return None
    d = D()

    and now benchmark d[0]

    **before**: Mean +- std dev: 173 ns +- 1 ns
    **after**: Mean +- std dev: 162 ns +- 1 ns

    To be precise, I ran: ./python -m perf timeit --duplicate 200 -s 'class D(dict):' -s ' def __missing__(self, key):' -s ' return None' -s 'd = D()' 'd[0]'

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Stefan: I used an underscore by analogy with PyObject_CallNoArgs()/_PyObject_CallNoArg(), where the first is in the limited API and the second is an inline function in the cpython API.

    But maybe we could revisit that decision.

    @methane
    Copy link
    Member

    methane commented Jul 3, 2019

    I don't want to add many functions to limited/public APIs.
    Note that public APIs defined outside of cpython/ are considered mandatory for all Python implementations supporting Python/C API.

    When portability is more important than performance, there are many APIs already.
    Please keep _PyObject_CallOneArg() as just an easy wrapper of _PyObject_Vectorcall which is CPython specific in this issue.

    Candidate of revert: _csv, _decimal, _sqlite, pyexpat, _testbuffer, _xxtestfuzz
    They doesn't use much private APIs, or they are tests which is useful for other interpreters.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Victor, what's your opinion on adding PyObject_CallOneArg() to the limited API?

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 3, 2019

    Test of stack usage:

    from _testcapi import stack_pointer
    class D(dict):
        def __missing__(self, key):
            sp = stack_pointer()
            print(f"stack usage = {TOP - sp}")
            return None
    d = D()
    TOP = stack_pointer()
    d[0]

    **before**: stack usage = 1296
    **after**: stack usage = 976

    @methane
    Copy link
    Member

    methane commented Jul 4, 2019

    New changeset 196a530 by Inada Naoki (Jeroen Demeyer) in branch 'master':
    bpo-37483: add _PyObject_CallOneArg() function (bpo-14558)
    196a530

    @vstinner
    Copy link
    Member

    vstinner commented Jul 5, 2019

    The following commit introduced a reference leak:

    commit 196a530
    Author: Jeroen Demeyer <J.Demeyer@UGent.be>
    Date: Thu Jul 4 12:31:34 2019 +0200

    bpo-37483: add \_PyObject_CallOneArg() function (bpo-14558)
    

    Example:

    https://buildbot.python.org/all/#/builders/80/builds/642

    test_xml_etree leaked [1, 1, 1] references, sum=3
    test_threading leaked [6, 6, 6] references, sum=18
    test_atexit leaked [6, 6, 6] references, sum=18
    test_compile leaked [1, 1, 1] references, sum=3
    test_zipfile leaked [1, 1, 1] references, sum=3
    test_tokenize leaked [3, 3, 3] references, sum=9
    test_pydoc leaked [1, 1, 1] references, sum=3
    test_codecs leaked [43, 43, 43] references, sum=129
    test_codecs leaked [11, 11, 11] memory blocks, sum=33
    Re-running failed tests in verbose mode
    test__xxsubinterpreters leaked [136, 136, 136] references, sum=408
    test_getargs2 leaked [4, 4, 4] references, sum=12
    test_source_encoding leaked [3, 3, 3] references, sum=9
    test_capi leaked [4, 4, 4] references, sum=12
    test_email leaked [17, 17, 17] references, sum=51
    test_xml_etree leaked [1, 1, 1] references, sum=3
    test_threading leaked [6, 6, 6] references, sum=18
    test_atexit leaked [6, 6, 6] references, sum=18
    test_compile leaked [1, 1, 1] references, sum=3
    test_zipfile leaked [1, 1, 1] references, sum=3
    test_tokenize leaked [3, 3, 3] references, sum=9
    test_pydoc leaked [1, 1, 1] references, sum=3
    test_codecs leaked [43, 43, 43] references, sum=129
    test_codecs leaked [11, 11, 11] memory blocks, sum=33
    (...)

    @vstinner
    Copy link
    Member

    vstinner commented Jul 5, 2019

    INADA-san, Jeroen: hum, you both proposed a similar fix :-) One fix is enough. See my review on PR 14600.

    @jdemeyer
    Copy link
    Contributor Author

    jdemeyer commented Jul 5, 2019

    Jeroen: hum, you both proposed a similar fix :-)

    It seems that I lost the race ;-)

    But seriously: if we both independently came up with the same solution, that's a good sign that the solution is correct.

    @methane
    Copy link
    Member

    methane commented Jul 5, 2019

    New changeset 6e43d07 by Inada Naoki (Jeroen Demeyer) in branch 'master':
    bpo-37483: fix reference leak in _PyCodec_Lookup (GH-14600)
    6e43d07

    @vstinner
    Copy link
    Member

    vstinner commented Jul 5, 2019

    Thanks for the quick fix ;-)

    @vstinner
    Copy link
    Member

    vstinner commented Jun 3, 2020

    Oh, PyObject_CallOneArg() is missing in the What's New in Python 3.9:
    https://docs.python.org/dev/whatsnew/3.9.html#id1

    Can someone please propose a PR to add it? I reopen the issue.

    @vstinner vstinner reopened this Jun 3, 2020
    @corona10
    Copy link
    Member

    corona10 commented Nov 1, 2020

    New changeset 7feb54a by Dong-hee Na in branch 'master':
    bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 (GH-23062)
    7feb54a

    @miss-islington
    Copy link
    Contributor

    New changeset 0312efc by Miss Skeleton (bot) in branch '3.9':
    bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 (GH-23062)
    0312efc

    @corona10 corona10 closed this as completed Nov 1, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants