Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derby #18: Convert 31 sites to Argument Clinic across 23 files #64385

Closed
larryhastings opened this issue Jan 8, 2014 · 55 comments
Closed

Derby #18: Convert 31 sites to Argument Clinic across 23 files #64385

larryhastings opened this issue Jan 8, 2014 · 55 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir topic-argument-clinic type-feature A feature request or enhancement

Comments

@larryhastings
Copy link
Contributor

BPO 20186
Nosy @birkenfeld, @rhettinger, @jaraco, @taleinat, @larryhastings, @vadmium, @serhiy-storchaka
PRs
  • bpo-20186: Convert tuple object implementation to Argument Clinic. #614
  • Dependencies
  • bpo-34797: Convert heapq to the argument clinic
  • Files
  • heapq_clinic.patch
  • modules_issue20186.patch
  • issue20186.mathmodule.patch: updated patch for mathmodule
  • issue20186.enumobject.patch
  • issue20186._operator.patch: AC conversion of nearly all functions in the _operator module
  • issue20186._operator.v2.patch: slightly revised patch after Serhiy's review
  • issue20186._operator.v3.patch: slightly revised (again) patch after Serhiy's additional review
  • issue20186._operator.v4.patch: slightly revised (yet again) patch after Serhiy's latest review
  • issue20186.mathmodule.v2.patch: revised patch for mathmodule.c
  • csv_clinic.patch
  • lsprof_clinic.patch
  • tracemalloc_clinic.patch
  • symtable_clinic.patch
  • enumobject-docstrings.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-01-08.00:23:06.113>
    labels = ['extension-modules', 'type-feature', '3.7', 'expert-argument-clinic']
    title = 'Derby #18: Convert 31 sites to Argument Clinic across 23 files'
    updated_at = <Date 2021-03-06.02:14:31.336>
    user = 'https://github.com/larryhastings'

    bugs.python.org fields:

    activity = <Date 2021-03-06.02:14:31.336>
    actor = 'jaraco'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules', 'Argument Clinic']
    creation = <Date 2014-01-08.00:23:06.113>
    creator = 'larry'
    dependencies = ['34797']
    files = ['33425', '33442', '39602', '39603', '39653', '39654', '39655', '39656', '39696', '46357', '46358', '46359', '46360', '46362']
    hgrepos = []
    issue_num = 20186
    keywords = ['patch']
    message_count = 55.0
    messages = ['207644', '207914', '207917', '207923', '207931', '207933', '207934', '207935', '207936', '207937', '207938', '208009', '208011', '218717', '218718', '224767', '244731', '244736', '244737', '244971', '244972', '244973', '244974', '244975', '244977', '244978', '244987', '244989', '245276', '262010', '262022', '262036', '285798', '285799', '285802', '285803', '285806', '285807', '285808', '285833', '285915', '285916', '285917', '285919', '285920', '286939', '286947', '286950', '286952', '286955', '287062', '287063', '289440', '290163', '388190']
    nosy_count = 8.0
    nosy_names = ['georg.brandl', 'rhettinger', 'jaraco', 'taleinat', 'larry', 'python-dev', 'martin.panter', 'serhiy.storchaka']
    pr_nums = ['614']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue20186'
    versions = ['Python 3.7']

    @larryhastings
    Copy link
    Contributor Author

    This issue is part of the Great Argument Clinic Conversion Derby,
    where we're trying to convert as much of Python 3.4 to use
    Argument Clinic as we can before Release Candidate 1 on January 19.

    This issue asks you to change the following bundle of files:
    Objects/tupleobject.c: 2 sites
    Objects/memoryobject.c: 2 sites
    Objects/descrobject.c: 2 sites
    Objects/complexobject.c: 2 sites
    Modules/_operator.c: 2 sites
    Modules/_opcode.c: 2 sites
    Modules/_lsprof.c: 2 sites
    Modules/_heapqmodule.c: 2 sites
    Objects/weakrefobject.c: 1 sites
    Objects/structseq.c: 1 sites
    Objects/rangeobject.c: 1 sites
    Objects/object.c: 1 sites
    Objects/moduleobject.c: 1 sites
    Objects/funcobject.c: 1 sites
    Objects/fileobject.c: 1 sites
    Objects/enumobject.c: 1 sites
    Objects/codeobject.c: 1 sites
    Objects/boolobject.c: 1 sites
    Modules/symtablemodule.c: 1 sites
    Modules/mathmodule.c: 1 sites
    Modules/_tracemalloc.c: 1 sites
    Modules/_io/_iomodule.c: 1 sites
    Modules/_csv.c: 1 sites

    Talk to me (larry) if you only want to attack part of a bundle.

    For instructions on how to convert a function to work with Argument
    Clinic, read the "howto":
    http://docs.python.org/dev/howto/clinic.html

    @larryhastings larryhastings added extension-modules C modules in the Modules dir type-feature A feature request or enhancement labels Jan 8, 2014
    @birkenfeld
    Copy link
    Member

    Looking at _csv.c, I see a few functions using PyArg_UnpackTuple. They should be converted too, no?

    @birkenfeld
    Copy link
    Member

    Attached part 1 of mathmodule (17 functions).

    I'm looking forward to a suggestion for handling the rest (see FUNC1/1A/2 macros :)

    @larryhastings
    Copy link
    Contributor Author

    Wow. I never knew about PyArg_UnpackTuple. You're right, those should be converted too. Hooray, more entry points to convert.

    I'll write something up for the howto about UnpackTuple.

    I just did a quick check, and there are 96 entry points (by my count) that use PyArg_UnpackTuple(). Shall I create Derby issues #19 and #20, or do you have a better idea?

    1. For FUNC1 / 1A / 2 macros: right now you'd have to just copy and paste over and over. There might be something you could do with a [python] block where you automatedly reuse the existing sigantures. I was thinking about having Clinic support it directly, maybe with the syntax:

    /*[clinic input]
    func_name = existing_func_name

    docstring goes here
    [...]*/

    You'd skip the parameters and the return annotation. You could only reuse functions from the current file. Would that be a big boon to you?

    @birkenfeld
    Copy link
    Member

    Wow. I never knew about PyArg_UnpackTuple. You're right, those
    should be converted too. Hooray, more entry points to convert.
    I'll write something up for the howto about UnpackTuple.

    One thing to note is that (at least in math) many instances of UnpackTuple could have been replaced by ParseTuple. See for example math_hypot: it uses UnpackTuple to get two objects, and then immediately calls PyFloat_AsDouble on them. I've converted these using 'd' and not 'O' specifiers.

    I just did a quick check, and there are 96 entry points (by my count)
    that use PyArg_UnpackTuple(). Shall I create Derby issues #19 and
    #20, or do you have a better idea?

    Probably better to add them to the issues that cover their modules, otherwise people might get confused.

    1. For FUNC1 / 1A / 2 macros: right now you'd have to just copy and
      paste over and over. There might be something you could do with a
      [python] block where you automatedly reuse the existing sigantures.
      I was thinking about having Clinic support it directly, maybe with
      the syntax:

    /*[clinic input]
    func_name = existing_func_name

    docstring goes here
    [...]*/

    You'd skip the parameters and the return annotation. You could only
    reuse functions from the current file. Would that be a big boon to you?

    That sounds good.

    On the other hand, if clinic expanded cpp macros we could... *:-)

    @birkenfeld
    Copy link
    Member

    OK, here's a patch for _csv. Two problems here:

    First problem is the __new__ method of the Dialect class:

    • it has no docstring and no methoddef entry
    • it is a class method, but the first arg is conventionally called "type"

    I tried to hack something into clinic with a new decorator, but it may not be how you want it to look, take care.

    Second problem is the functions reader(), writer(), register_dialect(): they parse their *args but pass their **kwargs through to another class.
    Is there anything like a "**kwds" argument specifier?

    BTW, for a module like _csv that is exported through a Python module named csv, should we use the "real" or the "nice" module name?

    @birkenfeld
    Copy link
    Member

    Tried to tackle symtable -- it uses an O& converter. The clinic howto says

    'O&' object(converter='name_of_c_function')

    but

    Traceback (most recent call last):
      File "Tools/clinic/clinic.py", line 2817, in <module>
        sys.exit(main(sys.argv[1:]))
      File "Tools/clinic/clinic.py", line 2813, in main
        parse_file(filename, output=ns.output, verify=not ns.force)
      File "Tools/clinic/clinic.py", line 1116, in parse_file
        cooked = clinic.parse(raw)
      File "Tools/clinic/clinic.py", line 1066, in parse
        parser.parse(block)
      File "Tools/clinic/clinic.py", line 2109, in parse
        self.state(line)
      File "Tools/clinic/clinic.py", line 2378, in state_parameter
        converter = dict[name](parameter_name, self.function, value, **kwargs)
      File "Tools/clinic/clinic.py", line 1403, in __init__
        self.converter_init(**kwargs)
    TypeError: converter_init() got an unexpected keyword argument 'converter'

    @birkenfeld
    Copy link
    Member

    _tracemalloc converted.

    Its existing docstrings did use the

    func(arg: argtype) -> rettype

    convention. Is there a way in clinic to retain that?

    @birkenfeld
    Copy link
    Member

    Here's _iomodule. _io.open has a whopping 100-line docstring, which is ... unfortunate ... to have duplicated in the file :)

    @birkenfeld
    Copy link
    Member

    And _heapq. No problems there, except that it also used "->" return annotations in the docstring.

    @birkenfeld
    Copy link
    Member

    And lsprof.

    @birkenfeld
    Copy link
    Member

    OK, new patches coming in.

    @birkenfeld
    Copy link
    Member

    Actually I put all I have in one. Rietveld doesn't care.

    The mathmodule still awaits some kind of solution for the macro atrocities.

    Objects will be attacked next.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 17, 2014

    New changeset 060cfd049d14 by Stefan Krah in branch 'default':
    Issue bpo-20186: memoryobject.c: add function signatures.
    http://hg.python.org/cpython/rev/060cfd049d14

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented May 17, 2014

    memoryobject.c is converted with a minimal patch (I would like to keep
    100% code coverage for the file).

    @larryhastings
    Copy link
    Contributor Author

    All the Derby patches should only go into trunk at this point.

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 3, 2015

    Attached is an updated patch for Modules/mathmodule.c.

    This is based on Georg's patch, updated to apply to current 3.5, with several improvements:

    • replaced legacy converters
    • converted math.ceil() and math.floor() functions
    • converted the new math.gcd() and math.isclose() functions
    • AC generated code in separate file: Modules/clinic/mathmodule.c.h
    • this patch doesn't change any internal variable names in the C code

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 3, 2015

    Should Argument Clinic conversion patches still be against the 'default' branch, and not 3.5, even though they don't include any functionality changes?

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 3, 2015

    Attached is an AC conversion patch for Objects/enumobject.c.

    Note that this file contains the implementations of the 'enumerate' and 'reversed' classes, but *not* the 'Enum' class.

    This is based on the 3.5 branch.

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 7, 2015

    Attached is a patch for all of _operator except for itemgetter, attrgetter and methodcaller. The entire test suite passes after applying this patch.

    Using AC has allowed the removal of all of the cryptic "spam*()" macros in the code, making it much more straightforward. In terms of readability, IMO this is a great improvement.

    I skipped itemgetter, attrgetter and methodcaller since they all support passing a variable number of arguments and treating those as a sequence of values.

    @serhiy-storchaka
    Copy link
    Member

    Is there any performance difference?

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 7, 2015

    I tried running the pystone benchmark, but the results were inconclusive. I saw very large differences (up to 20%) between runs at different times, with no clear differences with or without the patch.

    However, a quick search shows that the operator module is almost completely unused by the rest of the code and the stdlib. So I really don't think performance is an issue here.

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 7, 2015

    Attached a slightly revised patch thanks to Serhiy's review.

    In addition to Serhiy's remarks, I used "_operator._compare_digest = _operator.eq" to reduce a bit more boilerplate.

    @serhiy-storchaka
    Copy link
    Member

    The operator module is rarely used in the stdlib, but if it is used in user code (mainly with map(), reduce() or like) the performance often is important. You can use microbenchmarks like following (operator.add is twice faster than lambda x, y: x + y).

    ./python -m timeit -s "import operator; a = list(range(10000)); b = a[:]" -- "list(map(operator.add, a, b))"
    

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 7, 2015

    I just ran such microbenchmarks for operator.add and operator.not_, and saw no significant difference with or without the patch.

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 7, 2015

    Here's another complete conversion patch for _operator.

    As suggested by Serhiy, I changed the comparison operators to copy the signature from _operator.eq() instead of _operator.lt(), which is easier to understand.

    @taleinat
    Copy link
    Contributor

    taleinat commented Jun 8, 2015

    Here's another version of the _operator patch, with another small change after Serhiy's latest review.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Jan 19, 2017
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 19, 2017

    New changeset 112f27b8c8ea by Serhiy Storchaka in branch 'default':
    Issue bpo-20186: Converted the math module to Argument Clinic.
    https://hg.python.org/cpython/rev/112f27b8c8ea

    @serhiy-storchaka
    Copy link
    Member

    bpo-20186.mathmodule.v2.patch needed just synchronizing docstrings.

    @serhiy-storchaka
    Copy link
    Member

    bpo-20186.enumobject.patch LGTM too.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 19, 2017

    New changeset e3db9bccff3f by Serhiy Storchaka in branch 'default':
    Issue bpo-20186: Converted builtins enumerate() and reversed() to Argument Clinic.
    https://hg.python.org/cpython/rev/e3db9bccff3f

    @serhiy-storchaka
    Copy link
    Member

    Sorry for committing patches for you Tal, but they were hanging so long time.

    @taleinat
    Copy link
    Contributor

    Serhiy, no apology is required. On the contrary, thank you for the taking the time to review this and commit, I don't have time available for this these days.

    @serhiy-storchaka
    Copy link
    Member

    There are patches based on modules_issue20186.patch that convert to Argument
    Clinic 4 modules: _csv, _lsprof, _tracemalloc and symtable. They are
    synchronized with current sources and updated to current Argument Clinic.
    Other changes:

    • Addressed Larry's comments.

    • Used converter names instead of format units (object instead of 'O' etc).

    • Removed unneeded self declarations in _lsprof.

    • Removed changes for _csv.Dialect.__new__. Generated signature has None as
      default values, but this is not true. Actually parameters don't have default
      values. This function can't be converted.

    • Parameters of _lsprof.Profiler.enable should have the int converter, not
      bool. Default values are -1 (meaning "don't change current value"), not True.

    • Made the parameter of _tracemalloc._get_object_traceback() positional-only.

    • Simplified code for creating a result in _traceback functions.

    @rhettinger
    Copy link
    Contributor

    The application of AC to enumerate() lost information about the start argument and the signature of the call. We're going backwards.

    ---- New help -----------------------------------------------------

    class enumerate(object)
     |  Return an enumerate object.
     |
     |    iterable
     |      an object supporting iteration
     |
     |  The enumerate object yields pairs containing a count (from start, which
     |  defaults to zero) and a value yielded by the iterable argument.
     |
     |  enumerate is useful for obtaining an indexed list:
     |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...

    ---- Old help -----------------------------------------------------

    class enumerate(object)
     |  enumerate(iterable[, start]) -> iterator for index, value of iterable
     |  
     |  Return an enumerate object.  iterable must be another object that supports
     |  iteration.  The enumerate object yields pairs containing a count (from
     |  start, which defaults to zero) and a value yielded by the iterable argument.
     |  enumerate is useful for obtaining an indexed list:
     |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...

    Also, reversed() lost the indication of its signature:
    reversed(sequence) -> reverse iterator over values of the sequence

    And the help doesn't have the usual:

    iterable
    |      an object supporting iteration
    

    @rhettinger
    Copy link
    Contributor

    When reviewing AC patches, we should always compare the help() before and after.

    Also, if the code already had fast parsing like:
    if (!PyArg_UnpackTuple(args, "reversed", 1, 1, &seq) )
    there needs to be a before and after timing to make sure there wasn't a performance regression.

    When PyArg_NoKeywords was present, we need to verify that the AC version also precludes keyword arguments (to prevent the creation of unhelpful keyword args and to keep compatibility with other versions of Python).

    @serhiy-storchaka
    Copy link
    Member

    Microbenchmarks:

    $ ./python -m perf timeit --duplicate 100 "enumerate('abc')"
    Unpatched:  Median +- std dev: 1.76 us +- 0.10 us
    Patched:    Median +- std dev: 1.61 us +- 0.07 us
    
    $ ./python -m perf timeit --duplicate 100 "enumerate('abc', 1)"
    Unpatched:  Median +- std dev: 2.14 us +- 0.09 us
    Patched:    Median +- std dev: 1.76 us +- 0.07 us
    
    $ ./python -m perf timeit --duplicate 100 "reversed('abc')"
    Unpatched:  Median +- std dev: 1.20 us +- 0.06 us
    Patched:    Median +- std dev: 1.20 us +- 0.07 us

    enumerate() is 9-21% faster (due to avoiding of tuple creating), reversed() is not changed (Argument Clinic generates the same parsing code for it).

    @serhiy-storchaka
    Copy link
    Member

    Here is a patch that restores ol docstrings for enumerate() and reversed(). But it may be better to change pydoc so that it would output a text signature of class constructor if available.

    @serhiy-storchaka
    Copy link
    Member

    Argument Clinic generates incorrect parsing code for _csv.field_size_limit().

    @serhiy-storchaka
    Copy link
    Member

    The problem with lsprof_clinic.patch is that it exposes default value of _lsprof.Profiler.enable() parameters as -1. Actually _lsprof.Profiler.enable() should accept boolean arguments without default value.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 4, 2017

    New changeset 7f8a3eb3459e by Serhiy Storchaka in branch 'default':
    Issue bpo-20186: Converted the symtable module to Argument Clinic.
    https://hg.python.org/cpython/rev/7f8a3eb3459e

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 4, 2017

    New changeset e1df73b46094 by Serhiy Storchaka in branch 'default':
    Issue bpo-20186: Converted the tracemalloc module to Argument Clinic.
    https://hg.python.org/cpython/rev/e1df73b46094

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 4, 2017

    New changeset b0ef37ec83f337b4b77275b367288a5656a0682c by Serhiy Storchaka in branch 'master':
    Issue bpo-20186: Converted the symtable module to Argument Clinic.
    b0ef37e

    New changeset 18a02e9d1f8e981b7b2f4287a4ed871021b13ade by Serhiy Storchaka in branch 'master':
    Issue bpo-20186: Converted the tracemalloc module to Argument Clinic.
    18a02e9

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 5, 2017

    New changeset 8ccb3ad39ee4 by Serhiy Storchaka in branch 'default':
    Issue bpo-20186: Regenerated Argument Clinic.
    https://hg.python.org/cpython/rev/8ccb3ad39ee4

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 5, 2017

    New changeset 5762cf299f863e06244e6b44ba3a91efee7b35c1 by Serhiy Storchaka in branch 'master':
    Issue bpo-20186: Regenerated Argument Clinic.
    5762cf2

    @serhiy-storchaka
    Copy link
    Member

    PR 614: Objects/tupleobject.c

    @serhiy-storchaka
    Copy link
    Member

    New changeset 0b56159 by Serhiy Storchaka in branch 'master':
    bpo-20186: Convert tuple object implementation to Argument Clinic. (#614)
    0b56159

    @jaraco
    Copy link
    Member

    jaraco commented Mar 6, 2021

    I suspect change in this issue led to bpo-43413.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    JelleZijlstra pushed a commit that referenced this issue Oct 6, 2022
    …ller (#94591)
    
    These were intentionally skipped when operator was updated to use the argument clinic:
    #64385 (comment)
    
    However, by not using the argument clinic, they missed out on getting signatures.
    This is a narrow PR to update the docstrings so that `__text_signature__` can be
    extracted from them.  Updating to use the argument clinic is beyond scope.
    
    `methodcaller` uses `*args, **kwargs` to match variadic names used elsewhere,
    including in `operator.call`.
    mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
    …thodcaller (python#94591)
    
    These were intentionally skipped when operator was updated to use the argument clinic:
    python#64385 (comment)
    
    However, by not using the argument clinic, they missed out on getting signatures.
    This is a narrow PR to update the docstrings so that `__text_signature__` can be
    extracted from them.  Updating to use the argument clinic is beyond scope.
    
    `methodcaller` uses `*args, **kwargs` to match variadic names used elsewhere,
    including in `operator.call`.
    @carljm carljm closed this as completed Feb 20, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life extension-modules C modules in the Modules dir topic-argument-clinic type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants