This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Argument Clinic for bisect.bisect_left
Type: Stage: resolved
Components: Argument Clinic Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: larry, martin.panter, mdk, rhettinger, serhiy.storchaka, skrah, vstinner
Priority: normal Keywords: patch

Created on 2016-11-20 16:51 by mdk, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue28754.diff mdk, 2016-11-20 16:52 Missing Modules/clinic/_bisectmodule.c.h review
issue28754-2.diff mdk, 2016-11-20 16:56 review
bisect_left.diff mdk, 2016-11-20 21:30 review
bisect.diff mdk, 2016-11-20 21:30
insort.diff mdk, 2016-11-20 21:30
insort-left.diff mdk, 2016-11-20 21:30
issue28754-3.diff mdk, 2016-11-20 21:33 review
issue28754-4.diff mdk, 2016-11-24 16:19 review
issue28754-5.diff mdk, 2016-11-24 23:36 review
check_bisect_doc.py vstinner, 2016-11-25 13:07
issue28754-6.diff mdk, 2016-11-26 09:23 review
issue28754-7.diff mdk, 2016-11-29 18:43 review
issue28754-8.diff mdk, 2016-12-10 09:36 review
issue28754-9.diff mdk, 2016-12-10 14:55 review
Pull Requests
URL Status Linked Edit
PR 177 closed mdk, 2017-02-19 16:20
Messages (54)
msg281282 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-20 16:51
Today I read https://docs.python.org/3.6/howto/clinic.html so I tried one: bisect.bisect_left.

I was unable to do `bisect_right`, as it's an "alias" for `bisect`, and there's a unit-test checking `self.assertEqual(self.module.bisect, self.module.bisect_right)`. Any hint welcome.
msg281283 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2016-11-20 17:08
There's special syntax to handle aliases.  From comments in clinic.py:

        # alternatively:
        #     modulename.fnname [as c_basename] = modulename.existing_fn_name
        # clones the parameters and return converter from that
        # function.  you can't modify them.  you must enter a
        # new docstring.

Sorry the docs are out of date--patches welcome.  And speaking of patches, thanks for the patch!  I'll review it when you say it's ready.
msg281285 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2016-11-20 17:09
Oh, and, if the code literally asserts they're the same function, that's just a sanity check based on the implementation.  You could preserve that if you care to, or you could just write a new function and remove the assertion.  Do what you think is best, then post your patch to the tracker and we'll all argue about it forever, how's that sound?
msg281299 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-20 20:12
I searched an occurrence of what I'm describing which is already using clinic and there is, at least, one in Modules/binascii.c line 1090:

TL;DR: The idea is to use the `modulename.fnname [as c_basename] = modulename.existing_fn_name` clinic syntax, drop a "This function is also available as …" in the docstring, and check if they return the same thing in the tests.

This way looks good to me, it uses the dedicated syntax which looks right, and I'll drop the `self.assertEqual(self.module.bisect, self.module.bisect_right)` which is an implementation detail, replacing it by tests checking that both have the same behavior.

```
/*[clinic input]
binascii.b2a_hex

    data: Py_buffer
    /

Hexadecimal representation of binary data.

The return value is a bytes object.  This function is also
available as "hexlify()".
[clinic start generated code]*/

static PyObject *
binascii_b2a_hex_impl(PyObject *module, Py_buffer *data)
/*[clinic end generated code: output=92fec1a95c9897a0 input=96423cfa299ff3b1]*/
{
    return _Py_strhex_bytes((const char *)data->buf, data->len);
}

/*[clinic input]
binascii.hexlify = binascii.b2a_hex

Hexadecimal representation of binary data.

The return value is a bytes object.
[clinic start generated code]*/

static PyObject *
binascii_hexlify_impl(PyObject *module, Py_buffer *data)
/*[clinic end generated code: output=749e95e53c14880c input=2e3afae7f083f061]*/
{
    return _Py_strhex_bytes((const char *)data->buf, data->len);
}
```
msg281302 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-20 21:30
Here it is for the whole bisect module.

I separated my work in commits, but I'm not sure how rietveld will eat that as they'll have unknown references, so I'll probably also upload a single patch with a known reference.
msg281303 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-20 21:34
The whole diff is reviewable in `issue28754-3.diff`.
msg281652 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-24 21:08
Hum, I dislike your changes on test_bisect.py: I don't see why Argument Clinic would have to modify tests!?

I created issue #28792: "bisect: implement aliases in Python, remove C aliases".
msg281669 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-24 23:36
New simplier patch thanks to https://bugs.python.org/issue28792. Also corrected docstrings.

Also ran a micro-benchmark of `bisect.bisect(foo, "c")` with `foo = list("abcdef")`:
Median +- std dev: [before] 434 ns +- 17 ns -> [after] 369 ns +- 22 ns: 1.18x faster

FTR: ./python -m perf timeit -s 'import bisect; foo = list("abdef")' 'bisect.bisect(foo, "c")' -o after.json --inherit=PYTHONPATH --affinity=2,3
msg281685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-25 08:29
issue28754-4.diff does 3 things:
* convert C code to Argument Clinic
* change docstrings of the C code
* change unit tests

issue28754-5.diff doesn't touch tests, but still does 2 things. Sadly, I dislike the changes on docstrings because it makes the C docstring and the Python docstring inconsistent.

I suggest to copy/paste docstring between C and Python code.

By the way, why do we have a docstring different from Doc/library/bisect.rst? Maybe we can simply use the same doc everywhere, but just add reST formatting in Doc/library/bisect.rst?

Julien: I suggest to first restrict your change to Argument Clinic, and later write a new patch to update docstrings. I prefer to only do one thing per commit.
msg281712 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-25 13:00
Victor, what changes to the doc strings are you talking about? Adding descriptions of the parameters, maybe? Other than that they look very similar to me.

I haven’t looked closely, but the Python doc strings have been updated at least once more than the C doc strings: r54666 and Issue 1602378. So if you are copying anything, it might be better to copy from Py to C. But I would do that separately.

The doc strings and reference documentation just seem to have been added independently: ce2d120c3903 (main documentation) vs 67b3ac439f64 (doc strings)
msg281713 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-25 13:07
Martin: "Victor, what changes to the doc strings are you talking about?"

See attached check_bisect_doc.py script.

Before:
---
=== bisect_right ===
bisect_right(a, x[, lo[, hi]]) -> index

Return the index where to insert item x in list a, assuming a is sorted.

The return value i is such that all e in a[:i] have e <= x, and all e in
a[i:] have e > x.  So if x already appears in the list, i points just
beyond the rightmost x already there

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.

=== insort_right ===
insort_right(a, x[, lo[, hi]])

Insert item x in list a, and keep it sorted assuming a is sorted.

If x is already in a, insert it to the right of the rightmost x.

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.

=== bisect_left ===
bisect_left(a, x[, lo[, hi]]) -> index

Return the index where to insert item x in list a, assuming a is sorted.

The return value i is such that all e in a[:i] have e < x, and all e in
a[i:] have e >= x.  So if x already appears in the list, i points just
before the leftmost x already there.

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.

=== insort_left ===
insort_left(a, x[, lo[, hi]])

Insert item x in list a, and keep it sorted assuming a is sorted.

If x is already in a, insert it to the left of the leftmost x.

Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.

---

After:
---
=== bisect_right ===
Return the index where to insert item x in list a, assuming a is sorted.

  a
    The list in which x is to be inserted.
  x
    The value to be inserted.
  lo
    Lower bound, defaults to 0.
  hi
    Upper bound, defaults to -1 (meaning len(a)).

The return value i is such that all e in a[:i] have e <= x, and all e in
a[i:] have e > x.  So if x already appears in the list, i points just
beyond the rightmost x already there.

Optional args lo and hi bound the slice of a to be searched.
=== insort_right ===
Insert item x in list a, and keep it sorted assuming a is sorted.

  a
    The list in which x will be inserted.
  x
    The value to insert.
  lo
    Lower bound, defaults to 0.
  hi
    Upper bound, defaults to -1 (meaning len(a)).

If x is already in a, insert it to the right of the rightmost x.

Optional args lo and hi bound the slice of a to be searched.
=== bisect_left ===
Return the index where to insert item x in list a, assuming a is sorted.

  a
    The list in which x is to be inserted.
  x
    The value to be inserted.
  lo
    Lower bound, defaults to 0.
  hi
    Upper bound, defaults to -1 (meaning len(a)).

The return value i is such that all e in a[:i] have e < x, and all e in
a[i:] have e >= x.  So if x already appears in the list, i points just
before the leftmost x already there.

Optional args lo and hi bound the slice of a to be searched.
=== insort_left ===
Insert item x in list a, and keep it sorted assuming a is sorted.

  a
    The list in which x will be inserted.
  x
    The value to insert.
  lo
    Lower bound, defaults to 0.
  hi
    Upper bound, defaults to -1 (meaning len(a)).

If x is already in a, insert it to the left of the leftmost x.

Optional args lo and hi bound the slice of a to be searched.

---

The output is different. I suggest to not touch the docstring in the patch upgrading _bisect.c to Argument Clinic to ease the review.
msg281717 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-25 13:35
Taking the first function, bisect_right(), as an example, I see these differences:

* bisect_right(a, x[, lo[, hi]]) -> index
This signature is removed. I think removing it is reasonable, because pydoc can extract the proper signature from the Arg Clinic metadata.

* Additional descriptions of each parameter. I tend to think these are redundant with the main text, so agree with removing them from the patch now.

* Addition of full stop (.) at end of first paragraph. I suggested this as a minor cleanup, but it could be fixed later if you prefer.

* Removal of default values in last paragraph. The first is redundant with the Arg Clinic signature, so I suggested to remove it. For the second, I would add it back if we remove the list of parameters, since it explains what the special value -1 means.

Do you want to revert all these differences?
msg281734 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-25 19:37
Quoting @martin
> * bisect_right(a, x[, lo[, hi]]) -> index
> This signature is removed. I think removing it is reasonable, because pydoc can extract the proper signature from the Arg Clinic metadata.

In fact, http://docs.python.org/howto/clinic.html, bullet ".4", told me do remove it:

> If the old docstring had a first line that looked like a function signature, throw that line away. (The docstring doesn’t need it anymore—when you use help() on your builtin in the future, the first line will be built automatically based on the function’s signature.)


> * Additional descriptions of each parameter. I tend to think these are redundant with the main text, so agree with removing them from the patch now.

OK, will remove them.

* Addition of full stop (.) at end of first paragraph. I suggested this as a minor cleanup, but it could be fixed later if you prefer.

Can remove it to simplify verification, but as doctests will be different as doc told me to remove the signature, so is it work removing this dot ? I don't think so.

* Removal of default values in last paragraph. The first is redundant with the Arg Clinic signature, so I suggested to remove it. For the second, I would add it back if we remove the list of parameters, since it explains what the special value -1 means.

I'll revert it too as I'll remove the documentation in the parameter list.

Will upload a patch later.
msg281761 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-26 09:23
Here is the new patch, I ran a diff between "./python -m pydoc _bisect" before and after my patch, here it is:

13,15c13
<     bisect_left(...)
<         bisect_left(a, x[, lo[, hi]]) -> index
<         
---
>     bisect_left(a, x, lo=0, hi=-1)
25,27c23
<     bisect_right(...)
<         bisect_right(a, x[, lo[, hi]]) -> index
<         
---
>     bisect_right(a, x, lo=0, hi=-1)
32c28
<         beyond the rightmost x already there
---
>         beyond the rightmost x already there.
37,39c33
<     insort_left(...)
<         insort_left(a, x[, lo[, hi]])
<         
---
>     insort_left(a, x, lo=0, hi=-1)
47,49c41
<     insort_right(...)
<         insort_right(a, x[, lo[, hi]])
<         
---
>     insort_right(a, x, lo=0, hi=-1)

So I kept my addition of the missing full stop, and dropped the handwritten signature as Argument Clinic generates it.
msg281762 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-26 09:27
Should we also update howto/clinic, bullet "11.", to be explicit about not adding per-parameter in the same patch? Like a:

> If you're porting existing function to Argument clinic, skip this step to simplify code review.

?
msg281763 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-26 10:19
Curiously, this patch gives about a 10% to 15% speedup.  Any sense of how that improvement arises?

$ py3.7 -m timeit -s 'from bisect import bisect' -s 'a=list(range(100))' 'bisect(a, 10)'

229 nsec     # Clang baseline 
202 nsec     # Clang with patch

189 nsec     # GCC-6 baseline
156 nsec     # Gcc-6 with patch
msg281764 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-26 10:26
Hi Raymond,

> Curiously, this patch gives about a 10% to 15% speedup.  Any sense of how that improvement arises?

That's because Argument Clinic is generating methoddef with METH_FASTCALL:

$ grep FASTCALL Modules/clinic/_bisectmodule.c.h 
    {"bisect_right", (PyCFunction)bisect_bisect_right, METH_FASTCALL, bisect_bisect_right__doc__},
    {"insort_right", (PyCFunction)bisect_insort_right, METH_FASTCALL, bisect_insort_right__doc__},
    {"bisect_left", (PyCFunction)bisect_bisect_left, METH_FASTCALL, bisect_bisect_left__doc__},
    {"insort_left", (PyCFunction)bisect_insort_left, METH_FASTCALL, bisect_insort_left__doc__},

Instead of METH_VARARGS|METH_KEYWORDS:

-    {"bisect_right", (PyCFunction)bisect_right,
-        METH_VARARGS|METH_KEYWORDS, bisect_right_doc},
-    {"insort_right", (PyCFunction)insort_right,
-        METH_VARARGS|METH_KEYWORDS, insort_right_doc},
-    {"bisect_left", (PyCFunction)bisect_left,
-        METH_VARARGS|METH_KEYWORDS, bisect_left_doc},
-    {"insort_left", (PyCFunction)insort_left,
-        METH_VARARGS|METH_KEYWORDS, insort_left_doc},
msg281765 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-26 10:45
Julien, you can declare the hi parameter as

    hi: Py_ssize_t(py_default="len(a)") = -1

if you want the signature be matching the documentation.

In general, it is better to use converter names rather than format codes.

> Curiously, this patch gives about a 10% to 15% speedup.  Any sense of how that improvement arises?

This is rather a random difference. Try to run the bench several times.  With using Victor's perf module the difference is not significant.

Maybe using FASTCALL have some positive effect, but it is unlikely noticeable with such complex function.
msg281768 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-26 11:44
Hi Serhiy,

> Julien, you can declare the hi parameter as
>    hi: Py_ssize_t(py_default="len(a)") = -1

Looks like a good idea, I was aware of its existance but did not took the time to read the doc about it, kind of learning step by stpe. But as you provided the syntax, I tested it, thanks for this! But it fails with a RuntimeError during python -m pydoc _bisect, I'm currently trying to understand why:

$ ./python -m pydoc _bisect 
Traceback (most recent call last):
  File "/home/mdk/cpython-git/Lib/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/mdk/cpython-git/Lib/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 2663, in <module>
    cli()
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 2628, in cli
    help.help(arg)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 1908, in help
    elif request: doc(request, 'Help on %s:', output=self._output)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 1645, in doc
    pager(render_doc(thing, title, forceload))
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 1638, in render_doc
    return title % desc + '\n\n' + renderer.document(object, name)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 382, in document
    if inspect.ismodule(object): return self.docmodule(*args)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 1172, in docmodule
    contents.append(self.document(value, key, name))
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 384, in document
    if inspect.isroutine(object): return self.docroutine(*args)
  File "/home/mdk/cpython-git/Lib/pydoc.py", line 1357, in docroutine
    signature = inspect.signature(object)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 2994, in signature
    return Signature.from_callable(obj, follow_wrapped=follow_wrapped)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 2744, in from_callable
    follow_wrapper_chains=follow_wrapped)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 2223, in _signature_from_callable
    skip_bound_arg=skip_bound_arg)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 2055, in _signature_from_builtin
    return _signature_fromstr(cls, func, s, skip_bound_arg)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 2003, in _signature_fromstr
    p(name, default)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 1985, in p
    default_node = RewriteSymbolics().visit(default_node)
  File "/home/mdk/cpython-git/Lib/ast.py", line 253, in visit
    return visitor(node)
  File "/home/mdk/cpython-git/Lib/ast.py", line 317, in generic_visit
    new_node = self.visit(old_value)
  File "/home/mdk/cpython-git/Lib/ast.py", line 253, in visit
    return visitor(node)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 1977, in visit_Name
    return wrap_value(node.id)
  File "/home/mdk/cpython-git/Lib/inspect.py", line 1959, in wrap_value
    raise RuntimeError()
RuntimeError


> > Curiously, this patch gives about a 10% to 15% speedup.  Any sense of how that improvement arises?
> This is rather a random difference. Try to run the bench several times.  With using Victor's perf module the difference is not significant.

If you search for "perf" in the history of this issue you'll see that I already tested with Victor's perf module and it yielded a consistent 18% improvement on bisect.bisect("abcdef", "c") on two different machines. Clearly with more than 5 items, it will start to fade, and with a lot of items, it's not significand, obviously as it's an optimization about the call and not the implementation. So Raymond's test with 100 items may still see a speedup.
msg281772 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-26 12:52
Hi Serhiy,

    hi: Py_ssize_t(py_default="len(a)") = -1

Won't works, as pydoc will use the inspect module (_signature_fromstr) to get the signature. _signature_fromstr expects a valid python signature, but `def foo(a, x, lo=0, high=len(a)): pass` is not valid (SyntaxError).

Should we note that in the clinic documentation:

> py_default
>     default as it should appear in Python code, as a string. Or None if there is no default.

> py_default
>     default as it should appear in valid Python code, as a string. Or None if there is no default.

?
msg281779 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-26 15:06
Argument Clinic adds a signature and a better docstring. So I like it!

Speedup for tiny lists is just a nice side effect.
msg281791 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-26 19:00
[SS]
> This is rather a random difference. Try to run the bench several times.

I did run several times.  The results were perfectly stable (+/- 1ns).  It tells us that METH_FASTCALL is something we want to use as broadly as possible.


[JP]
> it yielded a consistent 18% improvement on bisect.bisect("abcdef", "c")
> on two different machines.

Thanks for running timings.  When you do timing in the future, try to select timings that are indicative of actual use.  No one uses bisect to search strings for a character -- the typical use case is searching a list of range breakpoints like the example shown in the docs:

    def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
        i = bisect(breakpoints, score)
        return grades[i]

Hash tables are faster for exact lookup.  Bisect is mainly used for searching ranges (i.e. tax brackets or position in a cumulative distribution).

[VS]
> Speedup for tiny lists is just a nice side effect.

To you perhaps; however, the entire reason I put this code in many years ago was for performance.  The existing pure python code was already very fast.  Anyway, this patch is a nice improvement in that regard.

[Everyone]

What to do about the "hi" argument is a sticky point.  This is a long-standing API, so any API changes would likely break some code or lock us into exposing unintuitive implementation details (like hi=-1).

The point of view of the existing C code, the pure Python version, and the docs is that the "hi" argument is an optional argument that when omitted will default to "len(a)".  It was not intended that a user would explicitly pass in "hi=None" as allowed by the pure python version but not by the C version, nor was it intended to have the user pass in "hi=-1" as allowed by the C version but causes incorrect behavior on the pure Python version.

It is unfortunate that the argument clinic has not yet grown support for the concept of "there is special meaning for an omitted argument that cannot be simulated by passing in a particular default value".  I believe when this issue has arisen elsewhere, the decision was to skip applying argument clinic at all.  See:  builtin_getattr, builtin_next, dict_pop, etc.

In this case though, we already have a conflict, so it may be possible to resolve the issue through clear parameter descriptions, indicating that "hi=-1" and "hi=None" are implementation details and that it is intended that the user never explicitly pass in either of those values, and that only the guaranteed ways to get the default is to omit the argument entirely or to pass in "hi=len(a)" as a numeric argument.
msg281792 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-26 19:18
I would suggest to use sys.maxsize for default value in Argument Clinic, but right now bisect functions don't support well hi > len(a). This needs adding an equivalent of hi = min(hi, len(a)) in C code.

-1 could be explicitly supported for backward compatibility. But it could be deprecated.
msg281832 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-27 18:37
> I would suggest to use sys.maxsize for default value in Argument Clinic

No thanks.  I don't want to subtly change the API for this module or even suggest to users that it might be a good idea to set hi > len.  Further, we don't want to slow the pure python code for this pointless extension.

Really, we're dancing around the issue that argument clinic and signatures need to become more expressive, allowing for omitted arguments without requiring a default value.  Waiting for this to be done is far preferable to mangling long-standing APIs to force fit them to into argument clinic.

Elsewhere, we've shown self discipline and restraint when applying AC, skipping over cases where it doesn't fit.  If we can't find a way to apply AC without changing this API, this issue should be closed and deferred until AC grows the requisite expressive power.
msg281834 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-27 20:12
After some further thought, I am open to changing the C API to accept either hi=None or hi=-1 to reconcile the implementation details without breaking any existing code that relies on either.  That would let the argument clinic expose a default value of hi=None.  That isn't perfect because it complicates the code and it exposes an implementation detail contrary to what the API originally intended.

Short of that, this module ought to be skipped for the argument clinic pending a build-out of signature objects to handle signatures that vary depending on the number of arguments supplied.
msg281878 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-28 14:22
Hi Raymond,

About Argument Clinic
#####################

Just to clarify the situation, Argument Clinic allows for clear method signature, typically: `hi: Py_ssize_t(py_default="len(a)") = -1` gives the expected "hi=len(a)" signature successfully.

The problem is that it's not a legal signature (it's a syntax error), so pydoc crashes while parsing it (search for RuntimeError in the history of the issue for a full stacktrace).

About what should we put in the signature
#########################################

Question is: Should we allow "semantically precise, but frustrating*" signatures to be documented or should we be move the semantic away from the signature?

*: I mean, is a new user sees "hi=len(a)" in a signature in a docstring he will immediately flag this syntax as valid and usable later, which lead him to a syntax error and some frustration.

Something else to keep in mind is that I expect users be able to call those methods indirectly, like:

def whatever_they_need(self, lo=0, hi=?):
    …
    bisect.bisect(…, …, lo, hi)
    …

Currently, to follow strictly the documentation they have to write:

def whatever_they_need(self, lo=0, hi=None):
    …
    if hi is not None:
        bisect.bisect(…, …, lo, hi)
    else:
        bisect.bisect(…, …, lo)
    …

Which looks bad to me. So from my point of view we *have to* tell them what is the real default value for this parameter so they can use it in their implementations.

Should we generalize and say that depending on the number of arguments is a bad practice? I think so.
msg281880 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-28 16:16
Sorry Julien, you don't to decide that long-standing APIs that have worked just fine for users are now a bad practice.  Some of these APIs (range, dict.pop, type) we designed by Guido and work fine in practice.  The expression, "don't have the tail wag the dog" comes to mind (where "tail" is the existing successful language and "dog" is the argument clinic).  Besides, API change is very disruptive to users (especially those trying to migrate to Python 3).

For the case of bisect, I've given a way forward.  It is okay to make the documented default value for "hi" to be "None" while internally still allowing -1 so that we don't break code that happens to depend on it.  In the clinic, use "hi: 'O' = None".  Then in the dispatch code, handle both the None case and the numeric case (see islice_new() in itertoolsmodule.c for some ideas on how to do this.

What isn't okay is to document and expose "hi=-1" because that is a confusing and error-prone default value that is just an awkward hack to express "hi=len(a)".
msg281888 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-28 16:51
Sorry, I didn't follow the discussion, but why not using "/" magic separator in Argument Clinic to mark arguments as optional but don't document their default value?
msg281892 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-28 17:03
Yet one option is to introduce the constant UNBOUND = -1 in the bisect module and use it as a default value.
msg281899 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-28 17:23
> Sorry, I didn't follow the discussion, but why not using "/" magic separator in Argument Clinic to mark arguments as optional but don't document their default value?

'/' marks positional-only arguments, not optional arguments.
msg281901 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-28 17:50
Ah right, _bisect.bisect_right() support keywords. I expected that it
doesn't support keywords yet ;-)
msg281926 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-29 01:01
Regarding the problem with the default value, can we use “unspecified”, as documented at <https://docs.python.org/dev/howto/clinic.html#writing-a-custom-converter>, or is that an old relic that no longer exists (hinted at the end of Issue 20293)?

Failing that, I prefer Raymond’s earlier suggestion: say that any support for −1 or None are implementation details, and not part of the API. The same statement could be added to the native Python doc strings as well.

BTW I am not opposed to the proposal for proper hi=None support either, although it does complicate the change. IMO allowing hi=None could be done as a second step.
msg282028 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-11-29 18:43
I tried myself at Argument Clinic custom converters to create the "optional ssize_t converter" and it works, so as advised by Raymond, pydoc now shows None, and the C implementation now allows for both default values None and -1, as the custom converter returns -1 for none.
msg282030 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2016-11-29 19:01
Julien, the syntax converters look pretty clever.  Do we need AC
everywhere though?  I wonder (once again) if this is really more
readable than the existing code.
msg282038 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-11-29 20:00
Stefan Krah added the comment:
> Julien, the syntax converters look pretty clever.  Do we need AC
> everywhere though?

Please see previous comments for advantages of AC (signature object,
docstring, speed).
msg282041 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2016-11-29 20:22
Signature and docstring can be done manually with very little effort.
Currently METH_FASTCALL is AC only, but I hope that will change.
msg282049 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-11-29 21:34
If adding proper support for hi=None, maybe lo=None should also be supported. Also, I would think the main Doc/library/bisect.rst documentation needs updating, and a test and What’s New entry added.
msg282063 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-30 04:01
> If adding proper support for hi=None, maybe lo=None should
> also be supported.

That would be gratuitous.  Lo already has a reasonable, useful, and self-explanatory value.  This would add more complexity to the signature while reducing clarity.  I really don't want to see calls like bisect(arr, x, None).
msg282308 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-12-03 23:08
Fair enough, I don’t really mind if it is (lo=0, hi=None). I think I have only used bisect with both defaults anyway.
msg282350 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-04 16:11
Hi Martin,

Historically (lo=0, hi=None) was the implementation (in Python), until the C implementation, which used (lo=0, hi=-1). As my implementation allows for -1 and None, I don't think I'm breaking something.
msg282842 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-10 09:36
I was able to simplify my patch a bit, I think I should also add a test to ensure we keep the hi=-1 and hi=None compatibility in the future.
msg282854 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-10 14:54
Added a test to ensure compatibility of both hi=None (introduced in original Python version) and hi=-1 (Introduced by the C version).

Modified Python version to be compatible with the C-introduced hi=-1, so that the new test pass.
msg282864 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-10 17:58
Maybe wait for issue28933.
msg282875 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2016-12-10 20:44
FWIW, pypy uses defaults of lo=0, hi=None.

https://bitbucket.org/pypy/pypy/src/25da7bc9719457aee57547993833d5168ba6aded/lib-python/2.7/bisect.py?at=default&fileviewer=file-view-default
msg282877 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-10 21:22
Hi Larry,

In any cases it looks like supporting hi=-1 and hi=None is mandatory: 

hi=None is the current implementation in the bisect (Python) module.
hi=-1 is the current implementation in the _bisect (C) module.

Both are currently living together, the C version takes over the Python version when available, but the Python version is older, so both -1 and None has been the "current implementation" during some time.

Raymond legitimately fear that someone, somewhere, used -1 in some code. We should not break it (even if it had never been documented).

So let's just accept both for the moment, and if deprecation of one or another should be discussed, let's do it later, in another issue. I'm _just_ trying to make AC move forward here, without breaking anything.
msg282882 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2016-12-10 22:19
Okay, but do it with a converter function, not by changing the default behavior for Py_ssize_t.  The vast majority of uses of Py_ssize_t should not accept None.
msg282888 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-12-11 00:11
Why does this need a converter function?  I there is any reason this can't use "O" with "hi" defaulting to None (matching the pure python API) and letting the C function handle both -1 and None in the implementation rather than in AC?
msg282889 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-11 00:27
Hi Raymond,

I don't like having the converters in the C implementation too, that's why I'm working on issue28933 to clean this.

> letting the C function handle both -1 and None in the implementation rather than in AC?

It works, yes. But I prefer to clearly split responsibilities: AC being responsible of adapting argument from PyObjects to C types, and C functions being responsible of ... doing their job.

If the idea in issue28933 is accepted, we'll be able to declare hi as simply as:

    hi: Py_ssize_t(c_default="-1") = None

meaning "C function will get a Py_ssize_t, default value for C code is -1, None is documented, and None can be passed to get the C default value", that's this last point "None can be passed to get the C default value" that I introduce in issue28933.

With this syntax, both C converters and the python hi_parameter_converter can be dropped.
msg282891 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2016-12-11 00:32
We don't *have* to use a converter function, no.  But consider: someone somewhere will have to convert a PyObject * into a Py_ssize_t, also accepting None.  Doing it in a converter function means we can easily share the code with bisect_right, which should presumably behave the same way.  It also keeps all the argument parsing logic in the Argument Clinic domain instead of having some in AC and some in the implementation.

Is there some drawback to using a converter function?
msg283100 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2016-12-13 13:31
As the pattern of this converter is not widely used, I'll let it in the code of _bisect for the moment, see: http://bugs.python.org/issue28933.
msg291160 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-05 06:34
Now `Py_ssize_t(accept={int, NoneType})` can be used instead of a local converter.
msg291162 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-04-05 06:42
Wouldn't it be easily to let the OP apply AC to some other module.  Right now, it isn't a very good fit and was this a tough one to start with.
msg302894 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-24 23:15
After more thought, I'm going to close this one because the Argument Clinic and signature objects aren't yet a good fit for this API.  The bisect module isn't broken (its API has been successfully living in the wild for a very long time).  I do welcome Argument Clinic patches where there is a good fit but don't think this is the place.  Each of the proposed patches makes the module worst in one way or another.
msg302908 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-09-25 02:51
FWIW, the itertools module has many functions that can have the ArgumentClinic applied without any fundamental problems (pretty much all of the functions will work except for repeat(), accumulate(), and islice()).
History
Date User Action Args
2022-04-11 14:58:39adminsetgithub: 72940
2017-09-25 02:51:14rhettingersetmessages: + msg302908
2017-09-24 23:15:12rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg302894

stage: resolved
2017-04-05 06:42:20rhettingersetmessages: + msg291162
2017-04-05 06:34:15serhiy.storchakasetmessages: + msg291160
2017-02-19 16:20:27mdksetpull_requests: + pull_request143
2016-12-13 13:31:45mdksetmessages: + msg283100
2016-12-11 00:32:38larrysetmessages: + msg282891
2016-12-11 00:27:09mdksetmessages: + msg282889
2016-12-11 00:11:38rhettingersetmessages: + msg282888
2016-12-10 22:19:53larrysetmessages: + msg282882
2016-12-10 21:22:13mdksetmessages: + msg282877
2016-12-10 20:44:31larrysetmessages: + msg282875
2016-12-10 17:58:39mdksetmessages: + msg282864
2016-12-10 14:55:03mdksetfiles: + issue28754-9.diff
2016-12-10 14:54:54mdksetmessages: + msg282854
2016-12-10 09:36:39mdksetfiles: + issue28754-8.diff

messages: + msg282842
2016-12-04 16:11:20mdksetmessages: + msg282350
2016-12-03 23:08:28martin.pantersetmessages: + msg282308
2016-11-30 04:01:14rhettingersetmessages: + msg282063
2016-11-29 21:34:29martin.pantersetmessages: + msg282049
2016-11-29 20:22:37skrahsetmessages: + msg282041
2016-11-29 20:00:52vstinnersetmessages: + msg282038
2016-11-29 19:01:27skrahsetnosy: + skrah
messages: + msg282030
2016-11-29 18:43:30mdksetfiles: + issue28754-7.diff

messages: + msg282028
2016-11-29 01:01:44martin.pantersetmessages: + msg281926
2016-11-28 17:50:08vstinnersetmessages: + msg281901
2016-11-28 17:23:22serhiy.storchakasetmessages: + msg281899
2016-11-28 17:03:21serhiy.storchakasetmessages: + msg281892
2016-11-28 16:51:11vstinnersetmessages: + msg281888
2016-11-28 16:16:14rhettingersetmessages: + msg281880
2016-11-28 14:22:17mdksetmessages: + msg281878
2016-11-27 20:12:24rhettingersetmessages: + msg281834
2016-11-27 18:37:28rhettingersetassignee: rhettinger
messages: + msg281832
2016-11-26 19:18:14serhiy.storchakasetmessages: + msg281792
2016-11-26 19:00:04rhettingersetmessages: + msg281791
2016-11-26 15:06:44vstinnersetmessages: + msg281779
2016-11-26 12:52:19mdksetmessages: + msg281772
2016-11-26 11:44:19mdksetmessages: + msg281768
2016-11-26 10:45:07serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg281765
2016-11-26 10:26:37mdksetmessages: + msg281764
2016-11-26 10:19:50rhettingersetmessages: + msg281763
2016-11-26 09:27:42mdksetmessages: + msg281762
2016-11-26 09:23:22mdksetfiles: + issue28754-6.diff

messages: + msg281761
2016-11-25 19:37:06mdksetmessages: + msg281734
2016-11-25 13:35:20martin.pantersetmessages: + msg281717
2016-11-25 13:07:21vstinnersetfiles: + check_bisect_doc.py

messages: + msg281713
2016-11-25 13:00:17martin.pantersetnosy: + martin.panter
messages: + msg281712
2016-11-25 08:29:46vstinnersetmessages: + msg281685
2016-11-24 23:37:00mdksetfiles: + issue28754-5.diff

messages: + msg281669
2016-11-24 21:37:27martin.panterlinkissue20185 dependencies
2016-11-24 21:08:51vstinnersetnosy: + vstinner
messages: + msg281652
2016-11-24 16:19:25mdksetfiles: + issue28754-4.diff
2016-11-20 21:34:30mdksetmessages: + msg281303
2016-11-20 21:33:31mdksetfiles: + issue28754-3.diff
2016-11-20 21:30:57mdksetfiles: + insort-left.diff
2016-11-20 21:30:42mdksetfiles: + insort.diff
2016-11-20 21:30:33mdksetfiles: + bisect.diff
2016-11-20 21:30:19mdksetfiles: + bisect_left.diff
2016-11-20 21:30:08mdksetmessages: + msg281302
2016-11-20 20:12:56mdksetmessages: + msg281299
2016-11-20 17:11:51serhiy.storchakasetnosy: + rhettinger

versions: + Python 3.7
2016-11-20 17:09:41larrysetmessages: + msg281285
2016-11-20 17:08:08larrysetmessages: + msg281283
2016-11-20 16:56:01mdksetfiles: + issue28754-2.diff
2016-11-20 16:52:24mdksetfiles: + issue28754.diff
keywords: + patch
2016-11-20 16:51:59mdkcreate