Title: "bisect" module should support reverse-sorted sequences
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.10
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: mCoding, rhettinger, terry.reedy
Priority: normal Keywords:

Created on 2021-02-22 18:12 by mCoding, last changed 2021-03-05 06:06 by rhettinger. This issue is now closed.

Messages (4)
msg387523 - (view) Author: James Murphy (mCoding) Date: 2021-02-22 18:12
Currently, the bisect module's functions all assume the user is maintaining a sorted list/sequence in increasing order. From the docs:

"This module provides support for maintaining a list in sorted order without having to sort the list after each insertion"

However, bisection is not limited to this use case, nor is it limited to increasing sequences. Many algorithms involve bisecting a sequence derived from (potentially immutable) user input, such as the standard longest-increasing-subsequence algorithm. Sometimes these derived sequences are naturally reverse-sorted, as they would be in the longest-decreasing-subsequence algorithm. (I have personally had to work around bisect not working with decreasing sequences in this case). There are other natural reasons for a reverse-sorted list to appear that one might want to bisect. Of course, a monotone decreasing function applied to a sorted list would result in a reverse-sorted one. One may want to use bisect as a root-finding algorithm for a decreasing function that may not be differentiable (or even continuous). Or a user may simply have chosen to reverse-sort using sorted(a, reverse=True).

In my mind, the bisect module's purpose should be to support common use cases of bisection, not specifically to maintain a sorted list.

So then the question arises, how to support reverse-sorted sequences? I see a few possible routes.

1. Add a "decreasing" parameter to bisect_left, bisect_right, (and perhaps insort_left, insort_right as well).

2. Add dedicated functions bisect_left_decreasing, bisect_right_decreasing, (and perhaps insort_left_decreasing, insort_right_decreasing as well) that operate on reverse-sorted sequences.

3. Add a more general bisect_first_false(a, pred, lo, hi) (equivalent to C++'s std::partition_point) that assumes the sequence to be partitioned with respect to the predicate function and returns the first index such that the predicate is falsey, or hi if no falsey elements are found. Then reverse-sorted lists can be bisected with something like bisect_first_false(a, lambda y: y > x). This way may be too general, but it becomes very obvious what operators will be called on the values and what the post-condition of the function is, similar to the condition of a while loop.

4. Add a "cmp" parameter to bisect_left, bisect_right, (and perhaps insort_left, insort_right as well) that keys are meant to be compared with, so one would pass bisect_left(a, x, It could be used in conjuction with "key", i.e. "cmp" is not meant to be a replacement for "key" as it was with sorted.

5. Do nothing. People who _really_ want to bisect reverse-sorted lists will write their own bisect_*_decreasing or figure out that if "a" is reverse-sorted, the following monstrosity does the job, making ab(use) of the new "key" parameter in 3.10 and due to the fact that False < True:

a = ... # reverse-sorted
x = ...

# bisect left on reverse-sorted
pred = lambda y: not (x < y)
idx = bisect_left(a, x, key=pred)

# bisect right on reverse-sorted
pred = lambda y: y < x
idx = bisect_right(a, x, key=pred)

Commentary on the choices. 1 or 4 are the most convenient from a user perspective, no extra imports and discoverable behavior. 4 also has the benefit that the element/key class does not need to define a __lt__, which can be useful in cases where the element/key has many comparable attributes and it does not make sense to pick one to be used to define a __lt__. 3 would have the most generality and usefulness to library writers, who would probably be the primary users in need of a reverse-sorted bisection implementation. 2 suffers from combinatorial explosion (many new functions) and extra importing needed, but allows the implementation of existing functions to be unchanged, as well as mildly discouraging users from using it unless they _really_ want to use it. 5 will lead to many incorrect home-grown bisection searches or other hacks being used in the wild.

Personally, I am somewhat indifferent between 1 and 4, perhaps slightly learning towards 4 as it makes explicit exactly how elements/keys will be compared. I would like to see 3 implemented as well, but admit that it is likely too general a solution for the proposed issue "support reverse-sorted lists".
msg387540 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-02-22 19:56
I prefer option 5 and leaving the module as-is.  The bisect module api is slice oriented and, in general, people have a hard time reasoning about slices for reverse ordered sequences.  Even the forward ordered API is a bit awkward so that we had to document how to implement find_lt(), find_le(), etc. because it wasn't obvious how to do so.  Also, it doesn't seem to come-up much (the bisect module is over two decades old).  So, I don't think the API and the internals should be gummed up for this.

My vote is -1 because it would cause more problems than it would solve.  Let's cater to the common use cases rather than every possible variant.
msg387762 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-02-27 03:35
I think that we should just document the use of the new key parameter (and otherwise do nothing).
msg388148 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-03-05 06:06
Thank you for the suggestion, but am going close this for the reasons mentioned above.
Date User Action Args
2021-03-05 06:06:29rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg388148

stage: resolved
2021-02-27 03:35:41terry.reedysetnosy: + terry.reedy
messages: + msg387762
2021-02-22 19:56:37rhettingersetnosy: + rhettinger
messages: + msg387540
2021-02-22 18:12:33mCodingcreate