This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author remi.lapeyre
Recipients Dima.Tisnek, NeilGirdhar, alex, bls, dmtr, ericreynolds, gregory.p.smith, jafo, josh.r, mark.dickinson, martin.panter, milko.krachounov, remi.lapeyre, rhettinger, tebeka, terry.reedy, wumpus, yselivanov
Date 2019-02-18.15:13:32
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1550502813.23.0.362995259727.issue4356@roundup.psfhosted.org>
In-reply-to
Content
Hi everybody, I opened PR 11781 to add a key argument to functions in the bisect
module. I agree with @dmtr's points that this addition is not a bad design.

As far as I can tell, the key function is at called at most once per item as this
example where an assertion would break shows:


    import bisect
    from collections import defaultdict


    class Test:
        def __init__(self, value):
            self.value = value


    cache = defaultdict(int)

    def key(e):
        cache[e] += 1
        assert cache[e] <= 1
        return e.value


    l = [Test(i) for i in range(10000)]

    bisect.bisect(l, Test(25), key=key)


    ➜  cpython git:(add-key-argument-to-bisect) ./python.exe
    Python 3.8.0a1+ (heads/add-key-argument-to-bisect:b7aaa1adad, Feb  7 2019, 17:33:24) 
    [Clang 10.0.0 (clang-1000.10.44.4)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import bisect
    >>> from collections import defaultdict
    >>> 
    >>> 
    >>> class Test:
    ...     def __init__(self, value):
    ...         self.value = value
    ... 
    >>> 
    >>> cache = defaultdict(int)
    >>> 
    >>> def key(e):
    ...     cache[e] += 1
    ...     assert cache[e] <= 1
    ...     return e.value
    ... 
    >>> 
    >>> l = [Test(i) for i in range(10000)]
    >>> 
    >>> bisect.bisect(l, Test(25), key=key)
    26


This argument can be used where the objects are immutable and I have not been able
to see changes in bisect speed using Victor Stinner perf module:

(cpython-venv) ➜  cpython git:(add-key-argument-to-bisect) ✗ make distclean && ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) && make -sj && python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25)" -o $(git rev-parse --short HEAD).json
(cpython-venv) ➜  cpython git:(add-key-argument-to-bisect) ✗ git checkout cd90f6a369
(cpython-venv) ➜  cpython git:(cd90f6a369) ✗ make distclean && ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) && make -sj && python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25)" -o $(git rev-parse --short HEAD).json
(cpython-venv) ➜  cpython git:(cd90f6a369) ✗ python -m perf compare_to cd90f6a369.json b7aaa1adad.json
Mean +- std dev: [cd90f6a369] 36.2 us +- 1.0 us -> [b7aaa1adad] 35.7 us +- 0.5 us: 1.01x faster (-1%)

(cd90f6a369 was somtime faster than b7aaa1adad, sometime slower but they always
were less than one std dev from one another)

As I wrote in the discussion of the PR, I suspect the branch predictor to predict
reliably the branching in the hot path (though I don't know much about that and
would love some input).


For the record, here is the performance when a key function is given:

(cpython-venv) ➜  cpython git:(add-key-argument-to-bisect) ✗ python -m perf timeit --rigorous -s "from bisect import bisect" "bisect(range(1_000_000_000_000_000), 25, key=lambda e: e)"

.........................................
Mean +- std dev: 59.3 us +- 1.0 us

It seems to me that adding the key parameter is the best solution possible.
History
Date User Action Args
2019-02-18 15:13:33remi.lapeyresetrecipients: + remi.lapeyre, rhettinger, terry.reedy, gregory.p.smith, jafo, tebeka, mark.dickinson, alex, milko.krachounov, dmtr, bls, martin.panter, Dima.Tisnek, yselivanov, NeilGirdhar, josh.r, ericreynolds, wumpus
2019-02-18 15:13:33remi.lapeyresetmessageid: <1550502813.23.0.362995259727.issue4356@roundup.psfhosted.org>
2019-02-18 15:13:33remi.lapeyrelinkissue4356 messages
2019-02-18 15:13:32remi.lapeyrecreate