This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: listsort.txt wrongly assumes you cannot calculate leading zeros in O(1) time.
Type: performance Stage: resolved
Components: Documentation Versions: Python 3.11
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: tim.peters Nosy List: chemoelectric, docs@python, rhettinger, tim.peters
Priority: normal Keywords:

Created on 2022-01-23 17:08 by chemoelectric, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg411384 - (view) Author: Barry Schwartz (chemoelectric) Date: 2022-01-23 17:08
The Objects/listsort.txt incorrectly implies that it is not possible to compute leading zero bits in O(1) time, using only standard C. For a fixed integer size it can be done, for instance, using de Bruijn sequences. See https://www.chessprogramming.org/BitScan

(The existence of such methods is not as widely known as it ought to be.)
msg411425 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2022-01-23 22:11
Presumably the OP is referring to this text:

"""
`powerloop()` emulates these divisions, 1 bit at a time, using comparisons,
subtractions, and shifts in a loop.

You'll notice the paper uses an O(1) method instead, but that relies on two
things we don't have:

- An O(1) "count leading zeroes" primitive. We can find such a thing as a C
  extension on most platforms, but not all, and there's no uniform spelling
  on the platforms that support it.

- Integer division on an integer type twice as wide as needed to hold the
  list length. But the latter is Py_ssize_t for us, and is typically the
  widest native signed integer type the platform supports.

But since runs in our algorithm are almost never very short, the once-per-run
overhead of `powerloop()` seems lost in the noise.

"""
msg411426 - (view) Author: Barry Schwartz (chemoelectric) Date: 2022-01-23 22:21
Yes. Actually the issue is branching, not order of complexity, because looping at most 64 times is a linear-bounded operation. The methods I point out involve no branching! And so can be rather fast. I don't suggest they be used, but that the listsort.txt be revised.
msg411428 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2022-01-23 22:24
I'm not inclined to change anything here. It's a trivial point, and by "primitive" I had in mind a dedicated hardware instruction, blazing fast. Yes, I was aware of long-winded ways of doing it for specific fixed integer widths. But that's not what `O(1)` means. A dead obvious loop testing each bit, one at a time, starting with the MSB, until finding the first bit set, is also O(1) for any fixed-width int size.

So I'm not doing anything here. If someone else creates a PR with text they want to see instead, I'll review it, and if it's not unbearably pedantic ;-) I'll merge it.
msg411429 - (view) Author: Barry Schwartz (chemoelectric) Date: 2022-01-23 22:24
I meant constant bounded
msg411444 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2022-01-24 04:49
For any fixed width integer type, the worst case of the dead simple loop (all bits are zero) is a fixed upper bound. So you don't mean "constant bounded" either. You mean something more like "clever C code that usually runs faster than the obvious loop".

See my "if it's not unbearably pedantic" comment earlier ;-) Again, by "primitive" I meant HW-level primitive. I agree that's not wholly clear from what I wrote, but really don't care - it's a trivial point that makes no difference in context. The lack of an integer type in C wide enough to support the division the paper uses is _the_ deal-breaker. That C doesn't define a count-leading-zero function either is just a flea on the tail of that dog.
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90646
2022-01-24 06:48:38rhettingersetstatus: open -> closed
resolution: not a bug
stage: resolved
2022-01-24 04:49:46tim.peterssetmessages: + msg411444
2022-01-23 22:24:38chemoelectricsetmessages: + msg411429
2022-01-23 22:24:06tim.peterssetmessages: + msg411428
2022-01-23 22:21:02chemoelectricsetmessages: + msg411426
2022-01-23 22:11:12rhettingersetassignee: docs@python -> tim.peters

messages: + msg411425
nosy: + rhettinger
2022-01-23 17:15:26mark.dickinsonsetnosy: + tim.peters
2022-01-23 17:08:12chemoelectriccreate