This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author tim.peters
Recipients Dennis Sweeney, tim.peters
Date 2022-01-30.01:30:43
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1643506243.72.0.655388844522.issue46558@roundup.psfhosted.org>
In-reply-to
Content
The test case here is a = (1 << 100000000) - 1, a solid string of 100 million 1 bits. The goal is to convert to a decimal string.

Methods:

native: str(a)

numeral: the pure-Python numeral() function from bpo-3451's div.py, adapted to use the pure-Python divmod_fast() from the same report's fast_div.py.

todecstr: from the Python file attached to this report.

gmp: str() applied to gmpy2.mpz(a).
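
To make the "numeral" entry concrete, here's a minimal sketch of the divide-and-conquer scheme behind it: recursively split n by a power of 10 with roughly half its digits, so both halves convert in parallel subproblems. This is my own reconstruction of the idea, not bpo-3451's actual code; the function name and cutoff are mine, and I use the builtin divmod where the real thing gets its speed from the report's subquadratic divmod_fast().

```python
def numeral_sketch(n: int, cutoff: int = 64) -> str:
    """Divide-and-conquer int -> decimal string (sketch, not bpo-3451's code)."""
    if n < 0:
        return '-' + numeral_sketch(-n, cutoff)
    # Precompute the splitting powers 10**(cutoff * 2**k) by repeated squaring.
    powers = []
    p, width = 10 ** cutoff, cutoff
    while p <= n:
        powers.append((p, width))
        p, width = p * p, width * 2

    def inner(m: int, level: int) -> str:
        if level < 0:
            return str(m)  # small enough for the quadratic builtin
        p, width = powers[level]
        if m < p:
            return inner(m, level - 1)
        # m == hi * 10**width + lo, so the pieces concatenate as digit strings;
        # lo must be zero-padded to exactly `width` digits.
        hi, lo = divmod(m, p)
        return inner(hi, level - 1) + inner(lo, level - 1).zfill(width)

    return inner(n, len(powers) - 1)
```

With plain divmod this is still asymptotically quadratic overall; swapping in a subquadratic division (as div.py does) is what makes the recursion pay off.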

Timings:

native: don't know; gave up after waiting over 2 1/2 hours.
numeral: about 5 1/2 minutes.
todecstr: under 30 seconds. (*)
gmp: under 6 seconds.

So there's room for improvement ;-)

But here's the thing: I've lost count of how many times someone has whipped up a pure-Python implementation of a bigint algorithm that leaves CPython in the dust. And they're generally pretty easy to write in Python.

But then they die there, because converting to C is soul-crushing: the beauty, elegance, and compactness are lost to mountains of low-level details of memory management, refcounting, and checking for errors after every tiny operation.

So a new question in this endless dilemma: _why_ do we need to convert to C? Why not leave the extreme cases to Python code that's far easier to write and maintain? When we're cutting runtime from hours down to minutes, refusing to settle for 2 minutes just because C might theoretically cut that to 1 minute is focusing on entirely the wrong end.


(*) I hope this algorithm tickles you by defying expectations ;-) It essentially stands `numeral()` on its head by splitting the input by a power of 2 instead of by a power of 10. As a result _no_ divisions are used. But instead of shifting decimal digits into place, it has to multiply the high-end pieces by powers of 2. That seems insane on the face of it, but hard to argue with the clock ;-) The "tricks" here are that the O(log log n) powers of 2 needed can be computed efficiently in advance of any splitting, and that all the heavy arithmetic is done _in_ the `decimal` module, which implements fancier-than-Karatsuba multiplication and whose values can be converted to decimal strings very quickly.
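
The paragraph above can be sketched in a few lines. This is my own minimal reconstruction of the idea, not the todecstr attached to the report; the names `int_to_decimal_str` and `BITLIM` are mine. The input is split by powers of 2 (shifts and masks, never division), and every multiply and add happens inside `decimal`, whose big-number multiplication beats CPython's int and whose str() is cheap:

```python
import decimal

BITLIM = 512  # below this, plain Decimal(int) conversion is fast enough

def int_to_decimal_str(n: int) -> str:
    """Division-free int -> decimal string, in the spirit of todecstr."""
    if n < 0:
        return '-' + int_to_decimal_str(-n)
    # An effectively exact context: huge precision, so nothing ever rounds.
    ctx = decimal.Context(prec=decimal.MAX_PREC,
                          Emax=decimal.MAX_EMAX,
                          Emin=decimal.MIN_EMIN)

    # Precompute the O(log log n) Decimal powers of 2 needed for splitting,
    # by repeated squaring -- all heavy arithmetic stays inside `decimal`.
    pow2 = {}
    w = BITLIM
    p = decimal.Decimal(1 << BITLIM)  # exact: 2**BITLIM as a Decimal
    while w < n.bit_length():
        pow2[w] = p
        p = ctx.multiply(p, p)
        w <<= 1

    def inner(m: int, width: int) -> decimal.Decimal:
        # m has at most `width` bits; split by a shift and a mask, no division.
        if width <= BITLIM:
            return decimal.Decimal(m)
        half = width >> 1
        hi, lo = m >> half, m & ((1 << half) - 1)
        # m == hi * 2**half + lo, reassembled entirely in Decimal arithmetic.
        return ctx.add(ctx.multiply(inner(hi, half), pow2[half]),
                       inner(lo, half))

    return str(inner(n, w))
```

Because every intermediate Decimal is an exact integer with exponent 0, the final str() spills the digits directly, with no conversion work left to do.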
History
Date | User | Action | Args
2022-01-30 01:30:43 | tim.peters | set | recipients: + tim.peters, Dennis Sweeney
2022-01-30 01:30:43 | tim.peters | set | messageid: <1643506243.72.0.655388844522.issue46558@roundup.psfhosted.org>
2022-01-30 01:30:43 | tim.peters | link | issue46558 messages
2022-01-30 01:30:43 | tim.peters | create |