Issue 1746088: long.__str__ is quadratic time

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/45145

classification

Title:	long.__str__ is quadratic time
Type:		Stage:
Components:	Library (Lib)	Versions:	Python 2.6

process

Status:	closed	Resolution:	duplicate
Dependencies:		Superseder:	Asymptotically faster divmod and str(long) View: 3451
Assigned To:		Nosy List:	dbenbenn, mark.dickinson, rhettinger
Priority:	normal	Keywords:

Created on 2007-07-01 15:42 by dbenbenn, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (6)
msg32438 - (view)	Author: David Benbennick (dbenbenn)	Date: 2007-07-01 15:42
In the 2.5.1 source code, Objects/longobject.c:long_format() is used to convert a long int to a string. When base is not a power of 2, it is quadratic time in the length of the output string. Of course, this is fine on small numbers, but is catastrophic on huge numbers. Suppose base is 10. The problem is that the algorithm basically does the following: take the number mod 10 to get a digit, stick that digit on the output, divide the number by 10, and repeat. To print an n digit number, there is an O(n log n) algorithm, using divide-and-conquer. You break the number into 2 pieces, each n/2 digits long, and iterate on both pieces. Converting string to long has the same quadratic time problem, in PyLong_FromString(). The solution is the same, in reverse: break the string in half, convert each piece to a long, and combine the two longs into one. Alternatively, Python could just use GMP (GNU MP Bignum Library, http://gmplib.org/) to provide long integers. That would make other operations, such as * and /, more efficient, too. But it would require a much bigger change.
msg32439 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2007-07-03 14:22
I'd call this a feature request rather than a bug. If I understand correctly, an O(n^(1+epsilon)) printing algorithm would rely on having an FFT-based fast multiplication algorithm, together with some form of divide-and-conquer division---is this right? These algorithms are nontrivial to implement efficiently, and even then the crossover point (where the FFT-based method becomes faster than the quadratic method) is likely to be in the thousands of digits. So I can't imagine there's much demand for this---even a 4096-bit RSA key is only 1233 (or 1234) digits long. If you just want subquadratic printing (O(n^1.585) or so) then you'd still need a subquadratic division (Python already has Karatsuba multiplication for large integers); here I guess the crossover would be smaller. A subquadratic division for Python might well be of interest to the developers, if someone could be persuaded to write and test one, and demonstrate a significant positive impact on performance. What's your use-case for printing huge integers fast? It doesn't seem like a very common need. Regarding GMP, I believe there are licensing issues: it's not legal to include GMP in core Python and release Python under its current non-GPL license, or something like that---I don't know anything about the details. But have you encountered Martelli's gmpy? http://code.google.com/p/gmpy/
msg32440 - (view)	Author: David Benbennick (dbenbenn)	Date: 2007-07-03 18:23
> rely on having an FFT-based fast multiplication algorithm, together with > some form of divide-and-conquer division---is this right? Yes, that's true: fast str() relies on fast division. I had assumed Python already had fast division; if it doesn't, I'd consider that a bug, too. > What's your use-case for printing huge integers fast? Note that it's not a question of printing them fast. With a quadratic time algorithm, it's infeasible to print huge numbers at all. My personal use case is doing computations in Thompson's group F; an element of F is a list of humongous fractions. But I expect it's a problem that often comes up in mathematical programming. I'll admit it isn't a very important bug, since anyone who is harmed by it will either use a different language, or use gmpy, or print in hex. But it's still a bug. > Regarding GMP, I believe there are licensing issues: it's not legal to > include GMP in core Python and release Python under its current non-GPL > license, or something like that---I don't know anything about the details. I don't see what the problem would be. Python's LICENSE file says that Python's license is GPL compatible. And in any case, GMP is LGPL, not GPL, so any program can link to it.
msg59938 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-01-14 23:33
I would be happy with a patch that does divide-and-conquer. That code would be much easier to get correct than the FFT algorithm and it would still give nice Big-O results.
msg68586 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-06-22 22:24
Still waiting for the patch.
msg70703 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2008-08-04 15:57
Closing this as a duplicate; it's superseded by issue 3451.

History
Date	User	Action	Args
2022-04-11 14:56:25	admin	set	github: 45145
2008-08-04 15:57:54	mark.dickinson	set	status: open -> closed resolution: duplicate superseder: Asymptotically faster divmod and str(long) messages: + msg70703
2008-06-22 22:25:00	rhettinger	set	assignee: rhettinger -> messages: + msg68586
2008-01-14 23:33:42	rhettinger	set	messages: + msg59938
2007-11-27 17:33:05	rhettinger	set	assignee: rhettinger nosy: + rhettinger versions: + Python 2.6, - Python 2.5
2007-07-01 15:42:45	dbenbenn	create