classification
Title: random.choices() raises "int too large" error while random.randint does not
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.8

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: mark.dickinson, mathtester, rhettinger, serhiy.storchaka
Priority: normal
Keywords:

Created on 2020-09-25 05:14 by mathtester, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg377480 - Author: Heng Sun (mathtester) Date: 2020-09-25 05:14
If I run this one line of code:

random.choices(range(2**100), k=5)

I get this error:

OverflowError: Python int too large to convert to C ssize_t

But I can run an equivalent line that achieves the same result without error:

[random.randint(0, 2**100-1) for j in range(5)]

I understand that the issue comes from len() (see https://bugs.python.org/issue12159), but I still think random.choices() should be able to handle large integers.
msg377482 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2020-09-25 06:57
This is virtually a duplicate of issue40388.
msg377483 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2020-09-25 08:24
This is a known limitation and there isn't much we can do about it.

The root cause is that for performance reasons the len() function doesn't handle sizes larger than a C ssize_t:

    >>> len(range(2**100))
    Traceback (most recent call last):
    ... 
    OverflowError: Python int too large to convert to C ssize_t

For the same reason, you would also see the same error for random.choice():

    >>> random.choice(range(2**100))
    Traceback (most recent call last):
    ... 
    OverflowError: Python int too large to convert to C ssize_t

Since we can't get the size of the population, there isn't much that choice() or choices() can do about the situation without special-casing range objects and reconstructing what len() would have returned had it not been restricted. Given that this doesn't seem to have ever been a problem in practice, I recommend just using randrange() in a loop.
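
The recommendation above could be sketched roughly as follows. This is only an illustration, not code from the random module: range_size() and choices_from_range() are hypothetical helper names. The first shows one way the size of a range could be reconstructed from its start, stop and step attributes without calling len(); the second draws with replacement using randrange() in a loop, on the assumption that the population is always a range object.

```python
import random


def range_size(r):
    """Hypothetical helper: count the items in a range without calling len()."""
    if r.step > 0:
        return max(0, (r.stop - r.start + r.step - 1) // r.step)
    return max(0, (r.start - r.stop - r.step - 1) // -r.step)


def choices_from_range(r, k):
    """Hypothetical helper: pick k items from r with replacement, never calling len(r)."""
    # randrange() works on arbitrarily large Python ints and takes the same
    # (start, stop, step) triple that defines the range object.
    return [random.randrange(r.start, r.stop, r.step) for _ in range(k)]


big = range(2**100)
print(range_size(big) == 2**100)   # size recovered without len()
print(choices_from_range(big, 5))  # five random values below 2**100
```

Like randrange() itself, choices_from_range() raises ValueError for an empty range. It does not support the weights or cum_weights arguments of random.choices().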
History
Date                 User              Action  Args
2022-04-11 14:59:36  admin             set     github: 86026
2020-09-25 08:24:21  rhettinger        set     status: open -> closed; resolution: not a bug; messages: + msg377483; stage: resolved
2020-09-25 06:57:57  serhiy.storchaka  set     nosy: + rhettinger, serhiy.storchaka, mark.dickinson; messages: + msg377482
2020-09-25 05:14:18  mathtester        create