Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

random.choice: raise IndexError on empty sequence even when not using getrandbits internally #77384

Closed
wm75 mannequin opened this issue Apr 1, 2018 · 10 comments
Closed
Assignees
Labels
3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@wm75
Copy link
Mannequin

wm75 mannequin commented Apr 1, 2018

BPO 33203
Nosy @rhettinger, @serhiy-storchaka, @wm75, @selik
PRs
  • bpo-33203: Ensure random.choice always raises IndexError on empty sequence #6338
  • [3.7] bpo-33203: Ensure random.choice always raises IndexError on empty sequence (GH-6338) #6387
  • [3.6] bpo-33203: Ensure random.choice always raises IndexError on empty sequence (GH-6338) #6388
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2018-04-05.16:24:53.997>
    created_at = <Date 2018-04-01.20:04:10.759>
    labels = ['3.7', '3.8', 'type-bug', 'library']
    title = 'random.choice: raise IndexError on empty sequence even when not using getrandbits internally'
    updated_at = <Date 2018-04-05.16:24:53.997>
    user = 'https://github.com/wm75'

    bugs.python.org fields:

    activity = <Date 2018-04-05.16:24:53.997>
    actor = 'rhettinger'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2018-04-05.16:24:53.997>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2018-04-01.20:04:10.759>
    creator = 'wolma'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33203
    keywords = ['patch']
    message_count = 10.0
    messages = ['314787', '314835', '314837', '314840', '314869', '314870', '314957', '314990', '314991', '314992']
    nosy_count = 4.0
    nosy_names = ['rhettinger', 'serhiy.storchaka', 'wolma', 'selik']
    pr_nums = ['6338', '6387', '6388']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue33203'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8']

    @wm75
    Copy link
    Mannequin Author

    wm75 mannequin commented Apr 1, 2018

    from https://docs.python.org/3/library/random.html#random.choice:

    Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.

    Indeed:
    >>> import random
    >>> random.choice([])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/wolma/cpython/random.py", line 259, in choice
        raise IndexError('Cannot choose from an empty sequence') from None
    IndexError: Cannot choose from an empty sequence
    
    but when not using getrandbits internally:
    >>> class MyRandom(random.Random):
    ...     def random(self):
    ...         return super().random()
    ... 
    >>> my_random=MyRandom()
    >>> my_random.choice([])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/wolma/cpython/random.py", line 257, in choice
        i = self._randbelow(len(seq))
      File "/home/wolma/cpython/random.py", line 245, in _randbelow
        rem = maxsize % n
    ZeroDivisionError: integer division or modulo by zero

    This is because the ValueError that random.choice tries to catch gets raised only in the getrandbits-dependent branch of Random._randbelow, but not in the branch using only Random.random (even though Random._randbelow suggests uniform behaviour.

    @wm75 wm75 mannequin added 3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Apr 1, 2018
    @selik
    Copy link
    Mannequin

    selik mannequin commented Apr 2, 2018

    If you're going to tackle this problem, this should probably be solved for the general case of n <= 0 rather than just n == 0.

    In [1]: import random

    In [2]: class Random(random.Random):
    ...: def random(self):
    ...: return super().random()
    ...:

    In [3]: r = Random()

    In [4]: r._randbelow(-1) # Should raise a ValueError
    Out[4]: 0

    But honestly, if someone is going to override random I think we can expect them to realize they're stepping into the deep end.

    @rhettinger
    Copy link
    Contributor

    FWIW, it was intended that the error checking (when required) occur at higher levels in the API rather than in the inner-most non-public utility function. Some calls to _randbelow cannot be zero or negative, so they shouldn't have to pay an penalty for the extra error check. A comment to this effect should be added but I don't think the design should be changed.

    @rhettinger rhettinger self-assigned this Apr 2, 2018
    @rhettinger
    Copy link
    Contributor

    I'll mull this over for while. There is some merit in getting the various components to fit together more uniformly. There's also the option of having choice() catch either a ZeroDivisionError or ValueError.

    @wm75
    Copy link
    Mannequin Author

    wm75 mannequin commented Apr 3, 2018

    @rhettinger: the reason the ValueError gets raised correctly in the getrandbits-dependent branch is because getrandbits itself does a n<=0 check (in C for random.Random, in Python for random.SystemRandom).
    So I thought the real reason why _randbelow does not perform the check was that it would be redundant, which it isn't when only random is used.

    So instead of thinking about the suggested check as something that impacts performance (which certainly it does), I would rather see it as adding something into that second branch that was forgotten and gave that branch a performance benefit over the other one, which it never deserved.
    It is also worthwhile to remember that any performance impact of this will only hit subclasses of random.Random that define random, but not getrandbits. Your alternative idea of having random.choice catch (ValueError, ZeroDivisionError) would affect random.Random and all subclasses.
    If a subclass defines getrandbits, it needs to perform that check anyway and, thinking about this more, I suggest that the requirement for any user-provided getrandbits to raise ValueError on k <= 0 should actually be added to the getrandbits docs.

    @wm75
    Copy link
    Mannequin Author

    wm75 mannequin commented Apr 3, 2018

    @selik: it's true _randbelow doesn't work for negative numbers, but the difference is that both branches are affected, the docstring does not make any guarantees about it, and no public part of the random module is affected by this behavior. In addition, "fixing" _randbelow for negative input cannot be done without impacting performance of several public methods of random.Random so I don't think this should be done.

    @rhettinger
    Copy link
    Contributor

    The patch looks fine. Once a news blurb is added, it can go forward.

    @rhettinger
    Copy link
    Contributor

    New changeset 091e95e by Raymond Hettinger (Wolfgang Maier) in branch 'master':
    bpo-33203: Ensure random.choice always raises IndexError on empty sequence (GH-6338)
    091e95e

    @rhettinger
    Copy link
    Contributor

    New changeset baf304e by Raymond Hettinger (Miss Islington (bot)) in branch '3.7':
    bpo-33203: Ensure random.choice always raises IndexError on empty sequence (GH-6338) (GH-6387)
    baf304e

    @rhettinger
    Copy link
    Contributor

    New changeset e25af93 by Raymond Hettinger (Miss Islington (bot)) in branch '3.6':
    bpo-33203: Ensure random.choice always raises IndexError on empty sequence (GH-6338) (GH-6388)
    e25af93

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant