Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DeprecationWarning triggers for sequences which happen to be sets as well #86636

Closed
masklinn mannequin opened this issue Nov 26, 2020 · 9 comments
Closed

DeprecationWarning triggers for sequences which happen to be sets as well #86636

masklinn mannequin opened this issue Nov 26, 2020 · 9 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir

Comments

@masklinn
Copy link
Mannequin

masklinn mannequin commented Nov 26, 2020

BPO 42470
Nosy @tim-one, @rhettinger, @masklinn, @serhiy-storchaka
PRs
  • bpo-42470: Do not warn on sequences which are also sets in random.sample() #23639
  • bpo-42470: Do not warn on sequences which are also sets in random.samples #23665
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2020-12-19.04:35:57.189>
    created_at = <Date 2020-11-26.08:23:55.346>
    labels = ['library', '3.10']
    title = 'DeprecationWarning triggers for sequences which happen to be sets as well'
    updated_at = <Date 2020-12-19.04:35:57.186>
    user = 'https://github.com/masklinn'

    bugs.python.org fields:

    activity = <Date 2020-12-19.04:35:57.186>
    actor = 'rhettinger'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2020-12-19.04:35:57.189>
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation = <Date 2020-11-26.08:23:55.346>
    creator = 'xmorel'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42470
    keywords = ['patch']
    message_count = 9.0
    messages = ['381885', '381909', '381931', '382286', '382485', '382486', '382598', '383359', '383360']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'rhettinger', 'xmorel', 'python-dev', 'serhiy.storchaka']
    pr_nums = ['23639', '23665']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue42470'
    versions = ['Python 3.10']

    @masklinn
    Copy link
    Mannequin Author

    masklinn mannequin commented Nov 26, 2020

    In 3.9, using random.sample on sets triggers

    DeprecationWarning: Sampling from a set deprecated
    

    since Python 3.9 and will be removed in a subsequent version.

    *However* it also triggers on types which implement *both* Sequence and Set, despite Sequence on its own being fine.

    The issue is that it first checks for Set and triggers a warning, and only then checks that the input is a sequence:

            if isinstance(population, _Set):
                _warn('Sampling from a set deprecated\n'
                      'since Python 3.9 and will be removed in a subsequent version.',
                      DeprecationWarning, 2)
                population = tuple(population)
            if not isinstance(population, _Sequence):
                raise TypeError("Population must be a sequence.  For dicts or sets, use sorted(d).")

    the check should rather be:

            if not isinstance(population, _Sequence):
                if isinstance(population, _Set):
                    _warn('Sampling from a set deprecated\n'
                          'since Python 3.9 and will be removed in a subsequent version.',
                          DeprecationWarning, 2)
                    population = tuple(population)
                else:
                    raise TypeError("Population must be a sequence.  For dicts or sets, use sorted(d).")

    this also only incurs a single instance check for _Sequence types instead of two.

    @masklinn masklinn mannequin added 3.9 only security fixes labels Nov 26, 2020
    @rhettinger
    Copy link
    Contributor

    +0

    Do you want to submit a PR for this?

    Some thoughts:

    • The current logic matches the logic before the warning was added.
    • The proposed logic matches what the code will do after the
      deprecation period (it will only check for non-sequences).
    • There is some value in the warning in that it lets you know an
      inefficient code path is being used (i.e. the conversion to a tuple).
    • The proposed logic doesn't just change the warning, it changes
      what actually happens to the data. IMO the change is for the
      better, but it is a behavior change and could potentially cause
      a failure in someone's code.
    • The case of an object that is both a sequence and a set is
      likely very rare.

    @masklinn
    Copy link
    Mannequin Author

    masklinn mannequin commented Nov 27, 2020

    Do you want to submit a PR for this?

    Sure. Do you think the code I proposed would be suitable?

    • The current logic matches the logic before the warning was added.
    • The proposed logic matches what the code will do after the
      deprecation period (it will only check for non-sequences).

    Yes, that was my basis for it as it seemed sensible, but you're right that it's a bit of a behavioural change as you note:

    • There is some value in the warning in that it lets you know an
      inefficient code path is being used (i.e. the conversion to a tuple).
    • The proposed logic doesn't just change the warning, it changes
      what actually happens to the data. IMO the change is for the
      better, but it is a behavior change and could potentially cause
      a failure in someone's code.

    Aye, and also I guess the "sequence" implementation of the input collection might be less efficient than one-shot converting to a set and sampling from the set.

    • The case of an object that is both a sequence and a set is
      likely very rare.

    Chances are you're right, but it's what got me to stumble upon it ($dayjob makes significant use of a "unique list / ordered set" smart-ish collection, that collection was actually registered against Set and Sequence specifically because Python 3's random.sample typechecks those, we registered against both as the collection technically implements both interfaces so that seemed like a good idea at the time).

    @rhettinger
    Copy link
    Contributor

    >> Do you want to submit a PR for this?

    Sure. Do you think the code I proposed would be suitable?

    Yes. It will need tests and a news entry as well.

    @masklinn
    Copy link
    Mannequin Author

    masklinn mannequin commented Dec 4, 2020

    I was preparing to open the PR but now I'm doubting: should I open the PR against master and miss islington will backport it, or should I open the PR against 3.9?

    @serhiy-storchaka
    Copy link
    Member

    Open it against master.

    @masklinn
    Copy link
    Mannequin Author

    masklinn mannequin commented Dec 6, 2020

    Tried patterning the PR after the one which originally added the warning. Wasn't too sure how the news item was supposed to be generated, and grepping the repository didn't reveal any clear script doing that, so I made up a date and copied an existing random bit (which I expect is just a deduplicator in case multiple NEWS items are created at the same instant?)

    @rhettinger
    Copy link
    Contributor

    New changeset 1e27b57 by masklinn in branch 'master':
    bpo-42470: Do not warn on sequences which are also sets in random.sample() (GH-23665)
    1e27b57

    @rhettinger
    Copy link
    Contributor

    Given that this is rare and that it is a behavior change, I'm only applying this to 3.10. If you think it is essential for 3.9, feel free to reopen and we'll discuss it with the release manager.

    @rhettinger rhettinger added stdlib Python modules in the Lib dir 3.10 only security fixes and removed 3.9 only security fixes labels Dec 19, 2020
    @rhettinger rhettinger added stdlib Python modules in the Lib dir 3.10 only security fixes and removed 3.9 only security fixes labels Dec 19, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants