This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: DeprecationWarning triggers for sequences which happen to be sets as well
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: python-dev, rhettinger, serhiy.storchaka, tim.peters, xmorel
Priority: normal Keywords: patch

Created on 2020-11-26 08:23 by xmorel, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 23639 closed python-dev, 2020-12-04 13:30
PR 23665 merged xmorel, 2020-12-06 16:13
Messages (9)
msg381885 - (view) Author: Xavier Morel (xmorel) * Date: 2020-11-26 08:23
In 3.9, using `random.sample` on sets triggers

    DeprecationWarning: Sampling from a set deprecated
since Python 3.9 and will be removed in a subsequent version.

*However* it also triggers on types which implement *both* Sequence and Set, despite Sequence on its own being fine.

The issue is that it first checks for Set and triggers a warning, and only then checks that the input is a sequence:

        if isinstance(population, _Set):
            _warn('Sampling from a set deprecated\n'
                  'since Python 3.9 and will be removed in a subsequent version.',
                  DeprecationWarning, 2)
            population = tuple(population)
        if not isinstance(population, _Sequence):
            raise TypeError("Population must be a sequence.  For dicts or sets, use sorted(d).")

the check should rather be:

        if not isinstance(population, _Sequence):
            if isinstance(population, _Set):
                _warn('Sampling from a set deprecated\n'
                      'since Python 3.9 and will be removed in a subsequent version.',
                      DeprecationWarning, 2)
                population = tuple(population)
            else:
                raise TypeError("Population must be a sequence.  For dicts or sets, use sorted(d).")

this also only incurs a single instance check for `_Sequence` types instead of two.
msg381909 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-11-26 19:16
+0 

Do you want to submit a PR for this?

Some thoughts:

* The current logic matches the logic before the warning was added.
* The proposed logic matches what the code will do after the
  deprecation period (it will only check for non-sequences).
* There is some value in the warning in that it lets you know an
  inefficient code path is being used (i.e. the conversion to a tuple).
* The proposed logic doesn't just change the warning, it changes
  what actually happens to the data.  IMO the change is for the
  better, but it is a behavior change and could potentially cause
  a failure in someone's code.
* The case of an object that is both a sequence and a set is
  likely very rare.
msg381931 - (view) Author: Xavier Morel (xmorel) * Date: 2020-11-27 08:57
> Do you want to submit a PR for this?

Sure. Do you think the code I proposed would be suitable?

> * The current logic matches the logic before the warning was added.
> * The proposed logic matches what the code will do after the
>   deprecation period (it will only check for non-sequences).

Yes, that was my basis for it as it seemed sensible, but you're right that it's a bit of a behavioural change as you note:

> * There is some value in the warning in that it lets you know an
>   inefficient code path is being used (i.e. the conversion to a tuple).
> * The proposed logic doesn't just change the warning, it changes
>   what actually happens to the data.  IMO the change is for the
>   better, but it is a behavior change and could potentially cause
>   a failure in someone's code.

Aye, and also I guess the "sequence" implementation of the input collection might be less efficient than one-shot converting to a set and sampling from the set.

> * The case of an object that is both a sequence and a set is
>   likely very rare.

Chances are you're right, but it's what got me to stumble upon it ($dayjob makes significant use of a "unique list / ordered set" smart-ish collection, that collection was actually registered against Set and Sequence specifically because Python 3's random.sample typechecks those, we registered against both as the collection technically implements both interfaces so that seemed like a good idea at the time).
msg382286 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-02 00:15
>>> Do you want to submit a PR for this?
>
> Sure. Do you think the code I proposed would be suitable?

Yes.  It will need tests and a news entry as well.
msg382485 - (view) Author: Xavier Morel (xmorel) * Date: 2020-12-04 13:14
I was preparing to open the PR but now I'm doubting: should I open the PR against master and miss islington will backport it, or should I open the PR against 3.9?
msg382486 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-04 13:17
Open it against master.
msg382598 - (view) Author: Xavier Morel (xmorel) * Date: 2020-12-06 16:16
Tried patterning the PR after the one which originally added the warning. Wasn't too sure how the news item was supposed to be generated, and grepping the repository didn't reveal any clear script doing that, so I made up a date and copied an existing random bit (which I expect is just a deduplicator in case multiple NEWS items are created at the same instant?)
msg383359 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-19 04:33
New changeset 1e27b57dbc8c1b758e37a531487813aef2d111ca by masklinn in branch 'master':
bpo-42470: Do not warn on sequences which are also sets in random.sample() (GH-23665)
https://github.com/python/cpython/commit/1e27b57dbc8c1b758e37a531487813aef2d111ca
msg383360 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-19 04:35
Given that this is rare and that it is a behavior change, I'm only applying this to 3.10.  If you think it is essential for 3.9, feel free to reopen and we'll discuss it with the release manager.
History
Date User Action Args
2022-04-11 14:59:38adminsetgithub: 86636
2020-12-19 04:35:57rhettingersetstatus: open -> closed
versions: + Python 3.10, - Python 3.9
messages: + msg383360

components: + Library (Lib)
resolution: fixed
stage: patch review -> resolved
2020-12-19 04:33:44rhettingersetmessages: + msg383359
2020-12-06 16:16:48xmorelsetmessages: + msg382598
2020-12-06 16:13:06xmorelsetpull_requests: + pull_request22532
2020-12-04 13:30:19python-devsetkeywords: + patch
nosy: + python-dev

pull_requests: + pull_request22507
stage: patch review
2020-12-04 13:17:11serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg382486
2020-12-04 13:14:12xmorelsetmessages: + msg382485
2020-12-02 00:15:31rhettingersetassignee: rhettinger
messages: + msg382286
2020-11-27 08:57:59xmorelsetmessages: + msg381931
2020-11-26 19:16:47rhettingersetnosy: + tim.peters
messages: + msg381909
2020-11-26 08:37:34xtreaksetnosy: + rhettinger
2020-11-26 08:23:55xmorelcreate