classification
Title: Deprecate using random.randrange() with non-integers
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: mark.dickinson, rhettinger, serhiy.storchaka, tim.peters
Priority: normal Keywords: patch

Created on 2019-06-17 19:52 by serhiy.storchaka, last changed 2021-01-25 21:03 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 19112 merged serhiy.storchaka, 2020-03-22 19:35
Messages (15)
msg345892 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-06-17 19:52
Unlike other range()-like functions random.randrange() accepts not only integers and integer-like objects (implementing __index__), but any numbers with integral values, like 3.0 or Decimal(3). In 3.8 using __int__ for implicit converting to C integers were deprecated, and likely will be forbidden in 3.9. __index__ is now preferable.

I propose to deprecate accepting non-integer arguments in random.randrange() and use __index__ instead of __int__ for converting values. At end it will lead to simpler and faster code. Instead of

        istart = _int(start)
        if istart != start:
            raise ValueError("non-integer arg 1 for randrange()")

we could use just

        start = _index(start)

It will help also in converting the randrange() implementation to C.

See also issue37315.
msg345935 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-06-17 22:50
While there isn't anything intrinsically wrong with using index(), the premise of this issue is essentially a claim that most Python code that uses int() is wrong.  That seems rather bold and flies in the face of common practices.  I don't really like the trend of constantly rearranging code in awkward ways to solve problems that no one actually has.

Also, I disagree with the notion that the downstream functions in the random module should be written in C.  That makes them more bug prone and opaque (like when Victor broken seeding for 3.6.0).  The trend for rewriting in C should be restricted to core building blocks like the underlying Mersenne Twister.  Otherwise, we have to do dual maintenance for years and essentially advertise to people that we don't think our own language is good enough to use.  Over time, I am less and less able to show people standard library code as it has become complexified with coding strategies well beyond the norms of day to day Python programmers.  ISTM that most proposals to rewrite something in C come from folks who like rewriting things in C rather than from users or from people who do long term maintenance of the modules.  Sidenote: we're now getting new core devs who don't read or write C, so each transformation like them excludes their ability to participate).
msg345937 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-06-17 23:06
We probably should not do this until we're willing to give PEP 8 guidance recommending operator.index() as preferable to int() for all cases needing integer offsets.  Maybe we do want to go down that path or maybe we just stick with "practicality beats purity" and not create any new learning issues.  The int() function is popular and understandable, but it is rare to find a user who knows and cares about index().  ISTM that randrange(3.0) isn't a problem in real life.
msg346106 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-06-20 07:43
Why this code uses int() at all? My guess is that it was added for earlier detection of user errors, otherwise the user could get a TypeError or AttributeError with unrelated message, or even a wrong result. int() is used just for type checking. But it is not good tool for this. int() does several different actions:

1. Parses string representation of integer.

2. Converts a real number to an integer with some truncation.

3. Losslessly converts an int-like number into an instance of int.

The first two options are not applicable here, so we need an additional check

        if istart != start:
            raise ValueError("non-integer arg 1 for randrange()")

And this is not perfect: for randrange('5') we get a ValueError instead of expected TypeError.

operator.index() is better tool for this.
msg364819 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-22 19:41
Many functions that accepts non-integer arguments (except float) and truncate them to integers emits a deprecation warning since 3.8 (see issue36048 and issue20092). They will be errors in 3.10 (see 37999).
msg364822 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-03-22 20:52
I'm generally against "fixing" things that aren't broken.  AFAICT, this has never been an issue for any real user over its 20 year history.  Also, the current PR makes the code look gross and may impact code that is currently working.  Tim, this is mostly your code, what do you think?
msg364823 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-22 21:07
Using index() simplifies the code. Adding deprecation warnings complicates it. The PR consists of two commits, look at the first one to see how the code can finally look.

If you think that the deprecation is not needed, we can just keep the simplified version.
msg364834 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-03-23 01:47
I have scant memory of working on this code.  But back then Python wasn't at all keen to enforce type checking in harmless contexts.  If, e.g., someone found it convenient to pass integers that happened to be in float format to randrange(), why not?  Going "tsk, tsk, tsk" won't help them get their work done ;-)

If we were restarting from scratch, I'd be happy enough to say "screw floats here - _and_ screw operator.index() - this function requires ints, period".  But, as is, I see insufficient benefit to potentially making code that's been working fine for decades "deprecated".

Yes, I realize the code may eventually become marginally faster then.  That doesn't, to my eyes, outweigh the certainty of annoying some users.
msg364839 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-03-23 06:13
If you are against deprecating this "feature", it should be at least documented and covered by tests.
msg364857 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-03-23 15:36
How about we just leave it alone.  AFAICT nothing is broken -- no user is having any issues.  We really don't need to go through every tool in the standard library that uses int() and document that it accepts any type that defines __int__, nor do we need to test those cases.  I understand that you are bugged by this but it isn't a real problem.
msg364872 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-03-23 18:19
It's neither "bug" nor "feature" - it's an inherited quirk CPython maintains to avoid annoying users for no good reason.

If you absolutely have to "do something", how about adding a mere footnote?  The docs already imply the args have the same restrictions as for `range()`.  A footnote could add that, to preserve historical compatibility, CPython also accepts arguments `x` for which `int(x) == x`, and uses `int(x)` in such cases - but portable code shouldn't rely on that.

But nobody cares, so putting that in the main part of the docs would also annoy 99.99% of users for no good reason ;-)
msg364875 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-03-23 18:24
Marking this as rejected.  Nothing good would come from obfuscating the docs to note a minor implementation detail.
msg380095 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-31 20:06
See issue42222.

As a side effect this change provides 10% speed up (which can be lager after the end of the deprecation period).
msg383918 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-12-28 19:13
I merged in PR 23064. If you want to make further modifications or improvements that would be welcome.
msg385663 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-01-25 21:02
New changeset f066bd94b9225a5a3c4ade5fc3ff81e3c49b7b32 by Serhiy Storchaka in branch 'master':
bpo-37319: Improve documentation, code and tests of randrange. (GH-19112)
https://github.com/python/cpython/commit/f066bd94b9225a5a3c4ade5fc3ff81e3c49b7b32
History
Date User Action Args
2021-01-25 21:03:21serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-01-25 21:02:38serhiy.storchakasetmessages: + msg385663
2020-12-28 19:13:55rhettingersetmessages: + msg383918
2020-10-31 20:06:33serhiy.storchakasetmessages: + msg380095
2020-10-31 19:55:15serhiy.storchakasetstatus: closed -> open
stage: resolved -> patch review
resolution: rejected -> (no value)
versions: + Python 3.10, - Python 3.9
2020-03-23 18:24:41rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg364875

stage: patch review -> resolved
2020-03-23 18:19:36tim.peterssetmessages: + msg364872
2020-03-23 15:36:40rhettingersetmessages: + msg364857
2020-03-23 06:13:32serhiy.storchakasetmessages: + msg364839
2020-03-23 01:47:10tim.peterssetassignee: tim.peters ->
messages: + msg364834
2020-03-22 21:07:01serhiy.storchakasetmessages: + msg364823
2020-03-22 20:52:50rhettingersetassignee: tim.peters
messages: + msg364822
2020-03-22 19:41:00serhiy.storchakasetmessages: + msg364819
2020-03-22 19:35:21serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request18473
2019-06-20 07:43:34serhiy.storchakasetmessages: + msg346106
2019-06-17 23:06:33rhettingersetmessages: + msg345937
2019-06-17 22:50:46rhettingersetnosy: + tim.peters
messages: + msg345935
2019-06-17 19:52:35serhiy.storchakacreate