msg345892 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-06-17 19:52 |
Unlike other range()-like functions random.randrange() accepts not only integers and integer-like objects (implementing __index__), but any numbers with integral values, like 3.0 or Decimal(3). In 3.8 using __int__ for implicit converting to C integers were deprecated, and likely will be forbidden in 3.9. __index__ is now preferable.
I propose to deprecate accepting non-integer arguments in random.randrange() and use __index__ instead of __int__ for converting values. At end it will lead to simpler and faster code. Instead of
istart = _int(start)
if istart != start:
raise ValueError("non-integer arg 1 for randrange()")
we could use just
start = _index(start)
It will help also in converting the randrange() implementation to C.
See also issue37315.
|
msg345935 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2019-06-17 22:50 |
While there isn't anything intrinsically wrong with using index(), the premise of this issue is essentially a claim that most Python code that uses int() is wrong. That seems rather bold and flies in the face of common practices. I don't really like the trend of constantly rearranging code in awkward ways to solve problems that no one actually has.
Also, I disagree with the notion that the downstream functions in the random module should be written in C. That makes them more bug prone and opaque (like when Victor broken seeding for 3.6.0). The trend for rewriting in C should be restricted to core building blocks like the underlying Mersenne Twister. Otherwise, we have to do dual maintenance for years and essentially advertise to people that we don't think our own language is good enough to use. Over time, I am less and less able to show people standard library code as it has become complexified with coding strategies well beyond the norms of day to day Python programmers. ISTM that most proposals to rewrite something in C come from folks who like rewriting things in C rather than from users or from people who do long term maintenance of the modules. Sidenote: we're now getting new core devs who don't read or write C, so each transformation like them excludes their ability to participate).
|
msg345937 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2019-06-17 23:06 |
We probably should not do this until we're willing to give PEP 8 guidance recommending operator.index() as preferable to int() for all cases needing integer offsets. Maybe we do want to go down that path or maybe we just stick with "practicality beats purity" and not create any new learning issues. The int() function is popular and understandable, but it is rare to find a user who knows and cares about index(). ISTM that randrange(3.0) isn't a problem in real life.
|
msg346106 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2019-06-20 07:43 |
Why this code uses int() at all? My guess is that it was added for earlier detection of user errors, otherwise the user could get a TypeError or AttributeError with unrelated message, or even a wrong result. int() is used just for type checking. But it is not good tool for this. int() does several different actions:
1. Parses string representation of integer.
2. Converts a real number to an integer with some truncation.
3. Losslessly converts an int-like number into an instance of int.
The first two options are not applicable here, so we need an additional check
if istart != start:
raise ValueError("non-integer arg 1 for randrange()")
And this is not perfect: for randrange('5') we get a ValueError instead of expected TypeError.
operator.index() is better tool for this.
|
msg364819 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2020-03-22 19:41 |
Many functions that accepts non-integer arguments (except float) and truncate them to integers emits a deprecation warning since 3.8 (see issue36048 and issue20092). They will be errors in 3.10 (see 37999).
|
msg364822 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2020-03-22 20:52 |
I'm generally against "fixing" things that aren't broken. AFAICT, this has never been an issue for any real user over its 20 year history. Also, the current PR makes the code look gross and may impact code that is currently working. Tim, this is mostly your code, what do you think?
|
msg364823 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2020-03-22 21:07 |
Using index() simplifies the code. Adding deprecation warnings complicates it. The PR consists of two commits, look at the first one to see how the code can finally look.
If you think that the deprecation is not needed, we can just keep the simplified version.
|
msg364834 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2020-03-23 01:47 |
I have scant memory of working on this code. But back then Python wasn't at all keen to enforce type checking in harmless contexts. If, e.g., someone found it convenient to pass integers that happened to be in float format to randrange(), why not? Going "tsk, tsk, tsk" won't help them get their work done ;-)
If we were restarting from scratch, I'd be happy enough to say "screw floats here - _and_ screw operator.index() - this function requires ints, period". But, as is, I see insufficient benefit to potentially making code that's been working fine for decades "deprecated".
Yes, I realize the code may eventually become marginally faster then. That doesn't, to my eyes, outweigh the certainty of annoying some users.
|
msg364839 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2020-03-23 06:13 |
If you are against deprecating this "feature", it should be at least documented and covered by tests.
|
msg364857 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2020-03-23 15:36 |
How about we just leave it alone. AFAICT nothing is broken -- no user is having any issues. We really don't need to go through every tool in the standard library that uses int() and document that it accepts any type that defines __int__, nor do we need to test those cases. I understand that you are bugged by this but it isn't a real problem.
|
msg364872 - (view) |
Author: Tim Peters (tim.peters) *  |
Date: 2020-03-23 18:19 |
It's neither "bug" nor "feature" - it's an inherited quirk CPython maintains to avoid annoying users for no good reason.
If you absolutely have to "do something", how about adding a mere footnote? The docs already imply the args have the same restrictions as for `range()`. A footnote could add that, to preserve historical compatibility, CPython also accepts arguments `x` for which `int(x) == x`, and uses `int(x)` in such cases - but portable code shouldn't rely on that.
But nobody cares, so putting that in the main part of the docs would also annoy 99.99% of users for no good reason ;-)
|
msg364875 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2020-03-23 18:24 |
Marking this as rejected. Nothing good would come from obfuscating the docs to note a minor implementation detail.
|
msg380095 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2020-10-31 20:06 |
See issue42222.
As a side effect this change provides 10% speed up (which can be lager after the end of the deprecation period).
|
msg383918 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2020-12-28 19:13 |
I merged in PR 23064. If you want to make further modifications or improvements that would be welcome.
|
msg385663 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2021-01-25 21:02 |
New changeset f066bd94b9225a5a3c4ade5fc3ff81e3c49b7b32 by Serhiy Storchaka in branch 'master':
bpo-37319: Improve documentation, code and tests of randrange. (GH-19112)
https://github.com/python/cpython/commit/f066bd94b9225a5a3c4ade5fc3ff81e3c49b7b32
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:16 | admin | set | github: 81500 |
2021-01-25 21:03:21 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2021-01-25 21:02:38 | serhiy.storchaka | set | messages:
+ msg385663 |
2020-12-28 19:13:55 | rhettinger | set | messages:
+ msg383918 |
2020-10-31 20:06:33 | serhiy.storchaka | set | messages:
+ msg380095 |
2020-10-31 19:55:15 | serhiy.storchaka | set | status: closed -> open stage: resolved -> patch review resolution: rejected -> (no value) versions:
+ Python 3.10, - Python 3.9 |
2020-03-23 18:24:41 | rhettinger | set | status: open -> closed resolution: rejected messages:
+ msg364875
stage: patch review -> resolved |
2020-03-23 18:19:36 | tim.peters | set | messages:
+ msg364872 |
2020-03-23 15:36:40 | rhettinger | set | messages:
+ msg364857 |
2020-03-23 06:13:32 | serhiy.storchaka | set | messages:
+ msg364839 |
2020-03-23 01:47:10 | tim.peters | set | assignee: tim.peters -> messages:
+ msg364834 |
2020-03-22 21:07:01 | serhiy.storchaka | set | messages:
+ msg364823 |
2020-03-22 20:52:50 | rhettinger | set | assignee: tim.peters messages:
+ msg364822 |
2020-03-22 19:41:00 | serhiy.storchaka | set | messages:
+ msg364819 |
2020-03-22 19:35:21 | serhiy.storchaka | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request18473 |
2019-06-20 07:43:34 | serhiy.storchaka | set | messages:
+ msg346106 |
2019-06-17 23:06:33 | rhettinger | set | messages:
+ msg345937 |
2019-06-17 22:50:46 | rhettinger | set | nosy:
+ tim.peters messages:
+ msg345935
|
2019-06-17 19:52:35 | serhiy.storchaka | create | |