random._randbelow optimization #77325
Comments
Given that the random module goes a long way to ensure optimal performance, I was wondering why the check for a match between the random and getrandbits methods is performed per call of Random._randbelow, when it could also be done at instantiation time (the attached patch uses __init_subclass__ for that purpose and, in my hands, gives 10-25% speedups for calls to methods relying on _randbelow). |
FWIW, a 10-25% speedup is only possible because the remaining code is already somewhat fast. All that is being proposed is removing a couple of lines that elsewhere would be considered somewhat thin:

    random = self.random
    if type(random) is BuiltinMethod \
            or type(getrandbits) is Method:

Overall, the idea of doing the check only once at instantiation time seems promising. That said, I have unspecific general worries about using __init_subclass__ and patching the subclass. Perhaps Serhiy, Tim, or Mark will have thoughts on whether this sort of self-patching is something we want to be doing in the standard library, whether it would benefit PyPy, and whether it has risks to existing code, to debugging and testing, and to future maintenance.

If I were the one to go the route of making a single pre-check, my instinct would be to just set a flag in __init__, so that the above code would simplify to:

    if self._valid_getrandbits:
        ...
|
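As a rough illustration of the flag-in-__init__ idea, here is a minimal sketch. It uses a toy stand-in class rather than the real random.Random, so `BaseRandom`, `OnlyRandom`, and the simplified fallback branch are assumptions made for the example and are not taken from the patch; only the `_valid_getrandbits` name comes from the comment above.

```python
import random as _stdlib_random


class BaseRandom:
    """Toy stand-in for random.Random (hypothetical, for illustration only)."""

    def __init__(self):
        cls = type(self)
        # Evaluate the check once per instance instead of once per call.
        # getrandbits() is trusted if random() was not overridden, or if
        # the subclass supplies its own getrandbits().
        self._valid_getrandbits = (
            cls.random is BaseRandom.random
            or cls.getrandbits is not BaseRandom.getrandbits
        )

    def random(self):
        return _stdlib_random.random()

    def getrandbits(self, k):
        return _stdlib_random.getrandbits(k)

    def _randbelow(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        if self._valid_getrandbits:
            k = n.bit_length()
            r = self.getrandbits(k)
            while r >= n:  # retry until the value falls inside [0, n)
                r = self.getrandbits(k)
            return r
        # Simplified fallback for the sketch; the real module is more careful.
        return int(self.random() * n)


class OnlyRandom(BaseRandom):
    """Overrides random() only, so getrandbits() must not be used."""

    def random(self):
        return 0.5
```

With this layout, each call to `_randbelow` tests a single boolean attribute, at the cost of one extra attribute per instance.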
I don't see anything objectionable about the class optimizing the implementation of a private method. I'll note that there's a speed benefit beyond just removing the two type checks in the common case: the optimized But it's really the speed that matters here. |
Yes, that clean-up would be nice as well :-) Any thoughts on having __init__ set a flag versus using __init_subclass__ to backpatch the subclass? To me, the former looks like plain Python and the latter doesn't seem like something that would normally be done in the standard library. |
I'm the wrong guy to ask about that. Since I worked at Zope Corp, my natural inclination is to monkey-patch everything - but knowing full well that will offend everyone else ;-) That said, this optimization seems straightforward to me: two distinct method implementations for two very different approaches that have nothing in common besides the method name & signature. |
I think this is an excellent application of __init_subclass__. It is common to patch an instance method in __init__, but this can create a reference loop if it is patched with another instance method. In this case the choice doesn't depend on the arguments of __init__, and can be made at class creation time. I like the idea in general, but have comments about the implementation. __init_subclass__ should take **kwargs and pass it to super().__init_subclass__(). type(cls.random) is not the same as type(self.random). I would use the condition This will break the case when the random or getrandbits methods are patched after class creation or per instance, but I think we have no need to support this. We could also support the following cases:
    class Rand2(Rand1):
        def getrandbits(self): ...
        # _randbelow should use getrandbits()
        # this is broken in the current patch

    class Rand2(Rand1):
        def random(self): ...
        # _randbelow should use random()
        # this is broken in the current code
|
Serhiy:
My bad, sorry, and thanks for catching all these issues! You're absolutely right about the class type checks not being equivalent.
Right, hadn't thought of this situation.
May be worth fixing, too. |
Wolfgang, can you submit this as a PR? |
Thanks, Raymond. I'll do that once I've addressed Serhiy's points. |
So, the PR implements the behaviour suggested by Serhiy as his cases 1 and 2. |
In addition, I took the opportunity to fix a bug in the original _randbelow in that it would only raise the advertised ValueError on n=0 in the getrandbits-dependent branch, but ZeroDivisionError in the pure random branch. |
ok, I've created bpo-33203 to deal with raising ValueError in _randbelow consistently. |
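To make the n=0 inconsistency concrete, here is a hedged sketch of a random()-based branch with an explicit guard. The function name `randbelow_from_random`, its signature, and the error message are assumptions for illustration; the rejection logic is modelled loosely on the pure-random branch and is not copied from the PR.

```python
def randbelow_from_random(random_func, n, maxsize=1 << 53):
    """Return an int in [0, n) using only a float generator random_func().

    Hypothetical helper for illustration; not the standard library code.
    """
    # Without this guard, `maxsize % n` below would raise ZeroDivisionError
    # for n == 0 instead of the advertised ValueError.
    if n == 0:
        raise ValueError("Boundary cannot be zero")
    rem = maxsize % n
    limit = (maxsize - rem) / maxsize  # reject values at or above this limit
    r = random_func()
    while r >= limit:
        r = random_func()
    return int(r * maxsize) % n
```

Checking n up front means both branches can fail the same way, regardless of which generator method ends up being used.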
Possibly, the switch from type checks to identity checks could be considered a bugfix that could be backported. I've always had a lingering worry about that part of the code. |
PR 6291 didn't work properly with case 1. Rand2 uses getrandbits() since it is overridden in the parent, despite the fact that random() is defined later. PR 6563 fixes this. It walks the classes in method resolution order and finds the first class that defines random() or getrandbits(). PR 6563 also stops the tests from using logging for testing purposes. |
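The method-resolution-order walk described above can be sketched as follows. This is a simplified illustration on a toy base class, not the code from PR 6563: `BaseRandom`, the two `_randbelow_*` helper names, the simplified fallback, and the `Rand1` to `Rand4` example classes are all assumptions made for the example.

```python
import random as _stdlib_random


class BaseRandom:
    """Toy stand-in for random.Random (hypothetical, for illustration only)."""

    def random(self):
        return _stdlib_random.random()

    def getrandbits(self, k):
        return _stdlib_random.getrandbits(k)

    def _randbelow_with_getrandbits(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        k = n.bit_length()
        r = self.getrandbits(k)
        while r >= n:  # retry until the value falls inside [0, n)
            r = self.getrandbits(k)
        return r

    def _randbelow_without_getrandbits(self, n):
        if n <= 0:
            raise ValueError("n must be positive")
        return int(self.random() * n)  # simplified fallback for the sketch

    _randbelow = _randbelow_with_getrandbits

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Walk the MRO: the nearest class that defines either method decides
        # which implementation this subclass gets.
        for c in cls.__mro__:
            if "getrandbits" in c.__dict__:
                cls._randbelow = cls._randbelow_with_getrandbits
                break
            if "random" in c.__dict__:
                cls._randbelow = cls._randbelow_without_getrandbits
                break


class Rand1(BaseRandom):
    def random(self):
        return 0.5


class Rand2(Rand1):  # getrandbits() defined later, so it wins
    def getrandbits(self, k):
        return 3


class Rand3(BaseRandom):
    def getrandbits(self, k):
        return 3


class Rand4(Rand3):  # random() defined later, so it wins
    def random(self):
        return 0.5
```

Because the choice is made once per class, instances carry no extra state and `_randbelow` contains no per-call check.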