Message388480
The locking on functools.cached_property (https://github.com/python/cpython/blob/87f649a409da9d99682e78a55a83fc43225a8729/Lib/functools.py#L934) as it was written is completely undesirable for I/O bound values, parallel processing. Instead of protecting the calculation of cached property to the same instance in two threads, it completely blocks parallel calculations of cached values to *distinct instances* of the same class.
Here's the code of __get__ in cached_property:
def __get__(self, instance, owner=None):
if instance is None:
return self
if self.attrname is None:
raise TypeError(
"Cannot use cached_property instance without calling __set_name__ on it.")
try:
cache = instance.__dict__
except AttributeError: # not all objects have __dict__ (e.g. class defines slots)
msg = (
f"No '__dict__' attribute on {type(instance).__name__!r} "
f"instance to cache {self.attrname!r} property."
)
raise TypeError(msg) from None
val = cache.get(self.attrname, _NOT_FOUND)
if val is _NOT_FOUND:
with self.lock:
# check if another thread filled cache while we awaited lock
val = cache.get(self.attrname, _NOT_FOUND)
if val is _NOT_FOUND:
val = self.func(instance)
try:
cache[self.attrname] = val
except TypeError:
msg = (
f"The '__dict__' attribute on {type(instance).__name__!r} instance "
f"does not support item assignment for caching {self.attrname!r} property."
)
raise TypeError(msg) from None
return val
I noticed this because I was recommending that Pyramid web framework deprecate its much simpler [`reify`](https://docs.pylonsproject.org/projects/pyramid/en/latest/_modules/pyramid/decorator.html#reify) decorator in favour of using `cached_property`, and then noticed why it won't do.
Here is the test case for cached_property:
from functools import cached_property
from threading import Thread
from random import randint
import time
class Spam:
@cached_property
def ham(self):
print(f'Calculating amount of ham in {self}')
time.sleep(10)
return randint(0, 100)
def bacon():
spam = Spam()
print(f'The amount of ham in {spam} is {spam.ham}')
start = time.time()
threads = []
for _ in range(3):
t = Thread(target=bacon)
threads.append(t)
t.start()
for t in threads:
t.join()
print(f'Total running time was {time.time() - start}')
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa220>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa220> is 97
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa4f0> is 8
Calculating amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0>
The amount of ham in <__main__.Spam object at 0x7fa50bcaa7c0> is 53
Total running time was 30.02147102355957
The runtime is 30 seconds; for `pyramid.decorator.reify` the runtime would be 10 seconds:
Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272430>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d82726d0>
Calculating amount of ham in <__main__.Spam object at 0x7fc4d8272970>
The amount of ham in <__main__.Spam object at 0x7fc4d82726d0> is 94
The amount of ham in <__main__.Spam object at 0x7fc4d8272970> is 29
The amount of ham in <__main__.Spam object at 0x7fc4d8272430> is 93
Total running time was 10.010624170303345
`reify` in Pyramid is used heavily to add properties to incoming HTTP request objects - using `functools.cached_property` instead would mean that each independent request thread blocks others because most of them would always get the value for the same lazy property using the the same descriptor instance and locking the same lock. |
|
Date |
User |
Action |
Args |
2021-03-11 06:37:47 | ztane | set | recipients:
+ ztane |
2021-03-11 06:37:47 | ztane | set | messageid: <1615444667.91.0.948817723419.issue43468@roundup.psfhosted.org> |
2021-03-11 06:37:47 | ztane | link | issue43468 messages |
2021-03-11 06:37:46 | ztane | create | |
|