This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Avoid chained exceptions in lru_cache
Type: behavior Stage: patch review
Components: Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: eric.snow, ezio.melotti, python-dev, rhettinger
Priority: low Keywords: needs review, patch

Created on 2011-10-14 12:19 by ezio.melotti, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name            Uploaded                         Description
issue13177.diff      ezio.melotti, 2011-10-14 12:19
issue13177-bench.py  ezio.melotti, 2011-10-14 22:55   'try' vs 'in' benchmark
Messages (11)
msg145511 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-14 12:19
The attached patch changes lru_cache to use if/else instead of try/except.
This has 2 effects:
1) it avoids chained exceptions and makes the error messages clearer;
2) it probably makes lru_cache a bit faster, since building and catching exceptions is expensive. It also gets rid of the KeyError = KeyError optimization/hack.

For a couple of examples of 1) see #12749 (msg142059, msg142063), or #13169 (msg145469).  See also related issue #6210.
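The if/else form the patch proposes can be sketched roughly as follows (a simplified, lock-free illustration with made-up names — `make_wrapper` and the bare `cache` dict stand in for the real lru_cache internals, which also track hits/misses and bound the cache size):

```python
# Sketch of the LBYL (look-before-you-leap) lookup style the patch proposes.

def make_wrapper(user_function, cache):
    def wrapper(*args):
        key = args
        if key in cache:                  # membership test instead of try/except
            return cache[key]             # no KeyError is ever raised here...
        result = user_function(*args)     # ...so user exceptions are not chained
        cache[key] = result
        return result
    return wrapper
```

With this shape, an exception raised by `user_function` propagates with a clean traceback: there is no in-flight KeyError, so no "During handling of the above exception..." section.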
msg145514 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-14 13:27
Here's an example (copied from msg142063) of what the traceback is without the patch:
>>> from functools import lru_cache
>>> @lru_cache()
... def func(arg): raise ValueError()
... 
>>> func(3)
Traceback (most recent call last):
  File "/home/wolf/dev/py/3.2/Lib/functools.py", line 176, in wrapper
    result = cache[key]
KeyError: (3,)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wolf/dev/py/3.2/Lib/functools.py", line 180, in wrapper
    result = user_function(*args, **kwds)
  File "<stdin>", line 2, in func
ValueError

The patch gets rid of the first traceback (the one before "During handling...").
I should also mention that my second point might not hold if most lookups are cache hits.  I haven't done any specific benchmark, and I don't know how fast 'key in dict' is compared to raising an exception.  If it's e.g. 10 times faster, the patch should make lru_cache faster whenever the hit:miss ratio is lower than 10:1.
msg145516 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-10-14 14:00
This changes behavior so that hash() gets called twice for the key tuple, resulting in decreased performance.

In an earlier version of the lru_cache before the "with lock" was introduced, the try/except form was more atomic.  It also worked well with dict subclasses that use __missing__ (which is bypassed with the LBYL style of lookup).

I'll look at this more shortly -- thanks for the links to the other tracker items.
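Raymond's double-hash point can be demonstrated with a key type that counts its own `__hash__` calls (an illustrative class, not part of the patch): the EAFP lookup hashes the key once, while the LBYL test-then-fetch hashes it twice.

```python
class CountingKey:
    """Hashable key that records how many times __hash__ is called."""
    def __init__(self, value):
        self.value = value
        self.hash_calls = 0
    def __hash__(self):
        self.hash_calls += 1
        return hash(self.value)
    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value

cache = {}
key = CountingKey((3,))
cache[key] = "result"

key.hash_calls = 0
try:                        # EAFP: one lookup, one hash
    cache[key]
except KeyError:
    pass
assert key.hash_calls == 1

key.hash_calls = 0
if key in cache:            # LBYL: membership test plus retrieval,
    cache[key]              # so the key is hashed twice
assert key.hash_calls == 2
```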
msg145518 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-10-14 14:25
Another issue is that I want to keep the related accesses to the OrderedDict inside the "with lock" in order to avoid a race condition between the testing-for-membership step and the retrieve-the-cached-value step.
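The race described above can be made concrete by evicting the entry at the interleaving point (a deterministic stand-in for a second thread purging the LRU entry between the two unsynchronized steps; names here are illustrative):

```python
cache = {"k": "v"}

def evict():                     # stands in for a concurrent thread
    cache.pop("k", None)         # purging the least-recently-used entry

raised = False
if "k" in cache:                 # step 1: membership test succeeds
    evict()                      # another thread runs here...
    try:
        value = cache["k"]       # step 2: ...and the retrieval now fails
    except KeyError:
        raised = True
assert raised
```

Holding a lock across both steps (as the wrapper's `with lock:` block does) closes this window.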
msg145521 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-10-14 14:31
One possibility is to move the call to user_function() outside of the KeyError exception handler so that user exceptions won't be chained:

def wrapper(*args, **kwds):
    nonlocal hits, misses
    key = args
    if kwds:
        key += kwd_mark + tuple(sorted(kwds.items()))
    try:
        with lock:
            result = cache[key]
            cache_renew(key)        # record recent use of this key
            hits += 1
            return result
    except KeyError:
        pass
    result = user_function(*args, **kwds)
    with lock:
        cache[key] = result     # cache the result of this call
        misses += 1
        if len(cache) > maxsize:
            cache_popitem(0)    # purge least recently used cache entry
    return result
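The effect of moving `user_function()` outside the exception handler can be checked directly against the stdlib `lru_cache` of a recent CPython (an assumption beyond the 3.2 code quoted above): a user exception now carries no chained KeyError context.

```python
from functools import lru_cache

@lru_cache()
def func(arg):
    raise ValueError("from user code")

try:
    func(3)
except ValueError as e:
    # No "During handling of the above exception..." section:
    assert e.__context__ is None
```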
msg145560 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2011-10-14 20:05
Could you just cancel the chained exception?

   >>> try: {}["asdf"]
   ... except KeyError:
   ...   try: raise Exception()
   ...   except Exception as x:
   ...     x.__cause__ = None
   ...     x.__context__ = None
   ...     x.__traceback__ = None
   ...     raise x
   ... 
   Traceback (most recent call last):
     File "<stdin>", line 8, in <module>
   Exception

in contrast to:

   >>> try: {}["asdf"]
   ... except KeyError:
   ...   try: raise Exception()
   ...   except Exception as x: 
   ...     raise x
   ... 
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   KeyError: 'asdf'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "<stdin>", line 5, in <module>
     File "<stdin>", line 3, in <module>
     File "<stdin>", line 8, in <module>
   Exception
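A terser alternative to manually clearing the attributes arrived in Python 3.3 as `raise ... from None` (PEP 409, so just after this discussion): it suppresses the chained context from the traceback in one step. A small sketch with an illustrative `lookup` helper:

```python
def lookup(d, key):
    try:
        return d[key]
    except KeyError:
        # "from None" hides the KeyError context in the printed traceback
        raise LookupError("no entry for %r" % (key,)) from None

try:
    lookup({}, "asdf")
except LookupError as e:
    assert e.__suppress_context__        # context exists but is not displayed
    assert isinstance(e.__context__, KeyError)
```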
msg145570 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-14 22:55
Canceling the chained exception might work as a workaround, but it requires yet another try/except and it's not very elegant in my opinion.

Raymond, about __missing__: it shouldn't be a problem here, because we are using "in" to look in the cache, and the cache is either a dict or an OrderedDict, neither of which uses __missing__.

I also did some crude benchmark using strings (see attached file):
always hit
try: 0.3375518321990967
in : 0.43109583854675293
always miss
try: 2.7987680435180664
in : 0.3211240768432617
5 hit, 5 miss
try: 18.37925887107849
in : 5.305108070373535
9 hit, 1 miss
try: 8.38001799583435
in : 6.524656057357788

Of course this will change if the hash() function gets more expensive, or if the hit:miss ratio is different.
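The attached benchmark file is not reproduced here; a comparable timeit sketch (assuming string keys and a plain dict cache, as in Ezio's description) measures the always-miss case for both styles:

```python
import timeit

setup = "cache = {str(i): i for i in range(1000)}"

# EAFP miss: the lookup raises and catches a KeyError every iteration
eafp_miss = timeit.timeit(
    "try:\n cache['missing']\nexcept KeyError:\n pass",
    setup=setup, number=100_000)

# LBYL miss: the membership test simply fails, no exception machinery
lbyl_miss = timeit.timeit(
    "if 'missing' in cache:\n cache['missing']",
    setup=setup, number=100_000)

print("miss: try/except %.3fs  vs  'in' %.3fs" % (eafp_miss, lbyl_miss))
```

Absolute numbers depend on the machine and Python version, which is why the message above reports ratios rather than a single verdict.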
msg145612 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-10-16 06:23
Sorry, I'm not going to introduce a possible race condition because you find try/except to be less elegant than an if-contains test.  Also, I want to keep the current behavior of calling hash only once.  Your original complaint is valid though, so I'll go ahead and move the user_function call outside the exception block so as to avoid unnecessary exception chaining.
msg145615 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-16 07:01
New changeset 8365f82f8a13 by Raymond Hettinger in branch '3.2':
Issue 13177: Make tracebacks more readable by avoiding chained exceptions in the lru_cache.
http://hg.python.org/cpython/rev/8365f82f8a13
msg145617 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-10-16 07:31
My comment was referring to the double try/except suggested by Eric.  Indeed the if/else might lead to a race condition, and that's a good reason to avoid LBYL, even if on average calling hash() twice might be faster.
I'm happy with the fix you committed, thanks!
msg145618 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-10-16 08:46
Thanks for the bug report and for the test case.
History
Date                 User          Action  Args
2022-04-11 14:57:22  admin         set     github: 57386
2011-10-16 08:46:13  rhettinger    set     messages: + msg145618
2011-10-16 07:31:10  ezio.melotti  set     messages: + msg145617
2011-10-16 07:04:39  rhettinger    set     status: open -> closed; resolution: fixed
2011-10-16 07:01:30  python-dev    set     nosy: + python-dev; messages: + msg145615
2011-10-16 06:23:24  rhettinger    set     messages: + msg145612
2011-10-14 22:55:20  ezio.melotti  set     files: + issue13177-bench.py; messages: + msg145570
2011-10-14 20:05:13  eric.snow     set     nosy: + eric.snow; messages: + msg145560
2011-10-14 14:31:32  rhettinger    set     messages: + msg145521
2011-10-14 14:25:34  rhettinger    set     messages: + msg145518
2011-10-14 14:00:39  rhettinger    set     priority: normal -> low; messages: + msg145516
2011-10-14 13:27:21  ezio.melotti  set     messages: + msg145514
2011-10-14 12:19:10  ezio.melotti  create