classification
Title: fnmatch regex cache use is not threadsafe
Type: crash Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: mschmitzer, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2015-01-08 13:02 by mschmitzer, last changed 2015-01-27 09:42 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
fnmatch_thread.py mschmitzer, 2015-01-08 13:02 Example script
fnmatch_threadsafe.patch serhiy.storchaka, 2015-01-08 13:49 review
Messages (7)
msg233650 - (view) Author: M. Schmitzer (mschmitzer) Date: 2015-01-08 13:02
The way the fnmatch module uses its regex cache is not threadsafe. When multiple threads use the module in parallel, a race condition between retrieving a - presumed present - item from the cache and clearing the cache (because the maximum size has been reached) can lead to KeyError being raised.

The attached script demonstrates the problem. Running it will (eventually) yield errors like the following.

Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "fnmatch_thread.py", line 12, in run
    fnmatch.fnmatchcase(name, pat)
  File "/home/marc/.venv/modern/lib/python2.7/fnmatch.py", line 79, in fnmatchcase
    return _cache[pat].match(name) is not None
KeyError: 'lYwrOCJtLU'
msg233651 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-08 13:03
I guess that a lot of stdlib modules are not thread safe :-/ A workaround is to protect calls to fnmatch with your own lock.
msg233654 - (view) Author: M. Schmitzer (mschmitzer) Date: 2015-01-08 13:22
Ok, if that is the attitude in such cases, feel free to close this.
msg233655 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-01-08 13:25
It would be nice to fix the issue, but I don't know how it is handled in other stdlib modules.
msg233656 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-01-08 13:49
It is easy to make fnmatch caching thread safe without locks. Here is a patch.

The problem with fnmatch is that the caching is implicit and a user don't know that any lock are needed. So either the need of the lock should be explicitly documented, or fnmatch should be made thread safe. The second option looks more preferable to me.

In 3.x fnmatch is thread safe because thread safe lru_cache is used.
msg233659 - (view) Author: M. Schmitzer (mschmitzer) Date: 2015-01-08 14:06
@serhiy.storchaka: My thoughts exactly, especially regarding the caching being implicit. From the outside, fnmatch really doesn't look like it could have threading issues.
The patch also looks exactly like what I had in mind.
msg234813 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-01-27 09:41
New changeset fe12c34c39eb by Serhiy Storchaka in branch '2.7':
Issue #23191: fnmatch functions that use caching are now threadsafe.
https://hg.python.org/cpython/rev/fe12c34c39eb
History
Date User Action Args
2015-01-27 09:42:51serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2015-01-27 09:41:42python-devsetnosy: + python-dev
messages: + msg234813
2015-01-27 09:27:44serhiy.storchakasetassignee: serhiy.storchaka
2015-01-08 14:06:38mschmitzersetmessages: + msg233659
2015-01-08 13:49:17serhiy.storchakasetfiles: + fnmatch_threadsafe.patch

nosy: + serhiy.storchaka
messages: + msg233656

keywords: + patch
stage: patch review
2015-01-08 13:25:21vstinnersetmessages: + msg233655
2015-01-08 13:22:37mschmitzersetmessages: + msg233654
2015-01-08 13:03:15vstinnersetnosy: + vstinner
messages: + msg233651
2015-01-08 13:02:17mschmitzercreate