This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: cPickle.loads is not thread safe due to non-thread-safe imports
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.2, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: DragonFireCK, Sagiv.Malihi, brett.cannon, eric.snow, ncoghlan, pitrou, serhiy.storchaka, taleinat
Priority: normal Keywords:

Created on 2011-08-02 08:16 by Sagiv.Malihi, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (6)
msg141548 - (view) Author: Sagiv Malihi (Sagiv.Malihi) Date: 2011-08-02 08:16
When trying to cPickle.loads() from several threads at once, there is a race condition when threads try to import modules.

An example will explain it best:
suppose I have module foo.py which takes some time to load:

import time
class A(object):
    def __setstate__(self, state):
        self.x = x
time.sleep(1)
x = 5

and a pickled version of an A() object stored in 'A.pkl'.
the following code, when run for the first time, will raise a NameError about 'x':

>>> p = open('A.pkl','rb').read()
>>> [thread.start_new(cPickle.loads, (p,)) for x in xrange(2)]

Unhandled exception in thread started by <built-in function loads>
Traceback (most recent call last):
  File "foo.py", line 7, in __setstate__
    self.x = x
NameError: global name 'x' is not defined


since the module is now loaded, subsequent calls to cPickle.loads will work as expected.

This was tested on 2.5.2, 2.7.1, and 3.2 on Ubuntu and on Windows 7.

please note that this bug was discovered when unpickling the standard 'decimal.Decimal' class (decimal.py is quite long and takes some time to import), and this is not some corner case.
msg141549 - (view) Author: Sagiv Malihi (Sagiv.Malihi) Date: 2011-08-02 08:48
OK, digging deeper reveals that there are actually two bugs here, one is conceptual in the python importing mechanism, and the other is technical in cPickle.

The first bug: 
PyImport_ExecCodeModuleEx adds the module to sys.modules *before* actually executing the code. This is a design flaw (can it really be changed? )
Demonstrating this bug is easy using the foo.py module from the previous comment:

def f():
    if 'bla' in sys.modules:
        bla = sys.modules['bla']
    else:
        import bla
    return bla.A()
running two instances of f in two threads results in the same error.

The second bug: in cPickle.c: func_class() 
cPickle 'manually' checks if a module is in sys.modules instead of letting the import mechanism do it for him (hence breaking the import lock's defense here).
msg141652 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-05 10:50
> PyImport_ExecCodeModuleEx adds the module to sys.modules *before*
> actually executing the code. This is a design flaw (can it really be
> changed? )

I guess it is done so to allow for circular imports.

> The second bug: in cPickle.c: func_class() 
> cPickle 'manually' checks if a module is in sys.modules instead of
> letting the import mechanism do it for him (hence breaking the import
> lock's defense here).

I would guess it is an optimization shortcut. A solution (while keeping the optimization) would be to take the import lock before checking sys.modules.

Note that the _pickle module in 3.x has the same kind of logic, and therefore probably the same issue too (haven't tested).
msg141811 - (view) Author: Sagiv Malihi (Sagiv.Malihi) Date: 2011-08-09 10:44
As I said - it certainly happenes on 3.2 (tested).

@pitrou - what you suggested will not work since the actual import will block on the import lock.
The optimization there is not needed, it's already implemented in __import__.
msg170375 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2012-09-12 12:59
I just checked and this is no longer an issue in Python 3.3.

The sys.modules "bug" isn't a bug as that's how it is supposed to work to prevent partially initialized modules. As for how pickle is doing stuff, that could change if it wouldn't break backwards-compatibility.
msg349036 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-08-05 08:38
I am able to reproduce this with Python 3.6.3 and 3.7.0, but not with 3.7.4, 3.8.0b3 or current master (to be 3.9).

This has been fixed since 3.7.3; see issue #34572.
History
Date User Action Args
2022-04-11 14:57:20adminsetgithub: 56889
2019-08-05 08:38:26taleinatsetstatus: pending -> closed

nosy: + taleinat
messages: + msg349036

resolution: not a bug -> fixed
stage: resolved
2017-06-18 12:43:09serhiy.storchakasetstatus: open -> pending
resolution: not a bug
2017-02-19 14:16:14serhiy.storchakasetnosy: + serhiy.storchaka
type: crash -> behavior
2012-11-13 06:13:00eric.snowsetnosy: + eric.snow
2012-09-12 12:59:47brett.cannonsetmessages: + msg170375
versions: - Python 3.3
2012-09-11 23:16:03DragonFireCKsetnosy: + DragonFireCK
2011-08-09 10:44:05Sagiv.Malihisetmessages: + msg141811
2011-08-05 10:50:45pitrousetnosy: + pitrou

messages: + msg141652
versions: + Python 3.2, Python 3.3
2011-08-02 10:51:23r.david.murraysetnosy: + brett.cannon, ncoghlan
2011-08-02 08:48:35Sagiv.Malihisetmessages: + msg141549
2011-08-02 08:16:09Sagiv.Malihicreate