This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: C unpickling bypasses import thread safety
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Valentyn Tymofieiev, alexandre.vassalotti, brett.cannon, eric.snow, gst, ncoghlan, pitrou, tjb900
Priority: normal Keywords: needs review, patch

Created on 2018-09-03 15:53 by tjb900, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
reproducer_submit.py tjb900, 2018-09-03 15:53
reproduce_34572.py gst, 2019-11-15 02:04
find_class_deadlock.py Valentyn Tymofieiev, 2019-11-21 19:24
Pull Requests
URL Status Linked Edit
PR 9047 merged tjb900, 2018-09-03 16:06
PR 11921 merged miss-islington, 2019-02-18 15:31
Messages (13)
msg324528 - (view) Author: Tim Burgess (tjb900) * Date: 2018-09-03 15:53
Retrieving and using a module directly from sys.modules (from C in this case) leads to a race condition where the module may be importing on another thread but has not yet been initialised.  For slow filesystems or large modules (e.g. numpy) this seems to lead to easily reproducible errors (the attached code fails 100% of the time on my work machine - CentOS 7).

I believe they have to be in sys.modules during this phase due to the possibility of circular references.

importlib handles this carefully with locking, but _pickle.c bypasses all that, leading to issues with threaded codes that use pickling, e.g. dask/distributed.
msg327729 - (view) Author: Tim Burgess (tjb900) * Date: 2018-10-15 04:55
Hi!  Just wondering if there is anything I can do to move this along?
msg335023 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-07 15:44
Interesting I could not reproduce here, even by throwing Pandas into the mix and spawning 1024 workers...
msg335027 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-07 16:39
Perhaps PyImport_GetModule() should aquire-release the module's lock before the lookup?  This would effectively be a call to _lock_unlock_module() in importlib._bootstrap.

The alternative is to encourage using PyImport_Import() instead, like the PR has done.  In the case the docs for PyImport_GetModule() should make it clear that it is guaranteed that the module is fully imported yet (and recommend using PyImport_Import() for the guarantee).

Either way there should be a new issue for the more general change (and it should reference this issue).
msg335096 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-08 17:09
I agree that more generally PyImport_GetModule() should be fixed.  But it should be done carefully so as to not to lose the performance benefit of doing it.  I think we should open a separate issue about that.

PS: one possibility is to reuse the optimization already done in PyImport_ImportModuleLevelObject():

        /* Optimization: only call _bootstrap._lock_unlock_module() if
           __spec__._initializing is true.
           NOTE: because of this, initializing must be set *before*
           stuffing the new module in sys.modules.
         */
        spec = _PyObject_GetAttrId(mod, &PyId___spec__);
        if (_PyModuleSpec_IsInitializing(spec)) {
            PyObject *value = _PyObject_CallMethodIdObjArgs(interp->importlib,
                                            &PyId__lock_unlock_module, abs_name,
                                            NULL);
            if (value == NULL) {
                Py_DECREF(spec);
                goto error;
            }
            Py_DECREF(value);
        }
        Py_XDECREF(spec);
msg335098 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-08 17:14
Opened issue35943 for PyImport_GetModule().
msg335099 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-08 17:15
Thanks, Antoine.
msg335842 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-18 15:30
New changeset 4371c0a9c0848f7a0947d43f26f234842b41efdf by Antoine Pitrou (tjb900) in branch 'master':
bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules (GH-9047)
https://github.com/python/cpython/commit/4371c0a9c0848f7a0947d43f26f234842b41efdf
msg335844 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-18 15:52
New changeset 3129432845948fdcab1e05042e99a19e9e2c3c71 by Antoine Pitrou (Miss Islington (bot)) in branch '3.7':
bpo-34572: change _pickle unpickling to use import rather than retrieving from sys.modules (GH-9047) (GH-11921)
https://github.com/python/cpython/commit/3129432845948fdcab1e05042e99a19e9e2c3c71
msg335845 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-18 15:53
This is pushed to 3.7 and master now.  Thank you Tim for the report and the fix!
msg356641 - (view) Author: Guenther Starnberger (gst) Date: 2019-11-15 02:04
For this issue only 3.7 and 3.8 are listed as affected versions, but it appears to be also reproducible on the latest 3.5 and 3.6 releases. I've attached a script that can be used to reproduce the issue on those earlier releases (it consistently fails for me with values of 50 or higher as command line argument).
msg356697 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2019-11-15 17:26
3.6 and 3.5 are in security mode, so unless there's a security risk due to this bug the fix will not be backported to those older versions (https://devguide.python.org/#status-of-python-branches).
msg357200 - (view) Author: Valentyn Tymofieiev (Valentyn Tymofieiev) Date: 2019-11-21 19:24
While investigating[1], I observe that certain unpickling operations, for example, Unpickler.find_class, remain not thread-safe in Python 3.7.5 and earlier versions that I tried. I have not tried 3.8, but cannot reproduce this error on Python 2. 

For example, attached find_class_deadlock.py which consistently fails with a deadlock on Python 3.7.5.

I opened https://bugs.python.org/issue38884, which may be causing these errors, since the failure mode is similar, and in at least some codepaths, find_class seems to be calling __import__ [2].

[1] https://issues.apache.org/jira/browse/BEAM-8651
[2] https://github.com/python/cpython/blob/4ffc569b47bef9f95e443f3c56f7e7e32cb440c0/Lib/pickle.py#L1426.
History
Date User Action Args
2022-04-11 14:59:05adminsetgithub: 78753
2019-11-21 19:24:51Valentyn Tymofieievsetfiles: + find_class_deadlock.py
2019-11-21 19:24:33Valentyn Tymofieievsetnosy: + Valentyn Tymofieiev
messages: + msg357200
2019-11-15 17:26:51brett.cannonsetmessages: + msg356697
2019-11-15 02:04:25gstsetfiles: + reproduce_34572.py
nosy: + gst
messages: + msg356641

2019-02-18 15:53:12pitrousetstatus: open -> closed
versions: - Python 3.6
messages: + msg335845

resolution: fixed
stage: patch review -> resolved
2019-02-18 15:52:35pitrousetmessages: + msg335844
2019-02-18 15:31:09miss-islingtonsetpull_requests: + pull_request11946
2019-02-18 15:30:56pitrousetmessages: + msg335842
2019-02-08 17:15:55eric.snowsetmessages: + msg335099
2019-02-08 17:14:11pitrousetmessages: + msg335098
2019-02-08 17:09:57pitrousetmessages: + msg335096
2019-02-07 16:39:48eric.snowsettype: crash -> behavior
messages: + msg335027
2019-02-07 15:45:41pitrousetnosy: + brett.cannon, ncoghlan, eric.snow
2019-02-07 15:44:53pitrousetnosy: + pitrou
messages: + msg335023
2019-02-05 13:10:12SilentGhostsetkeywords: + needs review
nosy: + alexandre.vassalotti
2018-10-15 04:55:14tjb900setmessages: + msg327729
2018-09-03 16:06:18tjb900setkeywords: + patch
stage: patch review
pull_requests: + pull_request8509
2018-09-03 15:53:21tjb900create