classification
Title: Multiprocessing: bug with Native ID for threading.mainthread()
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: davin, jaketesler, miss-islington, pitrou, vstinner
Priority: normal
Keywords: patch

Created on 2019-11-05 22:39 by jaketesler, last changed 2019-11-19 22:27 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked
PR 17088 merged jaketesler, 2019-11-08 00:24
PR 17261 merged miss-islington, 2019-11-19 19:50
Messages (8)
msg356070 - (view) Author: Jake Tesler (jaketesler) * Date: 2019-11-05 22:39
I have encountered a minor bug with the new `threading.get_native_id()` featureset in Python 3.8. The bug occurs when creating a new multiprocessing.Process object on Unix (or on any platform where the multiprocessing start_method is 'fork' or 'forkserver').

When creating a new process via fork, the Native ID in the new MainThread is incorrect. The new forked process' threading.MainThread object inherits the Native ID from the parent process' MainThread instead of capturing/updating its own (new) Native ID.

See the following snippet:

>>> import threading, multiprocessing
>>> multiprocessing.set_start_method('fork') # or 'forkserver'
>>> def proc(): print(threading.get_native_id(), threading.main_thread().native_id) # get_native_id(), mainthread.native_id
>>> proc()
22605 22605 # get_native_id(), mainthread.native_id
>>> p = multiprocessing.Process(target=proc)
>>> p.start()
22648 22605 # get_native_id(), mainthread.native_id
>>>
>>> def update(): threading.main_thread()._set_native_id()
>>> def print_and_update(): proc(); update(); proc()
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605 
>>> p2=multiprocessing.Process(target=print_and_update); p2.start()
22724 22605 # get_native_id(), mainthread.native_id
22724 22724
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605

As you can see, the new Process object's MainThread.native_id attribute matches that of the MainThread of its parent process. 
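The same check can be run as a standalone script rather than in the REPL. This is only a sketch: the helper names `_report` and `child_native_ids` are made up for illustration, and it assumes a platform where the 'fork' start method and `threading.get_native_id()` are both available. The child sends both values back over a pipe so the parent can compare them.

```python
import multiprocessing
import threading


def _report(conn):
    # Runs in the child process: send both views of the main thread's native ID.
    conn.send((threading.get_native_id(), threading.main_thread().native_id))
    conn.close()


def child_native_ids(start_method="fork"):
    """Return (get_native_id(), main_thread().native_id) as seen inside a child."""
    ctx = multiprocessing.get_context(start_method)
    # With duplex=False the first connection only receives, the second only sends.
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    p = ctx.Process(target=_report, args=(send_conn,))
    p.start()
    send_conn.close()  # the parent keeps only the receiving end
    ids = recv_conn.recv()
    p.join()
    return ids


if __name__ == "__main__":
    current, cached = child_native_ids()
    print("child get_native_id():        ", current)
    print("child main_thread().native_id:", cached)
    print("match:", current == cached)
```

On an affected interpreter the two child values should differ, as in the session above; on an interpreter that includes the fix they should be equal.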

Unfortunately, I'm not too familiar with the underlying mechanisms that multiprocessing uses to create forked processes.
I believe this behavior occurs because (AFAIK) a forked multiprocessing.Process copies the MainThread object from its parent process rather than initializing a new one. Looking further into the multiprocessing code, the right spot to fix this appears to be the multiprocessing.Process._bootstrap() method.
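Until a fixed interpreter is available, the `update()` trick from the snippet above can be wrapped around a process target. This is a workaround sketch, not the change proposed in the branch: `refresh_main_thread_native_id` is a hypothetical helper name, and `_set_native_id()` is a private method of `threading.Thread`, so it may change without notice.

```python
import functools
import threading


def refresh_main_thread_native_id(func):
    """Hypothetical decorator: re-read the main thread's native ID, then call func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        main = threading.main_thread()
        # _set_native_id() records the native ID of the *calling* thread,
        # so only call it when we really are running on the main thread.
        # hasattr() guards against interpreters without this private method.
        if threading.current_thread() is main and hasattr(main, "_set_native_id"):
            main._set_native_id()
        return func(*args, **kwargs)

    return wrapper


@refresh_main_thread_native_id
def proc():
    # Same output as proc() in the report: get_native_id(), mainthread.native_id
    print(threading.get_native_id(), threading.main_thread().native_id)
```

Passing the decorated `proc` as the `target` of a forked `multiprocessing.Process` should then print two matching values in the child. The decorator relies on the 'fork' start method, where the target does not need to be pickled.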

I've created a branch containing a working fix - I'm also open to suggestions of how a fix might otherwise be implemented. 
If it looks correct I'll create a PR against the CPython 3.8 branch. 

See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

Thanks all!
-Jake
msg356202 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-07 16:46
> See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

Can you please create a PR?
msg356218 - (view) Author: Jake Tesler (jaketesler) * Date: 2019-11-08 00:33
@vstinner PR created :)
https://github.com/python/cpython/pull/17088
msg356543 - (view) Author: Jake Tesler (jaketesler) * Date: 2019-11-13 17:55
PR was updated with tests and is ready for core developer review and then the merge to cpython:master. After that (if I understand correctly) a backport will automatically get picked into the 3.8 branch if there aren't any conflicts.
msg356987 - (view) Author: miss-islington (miss-islington) Date: 2019-11-19 19:50
New changeset c6b20be85c0de6f2355c67ae6e7e578941275cc0 by Miss Islington (bot) (Jake Tesler) in branch 'master':
bpo-38707: Fix for multiprocessing.Process MainThread.native_id (GH-17088)
https://github.com/python/cpython/commit/c6b20be85c0de6f2355c67ae6e7e578941275cc0
msg356989 - (view) Author: miss-islington (miss-islington) Date: 2019-11-19 20:11
New changeset 829593a9262e67c72167c6cb20d383203b2ea410 by Miss Islington (bot) in branch '3.8':
bpo-38707: Fix for multiprocessing.Process MainThread.native_id (GH-17088)
https://github.com/python/cpython/commit/829593a9262e67c72167c6cb20d383203b2ea410
msg356992 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-11-19 21:35
Thank you Jake for the report and PR!
msg356998 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-19 22:27
Thanks for the fix. That was an interesting bug ;-) I like the simplicity of the fix.
History
Date User Action Args
2019-11-19 22:27:08 vstinner set messages: + msg356998
2019-11-19 21:35:38 pitrou set status: open -> closed; versions: + Python 3.9; messages: + msg356992; resolution: fixed; stage: patch review -> resolved
2019-11-19 20:11:25 miss-islington set messages: + msg356989
2019-11-19 19:50:32 miss-islington set pull_requests: + pull_request16754
2019-11-19 19:50:17 miss-islington set nosy: + miss-islington; messages: + msg356987
2019-11-13 17:55:38 jaketesler set messages: + msg356543
2019-11-08 19:30:25 ned.deily set nosy: + davin
2019-11-08 00:33:45 jaketesler set messages: + msg356218
2019-11-08 00:24:04 jaketesler set keywords: + patch; stage: patch review; pull_requests: + pull_request16596
2019-11-07 16:46:51 vstinner set messages: + msg356202
2019-11-05 22:42:59 jaketesler set nosy: + pitrou
2019-11-05 22:39:45 jaketesler create