classification
Title: Multiprocessing: bug with Native ID for threading.mainthread()
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: davin, jaketesler, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2019-11-05 22:39 by jaketesler, last changed 2019-11-08 19:30 by ned.deily.

Pull Requests
URL Status Linked
PR 17088 open jaketesler, 2019-11-08 00:24
Messages (3)
msg356070 - Author: Jake Tesler (jaketesler) Date: 2019-11-05 22:39
I have encountered a minor bug with the new `threading.get_native_id()` feature in Python 3.8. The bug occurs when starting a new multiprocessing.Process on Unix (or on any platform where the multiprocessing start method is 'fork' or 'forkserver').

When creating a new process via fork, the Native ID in the new MainThread is incorrect. The new forked process' threading.MainThread object inherits the Native ID from the parent process' MainThread instead of capturing/updating its own (new) Native ID.

See the following snippet:

>>> import threading, multiprocessing
>>> multiprocessing.set_start_method('fork') # or 'forkserver'
>>> def proc(): print(threading.get_native_id(), threading.main_thread().native_id) # get_native_id(), mainthread.native_id
>>> proc()
22605 22605 # get_native_id(), mainthread.native_id
>>> p = multiprocessing.Process(target=proc)
>>> p.start()
22648 22605 # get_native_id(), mainthread.native_id
>>>
>>> def update(): threading.main_thread()._set_native_id()
>>> def print_and_update(): proc(); update(); proc()
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605 
>>> p2=multiprocessing.Process(target=print_and_update); p2.start()
22724 22605 # get_native_id(), mainthread.native_id
22724 22724
>>> print_and_update()
22605 22605 # get_native_id(), mainthread.native_id
22605 22605

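As you can see, inside the new process the MainThread's native_id attribute still matches that of the parent process's MainThread (22605) rather than the child's own Native ID (22648 and 22724 above). It only becomes correct after _set_native_id() is called explicitly in the child.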
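
The interactive session above can also be run as a standalone script. This is only a minimal sketch: the helper names `report` and `child_ids` are made up for illustration, it assumes a platform where the 'fork' start method is available, and it sends the child's values back over a pipe instead of printing them from the child.

```python
import multiprocessing
import threading


def report(conn):
    """Send (get_native_id(), main_thread().native_id) as seen by this process."""
    conn.send((threading.get_native_id(), threading.main_thread().native_id))
    conn.close()


def child_ids():
    """Fork a child process and return the two IDs it observed."""
    ctx = multiprocessing.get_context("fork")  # assumption: 'fork' is available (Unix)
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=report, args=(send_end,))
    child.start()
    ids = recv_end.recv()
    child.join()
    return ids


if __name__ == "__main__":
    parent = (threading.get_native_id(), threading.main_thread().native_id)
    child = child_ids()
    print("parent:", parent)
    print("child: ", child)
    # On an affected interpreter the two child values differ.
    print("child MainThread.native_id is stale:", child[0] != child[1])
```

On an interpreter with this bug the last line should report True; on one where the behavior has been corrected it should report False.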

Unfortunately, I'm not too familiar with the underlying mechanisms that multiprocessing uses to create forked processes.
I believe this behavior occurs because (AFAIK) a forked multiprocessing.Process copies the MainThread object from its parent process rather than initializing a new one. Looking further into the multiprocessing code, the right spot for a fix appears to be the multiprocessing.Process._bootstrap() method.
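
Until the interpreter itself handles this, a user-level workaround might look like the sketch below. It is not the patch from the branch: it simply calls the private `_set_native_id()` method (the same one used in the session above) at the start of the child's target. The helper names are hypothetical, and because the method is private the sketch checks that it exists before calling it.

```python
import multiprocessing
import threading


def refresh_main_thread_native_id():
    """Re-capture the Native ID of the main thread in the current process."""
    main = threading.main_thread()
    # _set_native_id() is a private method and may be absent; look it up defensively.
    refresh = getattr(main, "_set_native_id", None)
    if refresh is not None:
        refresh()


def run_refreshed(conn):
    """Child target: refresh first, then report both IDs to the parent."""
    refresh_main_thread_native_id()
    conn.send((threading.get_native_id(), threading.main_thread().native_id))
    conn.close()


def ids_after_refresh():
    """Fork a child that refreshes its main thread's Native ID and return its IDs."""
    ctx = multiprocessing.get_context("fork")  # assumption: 'fork' is available (Unix)
    recv_end, send_end = ctx.Pipe(duplex=False)
    child = ctx.Process(target=run_refreshed, args=(send_end,))
    child.start()
    ids = recv_end.recv()
    child.join()
    return ids


if __name__ == "__main__":
    live, recorded = ids_after_refresh()
    print(live, recorded)  # both values should now agree in the child
```

Calling the refresh from the child's own target keeps the workaround local to user code; a fix inside multiprocessing or threading would make it unnecessary.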

I've created a branch containing a working fix - I'm also open to suggestions of how a fix might otherwise be implemented. 
If it looks correct I'll create a PR against the CPython 3.8 branch. 

See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

Thanks all!
-Jake
msg356202 - Author: STINNER Victor (vstinner) (Python committer) Date: 2019-11-07 16:46
> See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

Can you please create a PR?
msg356218 - Author: Jake Tesler (jaketesler) Date: 2019-11-08 00:33
@vstinner PR created :)
https://github.com/python/cpython/pull/17088
History
Date User Action Args
2019-11-08 19:30:25 ned.deily set nosy: + davin
2019-11-08 00:33:45 jaketesler set messages: + msg356218
2019-11-08 00:24:04 jaketesler set keywords: + patch; stage: patch review; pull_requests: + pull_request16596
2019-11-07 16:46:51 vstinner set messages: + msg356202
2019-11-05 22:42:59 jaketesler set nosy: + pitrou
2019-11-05 22:39:45 jaketesler create