This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Logging displays wrong "processName" if "sys.modules" is cleared in child process
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Delgan, iritkatriel, serhiy.storchaka, vinay.sajip
Priority: normal Keywords: patch

Created on 2019-11-10 21:53 by Delgan, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
test.py Delgan, 2019-11-10 21:53 Reproducible example
Pull Requests
URL Status Linked Edit
PR 22063 closed iritkatriel, 2020-09-02 11:11
PR 22142 merged iritkatriel, 2020-09-07 18:24
Messages (8)
msg356338 - (view) Author: Delgan (Delgan) * Date: 2019-11-10 21:53
Hi.

In order to display the process name in logs, the "logging" module checks for the presence of "multiprocessing" in "sys.modules" before calling "current_process()". If "multiprocessing" is not found in "sys.modules", it assumes that the current process is the main one.

See : https://github.com/python/cpython/blob/af46450bb97ab9bd38748e75aa849c29fdd70028/Lib/logging/__init__.py#L340-L341

However, nothing prevents a child process to delete "sys.module['multiprocessing']", which causes the process name to be wrongly displayed as "MainProcess".

I attached a reproducible example, but this is straightforward to understand. Although it should not happen very often in practice, it is still theoretically possible. Obviously, one could say "just don't clear sys.modules", but I suppose there might exist tools doing such thing for good reasons (like resetting the test environment).

Issues which lead to the current implementation:
- issue4301
- issue7120
- issue8200

Possible fixes: 
- Force import "multiprocessing.current_process()" even if not already loaded
- Add function "os.main_pid()" and set "processName" to "MainProcess" only if "os.getpid() == os.main_pid()"
msg356357 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2019-11-11 11:02
This will be a rare use case, and I would expect anyone clearing sys.modules in the child process to handle this consequence themselves (e.g. by reimporting multiprocessing after clearing out the other modules, since they apparently still want some of multiprocessing's functionality in the child process). I'll certainly look at PRs to address the issue using an import of multiprocessing if it's not already there. I'm not so keen on the other approaches you've suggested.
msg376295 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2020-09-03 11:23
See also bpo-8200, which relates to the way the code is currently.
msg376517 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-09-07 17:12
So this will make logging depending on the multiprocess module? Even if you do not use multiprocessing it will be imported in any case when you use logging.
msg376522 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-09-07 17:42
As the PR currently is, it will, unless you turn this off by setting logging.logMultiprocessing=False.

The other alternative we are discussing is: revert back to not importing multiprocessing if it's not there.

If it's there we use it, but if it's not we set processName to something like f"pid={os.getpid()}" instead of "MainThread".

(1) os is already imported in logging.

(2) In most cases where multiprocessing is not imported it's because there is only one process, so processName will not likely be used as all.

(3) In the edge case of shutdown or someone clearing sys.modules the processName will not be pretty, but it will at least be correct.
msg376523 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-09-07 18:25
I just pushed a second PR with the alternative solution - it calculates the name on a best-effort basis, and doesn't import anything that's not already there.
msg376579 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2020-09-08 15:52
We've decided for now to leave the code's behavior as it is (the alternatives are worse) and to keep only the test improvements. See discussion on PR 22142.
msg376591 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2020-09-08 19:40
New changeset 3fd69991f47a6672c510fafd76bf183f17ac52ec by Irit Katriel in branch 'master':
bpo-38762: Extend logging.test_multiprocessing to cover missing cases. (GH-22142)
https://github.com/python/cpython/commit/3fd69991f47a6672c510fafd76bf183f17ac52ec
History
Date User Action Args
2022-04-11 14:59:23adminsetgithub: 82943
2020-09-09 07:54:04vinay.sajipsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-09-08 19:40:08vinay.sajipsetmessages: + msg376591
2020-09-08 15:52:32iritkatrielsetmessages: + msg376579
versions: + Python 3.10, - Python 3.9
2020-09-07 18:25:49iritkatrielsetmessages: + msg376523
2020-09-07 18:24:48iritkatrielsetpull_requests: + pull_request21226
2020-09-07 17:42:21iritkatrielsetmessages: + msg376522
2020-09-07 17:12:13serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg376517
2020-09-03 11:23:16vinay.sajipsetmessages: + msg376295
2020-09-02 11:11:18iritkatrielsetkeywords: + patch
nosy: + iritkatriel

pull_requests: + pull_request21154
stage: patch review
2019-11-11 11:02:53vinay.sajipsetmessages: + msg356357
2019-11-11 04:36:50xtreaksetnosy: + vinay.sajip
2019-11-10 21:53:15Delgancreate