classification
Title: marshal load bypass code.__new__ audit event
Type: security Stage: resolved
Components: Versions: Python 3.11, Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: steve.dower Nosy List: gvanrossum, miss-islington, steve.dower, tkmk
Priority: normal Keywords: easy (C), patch

Created on 2020-07-01 06:58 by tkmk, last changed 2021-06-30 19:37 by steve.dower. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 21271 merged tkmk, 2020-07-02 03:24
PR 21300 merged miss-islington, 2020-07-03 20:56
PR 21301 merged miss-islington, 2020-07-03 20:57
PR 26961 merged steve.dower, 2021-06-30 01:28
PR 26970 merged steve.dower, 2021-06-30 16:33
PR 26971 merged steve.dower, 2021-06-30 16:34
PR 26972 merged steve.dower, 2021-06-30 16:46
Messages (13)
msg372733 - (view) Author: Yunfan Zhan (tkmk) * Date: 2020-07-01 06:58
While `code.__new__` is being audited, using `marshal.loads` to create a code object will trigger no events. Therefore, either `marshal.load(s)` event itself should be audited, or `code.__new__` should be triggered when marshal type is TYPE_CODE.

Considering that importing from a pyc file also relys on unmarshalling code objects, and they have already been audited as `import`, I'm also wondering if auditing twice should be avoided for performance.
msg372771 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2020-07-01 17:06
I like using the existing event for unmarshalling code objects, assuming we have all the arguments available.

I'm not sure whether it's worth auditing all marshal.load() calls (just as we don't audit all pickle.load() calls). But depending on the code paths we may not have a choice.

Auditing the .pyc import twice (or more) is basically unavoidable, but it's not terrible anyway. When no hooks are set, it's negligible impact, and when someone adds a hook they are in control of how much work it does. Providing both events means they can choose to only handle one and still get the coverage.

Marking this as easy (C), since the hardest part is just going to be finding the right place to add the PySys_Audit call.
msg372876 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2020-07-02 19:51
Actually, a quick search of codeobject.c and a look at tkmk's PR makes it seem like the audit event should be being raised from inside PyCode_NewWithPosOnlyArgs anyway (which IIRC didn't exist when I first added the event, though it was probably there before it was merged).

In general, events should be raised after parameters have been validated, but before any work is done. And when all the other calls feed through a single function, just auditing that one function is enough.
msg372900 - (view) Author: Yunfan Zhan (tkmk) * Date: 2020-07-03 03:08
Before this, we only audit code.__new__ and code.replace, as these methods allow constructing arbitrary code objects, and we don't audit code object coming from the normal way (like compile,exec,eval).
If the event is raised in PyCode_NewWithPosOnlyArgs, is it ok that the compiled code is also audited?
msg372956 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2020-07-03 20:55
Ah, you're right. Thanks for double checking me :)

I'll merge the PR and do the backports. Thanks!
msg372957 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2020-07-03 20:56
New changeset d160e0f8e283d0a8737644588b38e8c6a07c134f by tkmikan in branch 'master':
bpo-41180: Audit code.__new__ when unmarshalling (GH-21271)
https://github.com/python/cpython/commit/d160e0f8e283d0a8737644588b38e8c6a07c134f
msg372959 - (view) Author: miss-islington (miss-islington) Date: 2020-07-03 21:13
New changeset c1d916595eb6979d4d87cc3e5216e26b3c6fac25 by Miss Islington (bot) in branch '3.8':
bpo-41180: Audit code.__new__ when unmarshalling (GH-21271)
https://github.com/python/cpython/commit/c1d916595eb6979d4d87cc3e5216e26b3c6fac25
msg372960 - (view) Author: miss-islington (miss-islington) Date: 2020-07-03 21:16
New changeset 1c776541a8805576c0a4363ca28c1d29423f02f6 by Miss Islington (bot) in branch '3.9':
bpo-41180: Audit code.__new__ when unmarshalling (GH-21271)
https://github.com/python/cpython/commit/1c776541a8805576c0a4363ca28c1d29423f02f6
msg396757 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-29 22:20
I'm going to revert this and replace it with a marshal.loads (and dumps) event instead.

The performance impact on loading .pyc files is too great, as it triggers the hook for each function. Without severely modifying importlib we can't bypass the call that's accessible to Python code, and the important part to detect is marshal calls outside of regular imports.

And wanted to mention that Yunfan Zhan suggested adding these events originally and I chose to go the other way. So my bad, and let's get it right now!
msg396782 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-30 16:22
New changeset 139de04518bd98a975b7c98ab8a38e570dc585e4 by Steve Dower in branch 'main':
bpo-41180: Replace marshal code.__new__ audit event with marshal.load[s] and marshal.dumps (GH-26961)
https://github.com/python/cpython/commit/139de04518bd98a975b7c98ab8a38e570dc585e4
msg396786 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-30 17:52
New changeset a5764d3d96341441d3f70fb5c96a82610a3f4842 by Steve Dower in branch '3.10':
bpo-41180: Replace marshal code.__new__ audit event with marshal.load[s] and marshal.dumps (GH-26970)
https://github.com/python/cpython/commit/a5764d3d96341441d3f70fb5c96a82610a3f4842
msg396787 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-30 17:52
New changeset 863e3d5c7e037b24b8294b041ed7686b522973d8 by Steve Dower in branch '3.9':
bpo-41180: Replace marshal code.__new__ audit event with marshal.load[s] and marshal.dumps (GH-26971)
https://github.com/python/cpython/commit/863e3d5c7e037b24b8294b041ed7686b522973d8
msg396788 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-06-30 17:53
New changeset 95919b0d2744adb87acf696ae1de905cf02a95a6 by Steve Dower in branch 'main':
bpo-41180: Fixes documentation to specify correct event name and add versionchanged (GH-26972)
https://github.com/python/cpython/commit/95919b0d2744adb87acf696ae1de905cf02a95a6
History
Date User Action Args
2021-06-30 19:37:54steve.dowersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-06-30 17:53:16steve.dowersetmessages: + msg396788
2021-06-30 17:52:42steve.dowersetmessages: + msg396787
2021-06-30 17:52:32steve.dowersetmessages: + msg396786
2021-06-30 16:46:03steve.dowersetpull_requests: + pull_request25536
2021-06-30 16:34:19steve.dowersetpull_requests: + pull_request25535
2021-06-30 16:33:43steve.dowersetpull_requests: + pull_request25534
2021-06-30 16:22:07steve.dowersetmessages: + msg396782
2021-06-30 01:28:22steve.dowersetstage: needs patch -> patch review
pull_requests: + pull_request25527
2021-06-29 22:20:12steve.dowersetstatus: closed -> open

assignee: steve.dower
versions: + Python 3.11
nosy: + gvanrossum

messages: + msg396757
resolution: fixed -> (no value)
stage: resolved -> needs patch
2021-03-05 17:22:18tkmksetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-07-03 21:16:26miss-islingtonsetmessages: + msg372960
2020-07-03 21:13:36miss-islingtonsetmessages: + msg372959
2020-07-03 20:57:13miss-islingtonsetpull_requests: + pull_request20450
2020-07-03 20:56:49miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request20449
2020-07-03 20:56:37steve.dowersetmessages: + msg372957
2020-07-03 20:55:04steve.dowersetmessages: + msg372956
2020-07-03 03:08:53tkmksetmessages: + msg372900
2020-07-02 19:51:16steve.dowersetmessages: + msg372876
2020-07-02 03:24:08tkmksetkeywords: + patch
stage: patch review
pull_requests: + pull_request20421
2020-07-01 17:06:00steve.dowersetkeywords: + easy (C)

messages: + msg372771
2020-07-01 06:58:32tkmkcreate