This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [doc] sys.addaudithook() documentation should be more explicit on its limitations
Type: Stage:
Components: Documentation Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: christian.heimes, docs@python, frankli, gousaiyang, steve.dower, vstinner, zkonge
Priority: normal Keywords:

Created on 2021-03-08 20:26 by vstinner, last changed 2022-04-11 14:59 by admin.

Messages (10)
msg388299 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-08 20:26
Recently, the PEP 578 audit hooks was used to build a Capture The Flag (CTF) security challenge, AntCTF x D^3CTF: https://d3ctf.io/

Multiple issues have been reported to the Python Security Response Team (PSRT) from this challenge. It seems like there was a misunderstanding on the intent of the PEP 578.

Building a sandbox using audit hooks is *explicitly* excluded from the PEP 578 design:
https://www.python.org/dev/peps/pep-0578/#why-not-a-sandbox

See also the PEP 551 for more details.

The problem is that these two PEPs are not well summarized in the Python documentation, especially in the sys.addaudithook() documentation:
https://docs.python.org/dev/library/sys.html#sys.addaudithook

The documentation should better describe limitations of audit hooks, and may also point to these two PEPs for more information (PEP 578 is already mentioned).

The bare minimum should be to explicitly say that it should not be used to build a sandbox. 

By design, audit events is a whack a mole game. Rather than starting from a short "allow list", it is based on a "deny list", so it cannot be safe or complete by design. Every "forgotten" audit event can be "abused" to take the control on the application. And that's perfectly *fine*. It should just be documented.
msg388301 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-08 20:32
See also bpo-43439: [security] Add audit events on GC functions giving access to all Python objects.
msg388304 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-03-08 21:06
To clarify my position on this (as the PEP author):

* audit hooks added *after* initialization (including via the Python API) are not intended for security, but for logging/debugging, and so bypasses are not considered security issues
* audit hooks added *before* Python is initialized should not be able to be bypassed *without* prior events indicating that a bypass is going to occur. Ways of bypassing/removing them without prior indicators should be reported as security issues

And note that all compile()d, imported or exec()d code should have been collected, which means any security bypass has to happen without arbitrary code execution.

These hooks are only one tool necessary to create a more secured environment, not the whole thing. (And note that I said "more secured" not "secure", because it's only as secure as you make it. The relative descriptor is deliberate.)
msg388305 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-03-08 21:30
I agree with both of you.

The documention should explicitly state that the audit hooks are for auditing. They are not designed to sandbox Python. When used correctly, they can help to capture and analyze an event post-mortem.

The documentation of sys.addaudithook() should also mention that hooks may not trigger under some conditions, e.g. very late in the interpreter shutdown phase.
msg388313 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-08 22:05
Another example of recent Capture The Flag challenge which used audit hooks: https://bugs.python.org/issue42800#msg384143
msg388462 - (view) Author: Frank (frankli) Date: 2021-03-10 21:59
PEP 551 is confusing. It looked suggesting that it's a "security tool" that "detects, identifies and analyzes misuse of Python" to me (and apparently many others).

examples shown in the PEP includes WannaCrypt, APTs, all of which involves the good old remote code execution, which is basically a sandboxed environment it self, at least in some way.

also, the challenges provided the contestants with a "background story" that enables an attacker to execute arbitrary code doesn't mean that one HAVE to gain code execution to achieve the goal of bypassing the aevents. in this case, one only have to find the list object which contains the audit hooks registered, and clear it(or replace it). this clearly breaks the promise made in PEP 578 (Hooks cannot be removed or replaced). THIS SHOULD BE FIXED.

ALSO(again), the software is not always doing what it's designed to do. maybe, I mean maybe, developers should make changes according to what users are doing. I don't know, really.
msg388476 - (view) Author: Saiyang Gou (gousaiyang) * Date: 2021-03-11 03:16
We understand that audit hooks should not be used to build a sandbox with Python. It is natural for audit hooks to appear in CTF challenges though, as many CTF challenges intentionally try to use a wrong way to secure a system (and let players prove it wrong).

With that being said, audit hooks should still be robust, even for logging purposes. We are no trying to prevent all kinds of malicious behaviors, but we want to detect them *as much as possible*. If audit hooks can be easily removed while triggering very few seemingly non-sensitive audit events (in this CTF challenge, only "import gc" is triggered, which probably looks "no so suspicious"), this allows attackers to hide details of further malicious behavior without being audited, which violated the motivation of audit hooks (to increase security transparency).

The recent gc patch introduced new events which will make the attack in that CTF challenge look more suspicious. But probably it is still better to harden the current data structure used to store per interpreter audit hooks. If an attacker happens to gain a reference to the list holding the hooks (although I'm not sure how that will still be possible without using `gc`), they can easily remove the hooks at the Python language level. Probably a Python tuple is already better than a Python list to store the hooks, since tuples are immutable at the language level. Although that means we should build new a tuple each time a new hook is added.

If the hook itself is fragile (e.g. a hook written in Python which relies on global variables), it is a user fault. But if the hook function itself is good, it shouldn't be too easy to remove. Any successful attempts to remove the hook must have already "pwned" the Python interpreter (i.e. gained arbitrary memory read/write or native code execution ability), either by using ctypes, by open('/proc/self/mem'), by crafting bytecode (which triggers code.__new__) or importing modules written in native code. (Overwriting hook.__code__ triggers object.__setattr__.)
msg388490 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-03-11 09:31
Python's dynamic nature makes it hard to implement and reason about audit hooks written in Python. sys.addaudithook() is really only design for testing, debugging, and playing around with auditing. You absolutely have to write a custom interpreter if you want to take auditing serious.

Please also keep in mind that sys.addaudithook() does **not** add a global hook. The function adds a per-interpreter hook. It just looks global to most people because a process typically has just one interpreter. I have filed bpo-43472 to track the issue.

$ cat auditsub.py 
import sys
import _xxsubinterpreters

def hook(*args):
    print(args)

sys.addaudithook(hook)

import os
os.system('echo main interpreter')

sub = _xxsubinterpreters.create()
_xxsubinterpreters.run_string(sub, "import os; os.system('echo you got pwned')", None)

$ ./python auditsub.py 
('os.system', (b'echo main interpreter',))
main interpreter
you got pwned
msg388517 - (view) Author: Saiyang Gou (gousaiyang) * Date: 2021-03-11 19:10
> Please also keep in mind that sys.addaudithook() does **not** add a global hook. The function adds a per-interpreter hook.

Yes, I'm aware of this. And this should be better documented. When I was playing around with audit hooks and reading the source code, I see hooks written in Python (added via sys.addaudithook) are per-interpreter (stored in a Python list object as part of per-interpreter state) while hooks written in C (added via PySys_AddAuditHook) are global (stored in a C linked list). C hooks are more robust. The Python hooks are not inherited in a child process created via multiprocessing when the start method is "spawn" or "forkserver" (not "fork") since they are per-interpreter. I'm not sure whether we should audit the creation of new processes/threads via multiprocessing/threading modules (and also creation of sub-interpreters when PEP 554 gets accepted).
msg388569 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2021-03-12 23:55
If someone offers a patch for replacing the list of per-interpreter hooks with something not easily discoverable via gc, I'm sure we'd take it. It's all internal, so just hiding it from the list of bases should be fine (there should never be more than one reference), but turning it into a new form of dynamically expanding array/list would also work, providing whoever reviews it doesn't think it's adding too much complexity/instability.

But file that as a separate issue, please. It's somewhat debatable, while fixing the docs _needs_ to happen. (And I will get to it eventually if nobody else does, but I'm just as happy to review someone else's contribution. And MUCH happier to review a contribution than to argue about it on the tracker :) )
History
Date User Action Args
2022-04-11 14:59:42adminsetgithub: 87604
2021-03-12 23:55:44steve.dowersetmessages: + msg388569
2021-03-11 19:10:29gousaiyangsetmessages: + msg388517
2021-03-11 09:31:26christian.heimessetmessages: + msg388490
2021-03-11 03:16:07gousaiyangsetnosy: + gousaiyang
messages: + msg388476
2021-03-10 21:59:01franklisetnosy: + frankli
messages: + msg388462
2021-03-09 14:28:32zkongesetnosy: + zkonge
2021-03-08 22:05:09vstinnersetmessages: + msg388313
2021-03-08 21:30:46christian.heimessetmessages: + msg388305
2021-03-08 21:06:56steve.dowersetmessages: + msg388304
2021-03-08 20:32:43vstinnersetmessages: + msg388301
2021-03-08 20:26:14vstinnercreate