This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Use-after-free by mutating set during set operations
Type: crash Stage: patch review
Components: Interpreter Core Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Dennis Sweeney, miss-islington, rhettinger, serhiy.storchaka, tim.peters
Priority: normal Keywords: patch

Created on 2022-02-02 18:01 by Dennis Sweeney, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
picklecrasher.py Dennis Sweeney, 2022-02-19 05:24 Randomized crasher for uses of PyDict_Next in _pickle.c
Pull Requests
URL Status Linked Edit
PR 31120 merged Dennis Sweeney, 2022-02-04 04:24
PR 31284 merged miss-islington, 2022-02-11 20:09
PR 31312 merged Dennis Sweeney, 2022-02-13 02:47
Messages (13)
msg412386 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-02 18:01
Maybe related to https://bugs.python.org/issue8420

Somewhat obscure, but using only standard Python, and no frame- or gc-hacks, it looks like we can get a use-after-free:

from random import random

BADNESS = 0.0

class Bad:
    def __eq__(self, other):
        if random() < BADNESS:
            set1.clear()
        if random() < BADNESS:
            set2.clear()
        return True
    def __hash__(self):
        return 42

SIZE = 100
TRIALS = 10_000

ops = [
    "|", "|=",
    "==", "!=",
    "<", "<=",
    ">", ">=",
    # "&",  # crash!
    # "&=", # crash!
    "^",
    # "^=", # crash
    # "-", # crash
    "-=",
]

for op in ops:
    stmt = f"set1 {op} set2"
    print(stmt, "...")
    for _ in range(TRIALS):
        BADNESS = 0.00
        set1 = {Bad() for _ in range(SIZE)}
        set2 = {Bad() for _ in range(SIZE)}
        BADNESS = 0.02
        exec(stmt)
    print("ok.")
msg412387 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-02 18:12
replacing `return True` with `return random() < 0.5` makes *all* of the operations crash, except for `|` and `|=`.
msg412389 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-02 18:17
set1.isdisjoint(set2) also crashes
msg412444 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2022-02-03 14:22
The likely culprit is the set_next() loop.  Perhaps it is never safe to use set_next() because any lookup can callback to __eq__ which can mutate the set.

Since set_isdisjoint() method isn't a mutating method, that is the easiest place to start investigating.  Try disabling the exact set fast path to see if the issue persists.
msg412448 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2022-02-03 16:42
Presumably _PyDict_Next is also suspect.  Even the advertised "safe" calls to PyDict_SetItem() for existing keys would be a trigger.  Calling clear() in either __eq__ or __hash__ would suffice.

If the next loops are the culprint, the new challenge is figuring out how to fix it without wrecking code clarity and performance (and having to deprecate PyDict_Next() which is part of the stable ABI).
msg412487 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2022-02-04 01:19
Marking as low priority given that ehe next loop code has been deployed without incident for two decades (a little less for sets and a little more for dicts).
msg412489 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-04 03:12
It looks like usages of the PyDict_Next API assume the resulting references are borrowed and so INCREF them.

Usages of set_next do not, but should.

It should hopefully be a straightforward fix of adding INCREF/DECREFs.
msg412494 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2022-02-04 05:42
Raised the priority back to normal.

I agree with Dennis's observation that PyDict_Next is safe, provided it's used as intended: it returns borrowed references, but to things that absolutely are legitimate at the time. In the presence of mutations, *what* it returns isn't defined at all, but I don't see a way for it to blow up (unless its caller screws up by believing it owns the references). It doesn't assume anything about the structure of the dict across calls.
msg413085 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-11 16:25
New changeset 4a66615ba736f84eadf9456bfd5d32a94cccf117 by Dennis Sweeney in branch 'main':
bpo-46615: Don't crash when set operations mutate the sets (GH-31120)
https://github.com/python/cpython/commit/4a66615ba736f84eadf9456bfd5d32a94cccf117
msg413097 - (view) Author: miss-islington (miss-islington) Date: 2022-02-11 20:44
New changeset 1f5fe9962f768c8bfd4ed06a22532d31d3424dc9 by Miss Islington (bot) in branch '3.10':
bpo-46615: Don't crash when set operations mutate the sets (GH-31120)
https://github.com/python/cpython/commit/1f5fe9962f768c8bfd4ed06a22532d31d3424dc9
msg413175 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-02-13 10:30
New changeset c31b8a97a8a7e8255231c9e12ed581c6240c0d6c by Dennis Sweeney in branch '3.9':
bpo-46615: Don't crash when set operations mutate the sets (GH-31120) (GH-31312)
https://github.com/python/cpython/commit/c31b8a97a8a7e8255231c9e12ed581c6240c0d6c
msg413176 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-02-13 10:41
Thanks Dennis for your report and PRs.

Do you mind to analyze also uses of _PySet_NextEntry(), PyDict_Next() and _PyDict_Next()? Many of them look safe, but _pickle.c seems vulnerable.
msg413530 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2022-02-19 05:24
It does look like there are some pickle situations that crash. Attached is a randomized crasher. I haven't done too much careful reasoning about it, but adding INCREFs everywhere seems to fix most of the issues.
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90773
2022-04-06 17:05:20serhiy.storchakaunlinkissue46721 dependencies
2022-02-19 05:24:06Dennis Sweeneysetfiles: + picklecrasher.py

messages: + msg413530
2022-02-13 10:41:21serhiy.storchakasetmessages: + msg413176
2022-02-13 10:30:20serhiy.storchakasetmessages: + msg413175
2022-02-13 02:47:09Dennis Sweeneysetpull_requests: + pull_request29473
2022-02-11 20:44:24miss-islingtonsetmessages: + msg413097
2022-02-11 20:09:01miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request29444
2022-02-11 16:25:17Dennis Sweeneysetmessages: + msg413085
2022-02-11 15:40:10serhiy.storchakalinkissue46721 dependencies
2022-02-04 05:42:31tim.peterssetpriority: low -> normal
nosy: + tim.peters
messages: + msg412494

2022-02-04 04:24:42Dennis Sweeneysetkeywords: + patch
stage: patch review
pull_requests: + pull_request29301
2022-02-04 03:12:21Dennis Sweeneysetmessages: + msg412489
2022-02-04 01:19:08rhettingersetpriority: normal -> low

messages: + msg412487
2022-02-03 16:42:14rhettingersetmessages: + msg412448
2022-02-03 14:22:04rhettingersetmessages: + msg412444
2022-02-03 09:06:00serhiy.storchakasetnosy: + serhiy.storchaka
2022-02-02 18:17:01Dennis Sweeneysetmessages: + msg412389
2022-02-02 18:15:04Dennis Sweeneysettitle: Segfault in set intersection (&) and difference (-) -> Use-after-free by mutating set during set operations
2022-02-02 18:12:23Dennis Sweeneysetmessages: + msg412387
2022-02-02 18:01:22Dennis Sweeneycreate