This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: accessing mmap of file that is overwritten causes bus error
Type: crash
Stage:
Components: Library (Lib)
Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: graingert, iritkatriel, mhvk, ronaldoussoren
Priority: normal
Keywords:

Created on 2020-05-21 21:32 by mhvk, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg369543 - (view) Author: Marten H. van Kerkwijk (mhvk) Date: 2020-05-21 21:32
While debugging a strange test failure involving np.memmap, I realized that the following direct use of mmap reliably leads to a bus error. Obviously, mmap'ing a file, closing it, reopening the file for writing without writing anything, and then accessing the mmap again is not something one should do (though a test case did it anyway), but it would nevertheless be nice to avoid a crash!
```
import mmap


with open('test.dat', 'wb') as fh:
    fh.write(b'abcdefghijklmnopqrstuvwxyz')


with open('test.dat', 'rb') as fh:
    mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)


with open('test.dat', 'wb') as fh:
    pass  # Note: if something is written, then I get no bus error.


mm[2]
```
msg369635 - (view) Author: Marten H. van Kerkwijk (mhvk) Date: 2020-05-22 20:08
I should probably have added that the bus error happens on Linux. On Windows, opening the file for writing raises an error, because the file is still open for reading inside the mmap.
msg388337 - (view) Author: Thomas Grainger (graingert) * Date: 2021-03-09 09:26
I can confirm this happens on py3.5-3.10

```
import mmap
import pathlib
import tempfile


def main():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = pathlib.Path(tmp)
        path = tmp_path / "eg"

        path.write_bytes(b"Hello, World!")

        with path.open("rb") as rf:
            mm = mmap.mmap(rf.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
            path.write_bytes(b"")
            bytes(mm)

if __name__ == "__main__":
    main()
```
msg388339 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2021-03-09 09:58
What happens here is that the file is truncated, which (more or less) truncates the memory mapping. Accessing a memory mapping beyond the length of the file results in a SIGBUS signal.
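The mismatch can be observed without touching the mapped memory. The following is a minimal sketch, assuming a POSIX system; the helper name `mapping_vs_file_size` and the file name `eg` are illustrative and not part of the report. The mmap object keeps reporting its original length while `os.fstat` reports the truncated file size.

```python
import mmap
import os
import pathlib
import tempfile


def mapping_vs_file_size():
    """Return (len(mm), file size) after truncating a mapped file.

    Illustrative helper, not part of the original report. It never reads
    from the mapping, so it does not trigger SIGBUS.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "eg"
        path.write_bytes(b"Hello, World!")

        with path.open("r+b") as rf:
            mm = mmap.mmap(rf.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                rf.truncate(0)
                # The mapping still claims 13 bytes; the file now has 0.
                return len(mm), os.fstat(rf.fileno()).st_size
            finally:
                mm.close()


if __name__ == "__main__":
    print(mapping_vs_file_size())  # (13, 0)
```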

I'm not sure there is much Python can do about this, other than shrinking the window for crashes like this by aggressively checking whether the file size has changed (but even then a crash will happen if another process truncates the file between the time the check is done and the time the memory is actually accessed).
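A size check of that kind might look roughly like the sketch below. This is an assumption about what such a check could look like at the caller level, not something the mmap module does; `checked_read` and `demo` are hypothetical names. As noted above, the check and the read are not atomic, so this only narrows the window.

```python
import mmap
import os
import pathlib
import tempfile


def checked_read(mm, fd, start, stop):
    """Read mm[start:stop] only if the file still covers that range.

    Hypothetical helper. The check and the read are not atomic, so another
    process can still truncate the file in between and cause SIGBUS.
    """
    size = os.fstat(fd).st_size
    if stop > size:
        raise ValueError(f"file shrank to {size} bytes; refusing to read up to {stop}")
    return mm[start:stop]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "eg"
        path.write_bytes(b"Hello, World!")

        with path.open("r+b") as rf:
            mm = mmap.mmap(rf.fileno(), 0, access=mmap.ACCESS_READ)
            try:
                before = checked_read(mm, rf.fileno(), 0, 5)
                rf.truncate(0)
                try:
                    checked_read(mm, rf.fileno(), 0, 5)
                    refused = False
                except ValueError:
                    refused = True
                return before, refused
            finally:
                mm.close()


if __name__ == "__main__":
    print(demo())
```

The guard is conservative: it refuses any read that extends past the current file size, even in cases where the access would not actually fault.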

---

Variant of the script that explicitly truncates the file:

```
import mmap
import pathlib
import tempfile


def main():
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = pathlib.Path(tmp)
        path = tmp_path / "eg"

        path.write_bytes(b"Hello, World!")

        with path.open("r+b") as rf:
            mm = mmap.mmap(rf.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
            rf.truncate(0)
            bytes(mm)

if __name__ == "__main__":
    main()
```
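For callers that control the writer, one way to sidestep the truncation is to replace the file rather than rewrite it in place. The sketch below assumes POSIX semantics, where `os.replace` points the name at a new inode and the existing mapping keeps the old one alive. `replace_contents` and `demo` are hypothetical names, and this does not help when some other program truncates the file.

```python
import mmap
import os
import pathlib
import tempfile


def replace_contents(path, data):
    """Write data to a sibling temporary file, then rename it over path.

    Hypothetical helper: the original inode is never truncated, so an
    existing mapping of it stays readable (POSIX assumption).
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "wb") as wf:
            wf.write(data)
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "eg"
        path.write_bytes(b"Hello, World!")

        with path.open("rb") as rf:
            mm = mmap.mmap(rf.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            replace_contents(path, b"")
            # The mapping still sees the old data; the name sees the new file.
            return bytes(mm), path.read_bytes()
        finally:
            mm.close()


if __name__ == "__main__":
    print(demo())
```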
msg404230 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-10-18 22:07
Reproduced on 3.11.
History
Date | User | Action | Args
2022-04-11 14:59:31 | admin | set | github: 84897
2021-10-18 22:07:24 | iritkatriel | set | nosy: + iritkatriel; messages: + msg404230; versions: + Python 3.11, - Python 3.6, Python 3.7, Python 3.8
2021-03-09 09:58:34 | ronaldoussoren | set | nosy: + ronaldoussoren; messages: + msg388339
2021-03-09 09:26:02 | graingert | set | nosy: + graingert; messages: + msg388337; versions: + Python 3.6, Python 3.7, Python 3.9, Python 3.10
2020-05-22 20:08:56 | mhvk | set | messages: + msg369635
2020-05-21 21:32:53 | mhvk | create |