classification
Title: Test leaks of memory not managed by Python allocator
Type: enhancement Stage:
Components: Tests Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: erlendaasland, serhiy.storchaka, vstinner, xiang.zhang
Priority: normal Keywords: patch

Created on 2018-10-23 19:36 by serhiy.storchaka, last changed 2021-04-16 09:55 by erlendaasland.

Files
File name Uploaded Description
patch.diff erlendaasland, 2021-03-18 20:03
patch-with-simple-msize.diff erlendaasland, 2021-04-16 07:56 PoC, take 2
Messages (14)
msg328336 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-23 19:36
It would be nice to be able to test for memory leaks when memory is allocated not by Python allocators but inside external libraries. This would allow catching leaks on the bridge between Python and external libraries. See, for example, issue34794.
msg328346 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-23 21:42
Could you try using Valgrind for that?
msg328520 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-10-26 08:09
Can it be used on buildbots and in CI tests?
msg328530 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-26 10:18
> Can it be used on buildbots and in CI tests?

These kinds of tools produce a lot of false alarms :-( They are also hard to use.
msg328531 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-26 10:20
Before seeing how to automate it, first someone should try to detect the bpo-34794 memory leak manually ;-)

By the way, when I implemented PEP 445, I tried to configure OpenSSL to use Python memory allocators (to benefit from tracemalloc), but the OpenSSL API for that didn't work at all: bpo-18227.
msg328532 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-26 10:27
About automation, regrtest -R checks memory leaks using sys.getallocatedblocks(). But this function only tracks PyMem_Malloc() and PyObject_Malloc() (which are now technically the same memory allocator, since Python 3.6: https://docs.python.org/dev/c-api/memory.html#default-memory-allocators )
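
As a rough illustration of the idea (not regrtest's actual implementation), the block count can be compared before and after repeated calls of a test function. The helper name allocated_block_deltas, the leaky() function and the warmup/repeat counts are hypothetical, and sys.getallocatedblocks() is CPython-specific:

```python
import gc
import sys


def allocated_block_deltas(func, warmups=3, repeats=5):
    """Call func repeatedly and return the change in allocated blocks per run."""
    for _ in range(warmups):
        func()  # let caches and free lists settle before measuring
    deltas = []
    for _ in range(repeats):
        gc.collect()
        before = sys.getallocatedblocks()
        func()
        gc.collect()
        deltas.append(sys.getallocatedblocks() - before)
    return deltas


_leaked = []


def leaky():
    # Hypothetical leak: keeps 100 new objects alive on every call.
    _leaked.extend(object() for _ in range(100))


if __name__ == "__main__":
    print(allocated_block_deltas(leaky))
```

Deltas that stay positive on every run hint at a leak. Memory obtained by an external library through the C malloc() never shows up in this counter, which is the limitation this issue is about.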

I tried to track PyMem_RawMalloc() using regrtest but... the raw memory allocator is not really "deterministic", it's hard to get reliable behavior. See: bpo-26850.

Tracking memory leaks is a hard topic :-) I'm happy that I finally fixed tracemalloc to properly track objects in free lists: bpo-35053!
msg328533 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-10-26 10:39
It seems like the easiest thing to do that would directly benefit tracemalloc users is to continue the implementation of bpo-18227:

* _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()
* _ssl: try CRYPTO_set_mem_functions() again

Python modules already using Python memory allocators:

* zlib: "zst.zalloc = PyZlib_Malloc" which calls PyMem_RawMalloc
* _decimal: mpd_mallocfunc = PyMem_Malloc
* _lzma: "self->alloc.alloc = PyLzma_Malloc" which calls PyMem_RawMalloc
* pyexpat: XML_ParserCreate_MM(encoding, &ExpatMemoryHandler,...) with ExpatMemoryHandler = {PyObject_Malloc, PyObject_Realloc, PyObject_Free}
* _bz2: "bzalloc = BZ2_Malloc" which calls PyMem_RawMalloc()

Using Python memory allocators gives access to Python's built-in "memory debugger", even in release mode, using PYTHONMALLOC=debug or -X dev.
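
A small sketch of what this buys tracemalloc users, assuming zlib routes its internal buffers through PyMem_RawMalloc as listed above. The helper name traced_growth is hypothetical, and the exact number of bytes depends on the zlib build:

```python
import tracemalloc
import zlib


def traced_growth(factory):
    """Return how many traced bytes are added by creating factory()'s object."""
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        obj = factory()  # keep the object alive while measuring
        after, _peak = tracemalloc.get_traced_memory()
        del obj
        return after - before
    finally:
        tracemalloc.stop()


if __name__ == "__main__":
    # The compressor's internal window and hash tables are allocated by
    # zlib itself, yet they are visible because zlib calls back into Python.
    print(traced_growth(zlib.compressobj))
```

A module whose library calls the C malloc() directly would show only the size of its small Python wrapper object here.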
msg389039 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-03-18 20:03
> _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()

SQLite requires the xSize member of sqlite3_mem_methods to be implemented for this to work, so we'd have to implement msize(). The msize() idea seems to have been rejected in PEP 445, though it mentions using debug hooks to implement it.

See also https://www.sqlite.org/c3ref/mem_methods.html

Anyway, attached is a PoC patch with a fixed 10k mem pool for the sqlite3 module :)
msg391125 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-04-15 10:13
Victor, Serhiy: Would this issue be "large" or important enough to re-raise the debate about implementing an msize() function in the PyMem_ API, or is it not worth it? I guess not; it has been discussed numerous times before. Otherwise, I can start a thread on Discourse.
msg391130 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-15 12:33
Multiple C libraries don't provide an msize() function. If you are seriously considering adding it, you should conduct a study on all platforms.
msg391170 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-16 07:41
No, I did not mean using msize() or something like that. Since memory is managed outside of Python, we have no list of allocated blocks.

I meant that we can get the total memory used by the Python process (using OS-specific methods) and compare it between iterations. If it continues to grow, there is a leak. It may not be able to detect small leaks (less than the page size), but large leaks are more important.
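
A minimal sketch of that comparison, assuming a Unix system. The helper names process_rss_bytes, rss_samples and keeps_growing, as well as the min_step threshold, are hypothetical. On Linux the current resident set size is read from /proc/self/statm; elsewhere the sketch falls back to the peak RSS from the resource module, which can only grow and is therefore a coarser signal:

```python
import gc
import os
import resource  # Unix only
import sys


def process_rss_bytes():
    """Return the resident set size of this process in bytes."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")
    except OSError:
        # Fallback: peak RSS, reported in bytes on macOS and in KiB on Linux.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak if sys.platform == "darwin" else peak * 1024


def rss_samples(func, warmups=3, repeats=5):
    """Run func repeatedly and record the process RSS after each run."""
    for _ in range(warmups):
        func()
    samples = []
    for _ in range(repeats):
        func()
        gc.collect()
        samples.append(process_rss_bytes())
    return samples


def keeps_growing(samples, min_step):
    """Return True if every sample exceeds the previous one by min_step bytes."""
    return all(b - a >= min_step for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    samples = rss_samples(lambda: None)
    print(samples, keeps_growing(samples, min_step=4096))
```

Requiring growth on every iteration is one possible way to reduce false alarms from allocator caching; a real check would probably need more iterations and some tolerance for noise.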
msg391171 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-04-16 07:46
The msize() talk is referring to msg389039 and msg328533. If we are to use Python memory allocators for the sqlite3 extension module, we need some sort of msize() function; overriding SQLite's memory functions requires msize() support.
msg391178 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-16 09:44
patch-with-simple-msize.diff: I suggest that you open a new issue to propose using the Python memory allocators in the sqlite module. This issue is about C extension modules which don't use the Python memory allocator.
msg391181 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-04-16 09:55
> This issue is about C extension modules which don't use the Python memory allocator.

Yes, I know. Your proposal in msg328533 is to continue the implementation of bpo-18227:

> _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()

Thus, I thought using this issue would be ok, but I can split the sqlite3 details out in a separate issue. Using sqlite3_config(SQLITE_CONFIG_MALLOC, ...) _requires_ msize().
History
Date User Action Args
2021-04-16 09:55:04 erlendaasland set messages: + msg391181
2021-04-16 09:44:39 vstinner set messages: + msg391178
2021-04-16 07:56:14 erlendaasland set files: + patch-with-simple-msize.diff
2021-04-16 07:46:03 erlendaasland set messages: + msg391171
2021-04-16 07:41:01 serhiy.storchaka set messages: + msg391170
2021-04-15 12:33:43 vstinner set messages: + msg391130
2021-04-15 10:13:47 erlendaasland set messages: + msg391125
2021-03-18 20:03:39 erlendaasland set files: + patch.diff; keywords: + patch; messages: + msg389039
2021-02-23 09:27:18 erlendaasland set nosy: + erlendaasland
2018-10-26 10:39:25 vstinner set messages: + msg328533
2018-10-26 10:27:03 vstinner set messages: + msg328532
2018-10-26 10:20:41 vstinner set messages: + msg328531
2018-10-26 10:18:55 vstinner set messages: + msg328530
2018-10-26 08:09:22 serhiy.storchaka set messages: + msg328520
2018-10-24 15:44:09 xiang.zhang set nosy: + xiang.zhang
2018-10-23 21:42:48 vstinner set messages: + msg328346
2018-10-23 19:36:59 serhiy.storchaka create