Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test leaks of memory not managed by Python allocator #79237

Open
serhiy-storchaka opened this issue Oct 23, 2018 · 14 comments
Open

Test leaks of memory not managed by Python allocator #79237

serhiy-storchaka opened this issue Oct 23, 2018 · 14 comments
Labels
3.8 only security fixes tests Tests in the Lib/test dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 35056
Nosy @vstinner, @serhiy-storchaka, @zhangyangyu, @erlend-aasland
Files
  • patch.diff
  • patch-with-simple-msize.diff: PoC, take 2
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-10-23.19:36:59.714>
    labels = ['3.8', 'type-feature', 'tests']
    title = 'Test leaks of memory not managed by Python allocator'
    updated_at = <Date 2021-04-16.09:55:04.696>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-04-16.09:55:04.696>
    actor = 'erlendaasland'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2018-10-23.19:36:59.714>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['49890', '49963']
    hgrepos = []
    issue_num = 35056
    keywords = ['patch']
    message_count = 14.0
    messages = ['328336', '328346', '328520', '328530', '328531', '328532', '328533', '389039', '391125', '391130', '391170', '391171', '391178', '391181']
    nosy_count = 4.0
    nosy_names = ['vstinner', 'serhiy.storchaka', 'xiang.zhang', 'erlendaasland']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue35056'
    versions = ['Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    Would be nice to add a possibility to test memory leaks if memory is allocated not by Python allocators, but inside external libraries. This would allow to catch leaks on the bridge between Python and external libraries. See for example bpo-34794.

    @serhiy-storchaka serhiy-storchaka added 3.8 only security fixes tests Tests in the Lib/test dir type-feature A feature request or enhancement labels Oct 23, 2018
    @vstinner
    Copy link
    Member

    You can try to use Valgrind for that?

    @serhiy-storchaka
    Copy link
    Member Author

    Can it be used on buildbots and in CI tests?

    @vstinner
    Copy link
    Member

    Can it be used on buildbots and in CI tests?

    These kind of tool produces a lot of false alarms :-( They are also hard to use.

    @vstinner
    Copy link
    Member

    Before seeing how to automate it, first someone should try to detect the bpo-34794 memory leak manually ;-)

    By the way, when I implemented the PEP-445, I tried to configure OpenSSL to use Python memory allocators (to benefit of tracemalloc), but the OpenSSL API for that didn't work at all: bpo-18227.

    @vstinner
    Copy link
    Member

    About automation, regrtest -R checks memory leaks using sys.getallocatedblocks(). But this function only tracks PyMem_Malloc() and PyObject_Malloc() (which is now technically the same memory allocator, since Python 3.6: https://docs.python.org/dev/c-api/memory.html#default-memory-allocators )

    I tried to track PyMem_RawMalloc() using regrtest but... the raw memory allocator is not really "deterministic", it's hard to get reliable behavior. See: bpo-26850.

    Tracking memory leaks is an hard topic :-) I'm happy that I finally fixed tracemalloc to track properly objects in free lists: bpo-35053!

    @vstinner
    Copy link
    Member

    It seems like the easiest thing to do thta would directly benefit (to tracemalloc users) is to continue the implementation of bpo-18227:

    • _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()
    • _ssl: try again CRYPTO_set_mem_functions()

    Python modules already using Python memory allocators:

    • zlib: "zst.zalloc = PyZlib_Malloc" which calls PyMem_RawMalloc
    • _decimal: mpd_mallocfunc = PyMem_Malloc
    • _lzma: "self->alloc.alloc = PyLzma_Malloc" which calls PyMem_RawMalloc
    • pyexpat: XML_ParserCreate_MM(encoding, &ExpatMemoryHandler,...) with ExpatMemoryHandler = {PyObject_Malloc, PyObject_Realloc, PyObject_Free}
    • _bz2: "bzalloc = BZ2_Malloc" which calls PyMem_RawMalloc()

    Using Python memory allocators gives access to Python builtin "memory debugger", even in release mode using PYTHONMALLOC=debug or -X dev.

    @erlend-aasland
    Copy link
    Contributor

    _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()

    SQLite requires the xSize member of sqlite3_mem_methods to be implemented for this to work, so we'd have to implement msize(). The msize() idea seems to have been rejected in PEP-445, though it mentions using debug hooks to implement it.

    See also https://www.sqlite.org/c3ref/mem_methods.html

    Anyway, attached is a PoC patch with a fixed 10k mem pool for the sqlite3 module :)

    @erlend-aasland
    Copy link
    Contributor

    Victor, Serhiy: Would this issue be "large" or important enough to re-raise the debate about implementing an msize() function in the PyMem_ API, or is it not worth it? I guess no; it has been discussed numerous times before. Else, I can start a thread on Discourse.

    @vstinner
    Copy link
    Member

    Multiple C library don't provide msize() function. If you seriously consider to add it, you should conduct a study on all platforms.

    @serhiy-storchaka
    Copy link
    Member Author

    No, I did not mean using msize() or something like. Since memory is managed outside of Python, we have no a list of allocated blocks.

    I meant that we can get the total memory used by the Python process (using OS-specific methods) and compare it between iterations. If it continues to grow, there is a leak. It perhaps is not able to detect small leaks (less than the page size), but large leaks are more important.

    @erlend-aasland
    Copy link
    Contributor

    The msize() talk is referring to msg389039 and msg328533. If we are to use Python memory allocators for the sqlite3 extension module, we need some sort of msize() function; overriding SQLite's memory functions requires msize() support.

    @vstinner
    Copy link
    Member

    patch-with-simple-msize.diff: I suggest you opening a new issue to propose using the Python memory allocators in the sqlite module. This issue is about C extension modules which don't use the Python memory allocator.

    @erlend-aasland
    Copy link
    Contributor

    This issue is about C extension modules which don't use the Python memory allocator.

    Yes, I know. Your proposal in msg328533 is to continue the implementation of bpo-18227:

    _sqlite: call sqlite3_config(SQLITE_CONFIG_MALLOC, pMem) to use PyMem_RawMalloc()

    Thus, I thought using this issue would be ok, but I can split the sqlite3 details out in a separate issue. Using sqlite3_config(SQLITE_CONFIG_MALLOC, ...) _requires_ msize().

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes tests Tests in the Lib/test dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants