classification
Title: Use Python memory allocators in external libraries like zlib or OpenSSL
Type: Stage:
Components: Interpreter Core Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: aliles, christian.heimes, haypo, python-dev
Priority: normal Keywords: patch

Created on 2013-06-15 23:27 by haypo, last changed 2013-12-16 23:12 by haypo. This issue is now closed.

Files
File name Uploaded Description
set_custom_alloc.patch haypo, 2013-06-22 22:33
Messages (9)
msg191248 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-06-15 23:27
With PEP 445 and issue #3329, Python will get an API to set up custom memory allocators. Some external libraries also allow setting a custom allocator, which makes it possible to configure how they handle memory. The new functions PyMem_RawMalloc(), PyMem_GetRawAllocators() and PyMem_SetRawAllocators() can be used for that.

The safest option is to reuse custom allocators only if a library allows setting them for a specific function call or a specific object, and not to replace the library's memory allocators globally. For example, the lzma library allows setting memory allocators for a single compressor object: LzmaEnc_Create(&SzAllocForLzma);

We might change the global allocators of a library when Python is not embedded but *is* the application (the standard "python" program).

I don't know yet if it is safe to reuse custom memory allocators.

Windows, for example, has a special behaviour: each DLL (dynamic library) has its own heap, so memory allocated in one DLL cannot be released from another DLL. Would this issue introduce such a bug?
msg191249 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-06-15 23:28
See also the issue #18203: "Replace calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc()".
msg191250 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-06-15 23:43
Incomplete(?) list of external libraries used by Python:

- libffi (_ctypes): has its own memory allocator (dlmalloc), is it used? see also issue #18178
- libmpdec (_decimal): already configured to reuse the PyMem_Malloc() family (see Modules/_decimal/_decimal.c: "Init libmpdec")
- _sqlite (sqlite3): see http://www.sqlite.org/malloc.html
- expat (pyexpat): ?
- zlib (zlib): http://www.zlib.net/manual.html#Usage
- OpenSSL (_ssl, hashlib): CRYPTO_set_mem_functions, http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124
- Tcl/Tk (_tkinter): http://tmml.sourceforge.net/doc/tcl/Alloc.html
- bz2: http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html
- libncurses (curses): ?
- dbm libraries: ndbm, gdbm, db, ... (dbm): ?
- lzma: http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html
- Windows API (_winapi, nt): ?
- readline (readline): ?
msg191610 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-06-21 21:19
It looks like CRYPTO_set_mem_functions() of OpenSSL 1.0.1e-4.fc18 does not work: CRYPTO_set_mem_functions() indirectly calls CRYPTO_malloc(), which sets "allow_customize = 0;", and so CRYPTO_set_mem_functions() does nothing (it just returns 0 instead of 1).

Gdb trace with a modified _ssl module:

#0  0x0000003803463100 in CRYPTO_malloc () from /lib64/libcrypto.so.10
#1  0x0000003803542fae in FIPS_drbg_new () from /lib64/libcrypto.so.10
#2  0x00000038035448e1 in FIPS_drbg_health_check () from /lib64/libcrypto.so.10
#3  0x0000003803542e88 in FIPS_drbg_init () from /lib64/libcrypto.so.10
#4  0x00000038034cf9d1 in RAND_init_fips () from /lib64/libcrypto.so.10
#5  0x0000003803465764 in OPENSSL_init_library () from /lib64/libcrypto.so.10
#6  0x0000003803462c61 in CRYPTO_set_mem_functions () from /lib64/libcrypto.so.10
#7  0x00007ffff135bc6c in PyInit__ssl () at /home/haypo/prog/python/default/Modules/_ssl.c:3180

See the code:
http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/mem.c;h=f7984fa958eb1edd6c61f6667f3f2b29753be662;hb=HEAD#l124
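
The failure mode described above can be illustrated with a small standalone model. This is only a sketch of the latch behaviour reported in this message, not OpenSSL code: every model_* name is hypothetical, and the real CRYPTO_set_mem_functions() takes more arguments. It assumes that the first allocation clears a flag, and that the setter runs a library initializer (which allocates) before checking that flag.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified model of the latch described in the message above.
   All model_* names are hypothetical; none of this is OpenSSL API. */
typedef void *(*model_malloc_fn)(size_t);

static int allow_customize = 1;                 /* cleared by the first allocation */
static model_malloc_fn current_malloc = malloc; /* allocator currently in use */

/* Plays the role of CRYPTO_malloc(): any call freezes the allocator choice. */
static void *model_malloc(size_t n)
{
    allow_customize = 0;
    return current_malloc(n);
}

/* Plays the role of the flag check in CRYPTO_set_mem_functions():
   returns 1 on success, 0 once an allocation has already happened. */
static int model_set_malloc(model_malloc_fn fn)
{
    if (!allow_customize || fn == NULL)
        return 0;
    current_malloc = fn;
    return 1;
}

/* Stand-in for a library initializer that allocates as a side effect,
   as OPENSSL_init_library() does in the gdb trace. */
static void model_init_library(void)
{
    free(model_malloc(16));
}

/* Setter that initializes the library first, matching the trace:
   the initializer allocates, so the flag check that follows fails. */
static int model_set_malloc_with_init(model_malloc_fn fn)
{
    model_init_library();
    return model_set_malloc(fn);
}
```

In this model, model_set_malloc() succeeds only while nothing has been allocated, whereas model_set_malloc_with_init() can never succeed, which is the behaviour reported for the FIPS-enabled build.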
msg191676 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-06-22 22:33
Here is an initial attempt: set a custom allocator for the bz2, lzma and zlib modules. The allocator is only replaced for an instance of a compressor or decompressor; the change does not affect the library globally.

PyMem_RawMalloc() is used instead of PyMem_Malloc() because the GIL is always released.
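
A per-object allocator of this kind can be sketched as follows. This is not the attached patch: demo_stream, demo_alloc, demo_free and demo_stream_init are hypothetical names, the hook signatures are modelled on zlib's zalloc/zfree/opaque fields, and raw_malloc/raw_free stand in for PyMem_RawMalloc()/PyMem_RawFree() so that the block compiles without Python.h. The opaque pointer is used here only to count live allocations for one stream.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for PyMem_RawMalloc()/PyMem_RawFree(); a real extension
   module would call the PyMem_Raw* functions instead (assumption). */
static void *raw_malloc(size_t n) { return malloc(n); }
static void raw_free(void *p) { free(p); }

/* Hypothetical stream type with per-object hooks, shaped like the
   zalloc, zfree and opaque fields of zlib's z_stream. */
typedef struct {
    void *(*zalloc)(void *opaque, unsigned int items, unsigned int size);
    void (*zfree)(void *opaque, void *address);
    void *opaque;
} demo_stream;

/* Allocation hook: items * size bytes, with an overflow guard. */
static void *demo_alloc(void *opaque, unsigned int items, unsigned int size)
{
    size_t *live = opaque;
    void *p;

    /* Refuse requests whose byte count would not fit in size_t. */
    if (size != 0 && items > SIZE_MAX / size)
        return NULL;
    p = raw_malloc((size_t)items * size);
    if (p != NULL && live != NULL)
        ++*live;
    return p;
}

/* Release hook: no return value, matching the libraries' free callbacks. */
static void demo_free(void *opaque, void *address)
{
    size_t *live = opaque;

    if (address != NULL && live != NULL)
        --*live;
    raw_free(address);
}

/* Install the hooks on one stream only; other streams are unaffected. */
static void demo_stream_init(demo_stream *s, size_t *live_counter)
{
    s->zalloc = demo_alloc;
    s->zfree = demo_free;
    s->opaque = live_counter;
}
```

Because the hooks live in the stream object, nothing is changed for other users of the same library in the process, which is the property the first message asks for.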
msg192568 - Author: Roundup Robot (python-dev) Date: 2013-07-07 14:50
New changeset a876d9d2e4fc by Victor Stinner in branch 'default':
Issue #18227: Use PyMem_RawAlloc() in bz2, lzma and zlib modules
http://hg.python.org/cpython/rev/a876d9d2e4fc
msg192571 - Author: Roundup Robot (python-dev) Date: 2013-07-07 15:26
New changeset 12f26c356611 by Victor Stinner in branch 'default':
Issue #18227: "Free" function of bz2, lzma and zlib modules has no return value (void)
http://hg.python.org/cpython/rev/12f26c356611
msg192573 - Author: Roundup Robot (python-dev) Date: 2013-07-07 15:35
New changeset 7f17c67b5bf6 by Christian Heimes in branch 'default':
Issue #18227: pyexpat now uses a static XML_Memory_Handling_Suite. cElementTree uses the same approach since at least Python 2.6
http://hg.python.org/cpython/rev/7f17c67b5bf6
msg206388 - Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-16 23:12
I modified the modules where it was possible and easy to do so. More modules should be modified, but that is trickier. If you are interested, please open new issues.
History
Date User Action Args
2013-12-16 23:12:27 haypo set status: open -> closed; resolution: fixed; messages: + msg206388
2013-07-07 15:35:19 python-dev set messages: + msg192573
2013-07-07 15:26:31 python-dev set messages: + msg192571
2013-07-07 14:50:40 python-dev set nosy: + python-dev; messages: + msg192568
2013-06-22 22:33:52 haypo set files: + set_custom_alloc.patch; keywords: + patch; messages: + msg191676
2013-06-21 21:19:28 haypo set messages: + msg191610
2013-06-16 07:29:52 aliles set nosy: + aliles
2013-06-15 23:43:35 haypo set messages: + msg191250
2013-06-15 23:28:31 haypo set messages: + msg191249
2013-06-15 23:27:49 haypo create