This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Title: Add _Py_memset_s() to securely clear memory
Type: security Stage: resolved
Components: Versions: Python 3.7
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, christian.heimes, gregory.p.smith, pitrou, vstinner
Priority: low Keywords: patch

Created on 2013-03-12 16:54 by christian.heimes, last changed 2022-04-11 14:57 by admin. This issue is now closed.

File name Uploaded Description Edit
pymemsets.patch christian.heimes, 2013-03-13 14:56 review
pymemsets.patch christian.heimes, 2013-06-16 19:38 review
Messages (12)
msg184032 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-03-12 16:54
Compilers like GCC optimize away code like memset(var, 0, sizeof(var)) if the code occurs at the end of a function and var is not used anymore [1]. But security relevant code like hash and encryption use this to overwrite sensitive data with zeros.

The code in _sha3module.c uses memset() to clear its internal state. The other hash modules don't clear their internal states yet.

There exists a couple of solutions for the problem:

 * C11 [ISO/IEC 9899:2011] has a memset_s() function
 * MSVC has SecureZeroMemory()
 * GCC can disable the optimization with #pragma GCC optimize ("O0") since GCC 4.4
 * [2] contains an example for a custom implementation of memset_s() with volatile.


msg184033 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-03-12 16:59
Even if you get the memset to actually run, that's hardly sufficient for security. The OS can could have swapped it to disk.
msg184034 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-03-12 17:09
mlock() can prevent swapping but it may need extra capabilities. A working memset_s() removes critical information from core dumps at least.

If we don't want to add _Py_memset_s() then I'm going to remove the dysfunctional clearstate macro from my sha3 module.
msg184036 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-03-12 17:17
I'm not saying don't add it, just that you can't really "win" in the securely deleting data game unless you have special hardware.
msg184044 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-12 19:19
I'd personally say don't bother with this.  Let people who _need_ this use their own C extension modules to handle all secure data as we're not in a position to make and test any guarantees about what happens to data anywhere within a Python VM.

If this is added, at least document it (comments since its an _internal function) as being best effort with no guarantee that it is better than nothing.
msg184084 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-03-13 14:56
Here is a patch that implements _Py_memset_s() according to C11 with a fallback to memset_s().

My linker fu seems to be weak. I had to use _Py_memset_s() in random.c otherwise the function is removed from the static Python binary. I double-checked with readelf -Ws: the function is defined in the archive file and is available in the shared library. But the linker removes it from the static binary so shared libraries fail to load.
msg184109 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-13 20:29
Right now I don't really see the point of this. The randomized hash is not cryptographically secure, so this sounds like premature securization to me.
msg191284 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-06-16 19:38
I finally figured out why _Py_memset_s() wasn't available inside extension modules. The linker removes object files from the main binary unless one or more symbols from an object files are referenced somewhere. Objects/object.c has a workaround for PyCapsule_Type:

/* Hack to force loading of pycapsule.o */
PyTypeObject *_PyCapsule_hack = &PyCapsule_Type;

I have updated my patch.
msg201825 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-31 15:16
I think I still don't understand the use case within Python. Why would you want to clear the internal state of a hash object?
If you can read the computer's memory, you probably have access to sensitive data already?
msg201853 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-31 22:19
Some comments:

- I don't have small files which just contain one function. Do you expect that we may add other security-related functions? You may add a "pysecurity.c" file. (It's maybe a stupid idea.)

- Why only a few hash functions (sha1, sha3)? We must use the same policy for all hash functions: always force memset() or never use memset().

- Why not touching the ssl module? PySSL_dealloc() and context_dealloc() for example.

- Would it be possible to use a custom memory allocator which would memset() the memory before releasing it for security related objects? If yes, would it be possible to switch it on or off at runtime? It may be interesting if memset() has a visible overhead on performances.

Antoine wrote:

"I think I still don't understand the use case within Python. Why would you want to clear the internal state of a hash object? If you can read the computer's memory, you probably have access to sensitive data already?"

Data are usually duplicated in many places. I'm also dubious that memset() adds any security. If it has no impact on performance, why not using memset() for hash functions and security modules like ssl.

But for example, ssl.RAND_bytes() stores its result in a common bytes object. The bytes type doesn't use a custom memory allocator, and so the secret random bytes will still be present in memory after the bytes object has been "deleted". If you really care of security, you may need a security allocator which reset all memory blocks on free(), not only a few modules.

And what happens when you pass data to a C module which copies the data somewhere. Does it later reset correctly the memory when data becomes useless?

@Christian: Do you have examples of other projects clearing the memory when objects are destroyed?
msg203152 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-11-17 13:53
I don't have enough time to work in this issue before 3.4 beta1.
msg404488 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2021-10-20 16:16
There haven't been any activity on this feature request for eight years. I'm no longer interested to implement my feature request. Closing...
Date User Action Args
2022-04-11 14:57:42adminsetgithub: 61607
2021-10-20 16:16:56christian.heimessetstatus: open -> closed
resolution: rejected
messages: + msg404488

stage: patch review -> resolved
2016-09-24 19:23:24christian.heimessetpriority: normal -> low
versions: + Python 3.7, - Python 3.5
2016-06-12 11:25:27christian.heimessetassignee: christian.heimes ->
2013-11-17 13:53:37christian.heimessetmessages: + msg203152
versions: + Python 3.5, - Python 3.4
2013-10-31 22:19:54vstinnersetmessages: + msg201853
2013-10-31 15:16:53pitrousetmessages: + msg201825
2013-10-31 15:03:14vstinnersetnosy: + vstinner
2013-06-16 19:38:51christian.heimessetfiles: + pymemsets.patch

messages: + msg191284
2013-03-13 20:29:36pitrousetnosy: + pitrou
messages: + msg184109
2013-03-13 14:56:11christian.heimessetfiles: + pymemsets.patch
keywords: + patch
messages: + msg184084

stage: needs patch -> patch review
2013-03-12 19:19:09gregory.p.smithsetmessages: + msg184044
2013-03-12 17:17:11benjamin.petersonsetmessages: + msg184036
2013-03-12 17:09:08christian.heimessetnosy: + gregory.p.smith
messages: + msg184034
2013-03-12 16:59:08benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg184033
2013-03-12 16:54:12christian.heimescreate