classification
Title: SharedMemory.close() destroys memory
Type: behavior Stage:
Components: Documentation, Windows Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, eryksun, paul.moore, ronny-rentner, steve.dower, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2022-03-01 11:39 by ronny-rentner, last changed 2022-04-11 14:59 by admin.

Messages (8)
msg414258 - (view) Author: Ronny Rentner (ronny-rentner) Date: 2022-03-01 11:39
According to https://docs.python.org/3/library/multiprocessing.shared_memory.html#multiprocessing.shared_memory.SharedMemory.close if I call close() on a shared memory, it shall not be destroyed.

Unfortunately this is only true for Linux but not for Windows.

I've tested this in a Windows VM on VirtualBox like this:

```
Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import multiprocessing.shared_memory
>>> creator = multiprocessing.shared_memory.SharedMemory(create=True, name='mymemory', size=10000)
>>> creator.buf[0] = 1
>>> creator.buf[0]
1
>>> # According to the docs, close() is not supposed to destroy 'mymemory', but it does.
>>> creator.close()
>>>
>>> user = multiprocessing.shared_memory.SharedMemory(name='mymemory')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\User\AppData\Local\Programs\Python\Python310\lib\multiprocessing\shared_memory.py", line 161, in __init__
    h_map = _winapi.OpenFileMapping(
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'mymemory'
>>> # Shared memory was destroyed by close()
```
msg414259 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-03-01 12:17
Yes, named memory mappings only exist on Windows until the last 
reference is closed, so this is a difference due to the underlying OS. 
The implementation of unlink() recognises this (the entire body is under 
a _USE_POSIX check), but the docs do not reflect it.

The only workaround I can think of would be to create a real file, then 
open it with mmap and give it a tagname, then pass that tagname as the 
name argument of the shared memory object when creating it. It's a bit 
of a pain, but I don't think there's any option on our side given the 
way the API has been designed.

It's not possible for N SharedMemory instances to independently agree 
which one will ignore the close() call and do it in unlink() instead, 
and in any case that still wouldn't "unlink" the name until all other 
references had also been closed.

Can you tell us a bit more about what you're trying to achieve? Perhaps 
knowing how this is actually being relied upon will inspire some ideas 
for how to make it work better.
msg414260 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-01 13:03
> Yes, named memory mappings only exist on Windows until the last 
> reference is closed, so this is a difference due to the underlying OS. 

That's true for the Windows API, so it's true for all practical purposes. In the underlying NT API, creating a permanent kernel object is possible by setting OBJ_PERMANENT in the initial object attributes [1], or subsequently via the undocumented system call NtMakePermanentObject(handle). Creating a permanent object requires SeCreatePermanentPrivilege, which normally is granted to just the SYSTEM account. An administrator can grant this privilege to any user, group, or well-known group, but creating permanent objects should generally be limited to drivers and system services. An object can be reverted back to a temporary object via NtMakeTemporaryObject(handle) [2].

A named section object (i.e. file mapping) can also be created as a global name, i.e. r"Global\{object name}", which is accessible to all sessions. This requires SeCreateGlobalPrivilege, which by default is granted to system service accounts and administrators. This is separate from whether the section is temporary or permanent, but a permanent section object is more likely to be needed in the global namespace.

---

[1] https://docs.microsoft.com/en-us/windows/win32/api/ntdef/nf-ntdef-initializeobjectattributes
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-zwmaketemporaryobject
msg414324 - (view) Author: Ronny Rentner (ronny-rentner) Date: 2022-03-02 03:51
Thanks for your quick response.

My bigger scope is real time audio and video processing where I use multiple processes that need to be synchronized. I use shared memory for that.

As a small spin off, I've hacked together a dict that is using shared memory as a storage.

It works like this: It uses one shared memory space for streaming updates. This is efficient because only changes are transferred. Once the streaming shared memory buffer is full or if any single update to the dict is larger than the streaming buffer, it creates a full dump of the whole dict in a new shared memory that is just as big as needed. Any user of the dict would then consume the full dump.

On Linux that works great. Any user of the dict can create a full dump in a new shared memory and all other users of the same dict can consume it.

On Windows, the issue is if the creator process of the full dump goes away, the shared memory goes away. This is in contrast to the Python docs, unfortunately.

I don't fully understand the underlying implementations, but I've been looking at https://docs.microsoft.com/en-us/dotnet/standard/io/memory-mapped-files and I understand there are 2 main modes.

The persistent mode sounds just like Python shared memory also works on Linux (where I can have these /dev/shm/* files even after the Python process ends) but I think on Windows, Python is not using the persistent mode and thus the shared memory goes away, in contrast to how it works on Linux.

PS: You can find the code for this shared dict here: https://github.com/ronny-rentner/UltraDict - Please note, it's an early hack and not well tested.
msg414331 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-02 07:33
> The persistent mode sounds just like Python shared memory also works
> on Linux (where I can have these /dev/shm/* files even after the 
> Python process ends) but I think on Windows, Python is not using 
> the persistent mode and thus the shared memory goes away, in contrast
> to how it works on Linux.

Unix has the notion that everything is a file as a central organizing concept. As such, Linux opts to implement POSIX shm_open() [1] with a tmpfs [2] filesystem that's mounted on /dev/shm. tmpfs filesystems exist in virtual memory. They're not persistent in the sense of being created on a physical disk that provides persistent storage.

Windows has the notion that everything is an object as a central organizing concept. Internally, the system has a "\" root object directory, which contains other object directories and object symbolic links (unrelated to filesystem symlinks). Each object directory, including the root directory, contains named kernel objects. 

Named device objects are normally created in r"\Device", such as r"\Device\HarddiskVolume2", and global symbolic links to devices are created in r"\GLOBAL??", such as r"\GLOBAL??\C:" -> r"\Device\HarddiskVolume2" for the "C:" drive. The root registry key object is r"\REGISTRY", which contains dynamic key objects such as r"\REGISTRY\MACHINE" (referenced via the pseudohandle HKEY_LOCAL_MACHINE) and "\REGISTRY\USER" (referenced via the pseudohandle HKEY_USERS), which in turn contain other keys such as "\REGISTRY\MACHINE\SOFTWARE" on which registry hives are mounted.

For naming global kernel objects and session 0 objects, the Windows API internally uses the directory r"\BaseNamedObjects". For interactive Windows sessions, it uses r"\Sessions\<session number>\BaseNamedObjects" and, for app containers, subdirectories of r"\Sessions\<session number>\AppContainerNamedObjects". To explicitly name an object in the global directory, use the path r"Global\<object name>". The "Global" prefix is implemented as an object symbolic link to r"\BaseNamedObjects". Of course, that's an internal implementation detail; the API just refers to the r"Global\\" prefix.

Naming a kernel object in r"\BaseNamedObjects" is nearly equivalent to creating a file in a tmpfs filesystem in Linux, with one major difference. Objects are reference counted, with a handle reference count and a pointer reference count. Opening a handle increments both counters, but kernel code can use just a pointer reference instead of opening a handle. By default, objects are temporary. As such, when their pointer reference count is decremented to 0, they get unlinked from the object namespace, if they're named objects, and deallocated.
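The temporary-object behaviour described above can be sketched as a toy model. This is only an illustration: `ObjectNamespace` and its methods are hypothetical names, not a Windows API, and the separate handle and pointer counts are collapsed into a single counter for brevity.

```python
class ObjectNamespace:
    """Toy stand-in for an object directory such as \\BaseNamedObjects."""

    def __init__(self):
        # name -> [payload, reference count]
        self._objects = {}

    def create(self, name, payload):
        if name in self._objects:
            raise FileExistsError(name)
        self._objects[name] = [payload, 1]  # the creator holds one reference
        return name

    def open(self, name):
        if name not in self._objects:
            raise FileNotFoundError(name)
        self._objects[name][1] += 1
        return name

    def close(self, name):
        entry = self._objects[name]
        entry[1] -= 1
        if entry[1] == 0:
            # Temporary object: unlinked from the namespace and deallocated
            # as soon as the last reference goes away.
            del self._objects[name]

    def exists(self, name):
        return name in self._objects


ns = ObjectNamespace()
ns.create("mymemory", bytearray(16))   # first reference
ns.open("mymemory")                    # second reference
ns.close("mymemory")                   # one reference left
still_there = ns.exists("mymemory")
ns.close("mymemory")                   # last reference closed
gone = not ns.exists("mymemory")
print(still_there, gone)
```

In this model there is no separate "unlink" step: the name disappears as a side effect of the final close, which is the difference from a tmpfs file.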

It's possible to create a permanent object in the object namespace, either initially via the OBJ_PERMANENT attribute [3], or later on via NtMakePermanentObject(handle). However, making an object permanent is considered a super privilege in Windows, so privileged in fact that the Windows API doesn't support this capability, at least not as far as I know. By default, SeCreatePermanentPrivilege is only granted to the SYSTEM account. Also, creating a named section object (i.e. file mapping object) in a session-restricted directory such as r"\BaseNamedObjects" requires SeCreateGlobalPrivilege, which is only granted by default to administrators and system service accounts. 

Unix and Linux are less concerned about creating 'permanent' files and global shared memory in virtual filesystems such as tmpfs. The lifetime while the system is running is left up to the creator, which has to manually remove the file, such as via shm_unlink() [4]. Stale files could accumulate in "/dev/shm" when processes crash, which is why a separate resource tracker is required, implementing what the Windows kernel provides by default.
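A minimal sketch of how a script could check this lifetime difference at runtime. The helper name `survives_close` is made up for this example; it uses only the documented `SharedMemory` API and lets Python pick a unique block name.

```python
import sys
from multiprocessing.shared_memory import SharedMemory


def survives_close(size: int = 1024) -> bool:
    """Return True if a block can still be attached after its only handle is closed."""
    creator = SharedMemory(create=True, size=size)
    name = creator.name
    creator.buf[0] = 1
    creator.close()  # on Windows this drops the last reference to the section

    try:
        user = SharedMemory(name=name)
    except FileNotFoundError:
        # The block disappeared together with the last handle.
        return False

    try:
        return user.buf[0] == 1
    finally:
        user.close()
        user.unlink()  # POSIX: remove the name explicitly; no-op on Windows


if __name__ == "__main__":
    print(sys.platform, survives_close())
```

On POSIX systems the block is expected to survive until `unlink()` is called; on Windows it is expected to be gone after `close()`, matching the behaviour reported in this issue.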

---
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/shm_open.html
[2] https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt
[3] https://docs.microsoft.com/en-us/windows/win32/api/ntdef/ns-ntdef-_object_attributes
[4] https://pubs.opengroup.org/onlinepubs/9699919799/functions/shm_unlink.html
msg414335 - (view) Author: Ronny Rentner (ronny-rentner) Date: 2022-03-02 10:55
Many thanks for your explanation. It makes sense now.

One thing I've learned is that I need to get rid of the resource trackers for shared memory, so I've applied the monkey patch fix mentioned in https://bugs.python.org/issue38119

The issue is that you need shared memory in a multi processing environment, but the resource tracker can hardly know about other processes using the shared memory unless you tell it explicitly.

It looks like the resource tracker is guessing that all users of a shared memory are somehow forked from the same main process but I'm using spawn and not fork because my software should run on Windows as well.

Regarding /dev/shm on Linux: As far as I know there's an option 'RemoveIPC' to the systemd daemon that will delete shared memory on logout of the user which is turned on by default. I also remember seeing cronjobs for cleaning up /dev/shm on Debian, but not sure if that is the current approach anymore.
msg414338 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-03-02 11:25
Eryk's post is useful background information, but not helpful for this particular case ;)

From Windows's POV, there is no "creating" process of the shared memory. If it's going away, it's because none of the other processes are keeping it open - simple refcounting. That may be knowledge you can use to work around it.
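As a sketch of the refcounting point, assuming some other party attaches before the creator closes: the block then stays reachable by name. The function name and the `keeper` variable are hypothetical. In a real application the keeper would be a long-lived process rather than a second object in the same process.

```python
from multiprocessing.shared_memory import SharedMemory


def read_after_creator_closes() -> int:
    creator = SharedMemory(create=True, size=16)
    creator.buf[0] = 42

    # A second reference, taken while the creator still has the block open.
    keeper = SharedMemory(name=creator.name)

    # The creator goes away; the keeper's reference keeps the block alive.
    creator.close()

    # A late user can still attach by name.
    late = SharedMemory(name=keeper.name)
    value = late.buf[0]
    late.close()

    # Final cleanup: closing the last reference frees the block on Windows,
    # and unlink() removes the name on POSIX.
    keeper.close()
    keeper.unlink()
    return value


if __name__ == "__main__":
    print(read_after_creator_closes())
```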

Alternatively, the workaround I suggested in my first post could also help. If the multiprocessing SharedMemory object is critical, you need the extra steps, but if all the processes can receive a filesystem path instead of however they're getting the reference today, you can use a real file with mmap to achieve exactly the same thing. (All that's missing is that SharedMemory won't take an open file object or a path, which is reasonable, but not helpful right here.)
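A rough sketch of the plain-file variant, assuming every process can be handed the same filesystem path. The helper names and the temporary path are illustrative only. It uses `mmap` on a real file instead of `SharedMemory`, so the data lives as long as the file does.

```python
import mmap
import os
import tempfile


def create_backing_file(path: str, size: int) -> None:
    """Create a zero-filled file that serves as the shared block."""
    with open(path, "wb") as f:
        f.truncate(size)


def attach(path: str):
    """Map the whole file read/write; return (file object, mmap)."""
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), 0)  # length 0 maps the entire file


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "shared.bin")
create_backing_file(path, 4096)

# "Creator": write a byte, then close every handle.
f1, writer = attach(path)
writer[0] = 1
writer.flush()
writer.close()
f1.close()

# "User": attach later; the contents are still there because the file is.
f2, reader = attach(path)
first_byte = reader[0]
reader.close()
f2.close()

# Explicit cleanup plays the role of unlink().
os.remove(path)
os.rmdir(tmpdir)
print(first_byte)
```

The trade-off is that cleanup becomes the application's responsibility on every platform, much like `shm_unlink()` on POSIX.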

Switching this to a documentation bug: we should clarify that unlink() has no effect on Windows and shared memory blocks go away when the last SharedMemory object is close()d.
msg414339 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-02 11:33
Putting words into action, here's an example of what a privileged process (e.g. running as SYSTEM) can do if a script or application is written to call the undocumented NT API function NtMakePermanentObject(). A use case would be a script running as a system service that needs a shared-memory section object to persist after the service is stopped or terminated, e.g. such that the section is still accessible if the service is resumed. Anyway, this is just an example of what's possible, from a technical perspective. Also, note that I granted SeCreatePermanentPrivilege to my current user for this example, so it's certainly not required to use the SYSTEM account. An administrator can grant this privilege to any account.

By default, named kernel objects are temporary:

    >>> import os, _winapi, ctypes
    >>> from multiprocessing.shared_memory import SharedMemory
    >>> name = f'spam_{os.getpid()}'
    >>> m = SharedMemory(name, True, 8192)
    >>> m.close()
    >>> try: SharedMemory(name)
    ... except OSError as e: print(e)
    ...
    [WinError 2] The system cannot find the file specified: 'spam_6592'

Global permanent example:

Enable privileges for the current thread:

    >>> import win32api; from win32security import *
    >>> ImpersonateSelf(SecurityImpersonation)
    >>> da = TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES
    >>> ht = OpenThreadToken(win32api.GetCurrentThread(), da, False)
    >>> ps = [[0, SE_PRIVILEGE_ENABLED], [0, SE_PRIVILEGE_ENABLED]]
    >>> ps[0][0] = LookupPrivilegeValue(None, 'SeCreatePermanentPrivilege')
    >>> ps[1][0] = LookupPrivilegeValue(None, 'SeCreateGlobalPrivilege')
    >>> AdjustTokenPrivileges(ht, False, ps)
    ((16, 0),)

Create a global section object, and make it permanent:

    >>> name = rf'Global\spam_{os.getpid()}'
    >>> m = SharedMemory(name, True, 8192)
    >>> from win32con import DELETE
    >>> h = _winapi.OpenFileMapping(DELETE, False, name)
    >>> ntdll = ctypes.WinDLL('ntdll')
    >>> ntdll.NtMakePermanentObject(h)
    0
    >>> _winapi.CloseHandle(h)

A permanent object persists after the last handle is closed:

    >>> m.close()
    >>> m = SharedMemory(name) # This works now.

Make the section object temporary again:

    >>> h = _winapi.OpenFileMapping(DELETE, False, name)
    >>> ntdll.NtMakeTemporaryObject(h)
    0
    >>> _winapi.CloseHandle(h)
    >>> m.close()
    >>> try: SharedMemory(name)
    ... except OSError as e: print(e)
    ...
    [WinError 2] The system cannot find the file specified: 'Global\\spam_6592'
History
Date | User | Action | Args
2022-04-11 14:59:56 | admin | set | github: 91044
2022-03-02 11:33:21 | eryksun | set | messages: + msg414339
2022-03-02 11:25:22 | steve.dower | set | versions: + Python 3.9, Python 3.11; nosy: + docs@python; messages: + msg414338; assignee: docs@python; components: + Documentation
2022-03-02 10:55:02 | ronny-rentner | set | messages: + msg414335
2022-03-02 07:33:09 | eryksun | set | messages: + msg414331
2022-03-02 03:51:18 | ronny-rentner | set | messages: + msg414324
2022-03-01 13:03:37 | eryksun | set | nosy: + eryksun; messages: + msg414260
2022-03-01 12:17:47 | steve.dower | set | messages: + msg414259
2022-03-01 11:39:08 | ronny-rentner | create |