This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: wrong behavior with fork and mmap
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: btiplitz, r.david.murray, vstinner
Priority: normal Keywords:

Created on 2013-12-23 21:05 by btiplitz, last changed 2022-04-11 14:57 by admin.

Messages (5)
msg206871 - (view) Author: Brett Tiplitz (btiplitz) Date: 2013-12-23 21:05
This shows up when running the example from the mmap library documentation (with a slight modification; I did not handle all the changes for the 3.3 string handling, as the example as posted does not work with 3.x).

When looking at the subprocess, the spawned process will have all the mmap'd file descriptors open.  The spawned process has the responsibility of closing any FDs that are in use.  However, since the shared memory segment gets closed and the program has no knowledge of private FDs, the mmap's private FD becomes a leak in the FD table.  It seems Python should set the close-on-exec attribute on the dup'd FD that it maintains.  Examples of fixing this issue are found at http://stackoverflow.com/questions/1643304/how-to-set-close-on-exec-by-default
import mmap,os

# write a simple example file
with open("hello.txt", "wb") as f:
    f.write(bytes("Hello Python!\n", 'UTF-8'))

with open("hello.txt", "r+b") as f:
    # memory-map the file, size 0 means whole file
    os.system("/bin/ls -l /proc/"+str(os.getpid())+"/fd")

    mm = mmap.mmap(f.fileno(), 0)
    os.system("/bin/ls -l /proc/"+str(os.getpid())+"/fd")
    os.system("/bin/ls -l /proc/self/fd")

    # read content via standard file methods
    t1 = mm.readline()  # the original example printed this: "Hello Python!"
    # read content via slice notation
    t2 = mm[:5]  # the original example printed this: "Hello"
    # update content using slice notation;
    # note that new content must have same size
    mm[6:] = bytes(" world!\n", 'UTF-8')
    # ... and read again using standard file methods
    mm.seek(0)
    t3 = mm.readline()  # the original example printed this: "Hello  world!"
    # close the map
    mm.close()
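For reference, the close-on-exec attribute requested above can be set on an ordinary descriptor with fcntl. The following is a minimal sketch, not the fix itself: the helper names set_cloexec and is_cloexec are hypothetical, and the os.dup() call only stands in for the private descriptor that mmap keeps internally, which user code cannot reach.

```python
import fcntl
import os
import tempfile


def set_cloexec(fd):
    """Mark fd close-on-exec so it is not passed on to exec'd programs."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Return True if fd currently has the close-on-exec flag set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


with tempfile.TemporaryFile() as f:
    # Stand-in for the dup'd descriptor that mmap maintains privately.
    private_fd = os.dup(f.fileno())
    set_cloexec(private_fd)
    cloexec_after = is_cloexec(private_fd)
    os.close(private_fd)
```

This only works for descriptors the program knows about, which is exactly why the private mmap descriptor is a problem on 2.7.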
msg206872 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-12-23 21:13
It seems very likely that this is addressed by PEP 446.  Since that is not a behavior change that can be backported, I think this issue should probably be closed as out of date.
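As a small illustration of what PEP 446 changes, assuming Python 3.4 or later: newly created descriptors, including those returned by os.dup(), are non-inheritable by default. The variable names below are only for illustration.

```python
import os
import tempfile

with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    # Under PEP 446, a freshly opened file is not inheritable.
    new_file_inheritable = os.get_inheritable(fd)

    # os.dup() also returns a non-inheritable descriptor by default.
    dup_fd = os.dup(fd)
    dup_inheritable = os.get_inheritable(dup_fd)
    os.close(dup_fd)
```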
msg206876 - (view) Author: Brett Tiplitz (btiplitz) Date: 2013-12-23 21:37
Changing the code to 
    subprocess.call(["/bin/ls", "-l", "/proc/self/fd"])
and running this on Python 3.3 does show this as being resolved by the broader fix implemented in PEP 446.  It does seem bad that the os.system call remains in place with bad behavior as I know it's widely used.
msg206877 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-23 22:04
This issue is not specific to mmap. Many other functions and libraries may
use private inheritable file descriptors. Python 3.4 does not fix the issue
for third party libraries.

os.system() must be avoided; use subprocess.call() instead. It avoids a
useless shell process and closes all fds by default.

Is it a documentation issue?
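A rough way to check the "closes all fds by default" behavior on a POSIX system with Python 3.4+ is sketched below. The helper fd_visible_in_child and the probe script are hypothetical, and the inheritable descriptor is created with F_DUPFD purely to mimic a private inheritable fd such as the one a library might hold.

```python
import fcntl
import os
import subprocess
import sys

# Child-side probe: exit 0 if the given fd number is open, 1 otherwise.
PROBE = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def fd_visible_in_child(fd, close_fds=True):
    """Return True if a child started via subprocess.call can still see fd."""
    rc = subprocess.call([sys.executable, "-c", PROBE, str(fd)],
                         close_fds=close_fds)
    return rc == 0


r, w = os.pipe()
# F_DUPFD yields a copy numbered >= 100 without close-on-exec set,
# i.e. an inheritable descriptor.
leaky_fd = fcntl.fcntl(r, fcntl.F_DUPFD, 100)

seen_with_default = fd_visible_in_child(leaky_fd)             # close_fds=True
seen_without_close = fd_visible_in_child(leaky_fd, close_fds=False)

for fd in (leaky_fd, r, w):
    os.close(fd)
```

With the default close_fds=True the child should not see the descriptor; with close_fds=False an inheritable descriptor is passed through.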
msg206878 - (view) Author: Brett Tiplitz (btiplitz) Date: 2013-12-23 22:10
The man page currently says the following (it does not say that os.system is deprecated or that files have to be closed on exec), so I'd think some more comments would help. And as mentioned, while a user can close their own FDs, the mmap call creates a special problem, since the user can't work around the issue cleanly, even though it is fixed in the subprocess calls.


os.system(command)

    Execute the command (a string) in a subshell. This is implemented by calling the Standard C function system(), and has the same limitations. Changes to sys.stdin, etc. are not reflected in the environment of the executed command.

    On Unix, the return value is the exit status of the process encoded in the format specified for wait(). Note that POSIX does not specify the meaning of the return value of the C system() function, so the return value of the Python function is system-dependent.

    On Windows, the return value is that returned by the system shell after running command, given by the Windows environment variable COMSPEC: on command.com systems (Windows 95, 98 and ME) this is always 0; on cmd.exe systems (Windows NT, 2000 and XP) this is the exit status of the command run; on systems using a non-native shell, consult your shell documentation.

    The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes.

    Availability: Unix, Windows.
History
Date                 User            Action  Args
2022-04-11 14:57:55  admin           set     github: 64256
2013-12-23 22:10:48  btiplitz        set     messages: + msg206878
2013-12-23 22:04:15  vstinner        set     messages: + msg206877
2013-12-23 21:37:33  btiplitz        set     messages: + msg206876
2013-12-23 21:13:23  r.david.murray  set     nosy: + r.david.murray, vstinner; messages: + msg206872
2013-12-23 21:05:18  btiplitz        create