Classification
Title: tempfile mkstemp() leaks file descriptors if os.close() is not called
Type: resource usage
Stage: patch review
Components: Library (Lib)
Versions: Python 3.8

Process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: StillSubjectToChange, andrei.avk, mieczyslaw.torchala, thatiparthy
Priority: normal
Keywords: patch

Created on 2021-01-05 11:28 by mieczyslaw.torchala, last changed 2021-07-09 20:49 by andrei.avk.

Pull Requests
URL Status Linked
PR 27075 open thatiparthy, 2021-07-09 05:12
Messages (8)
msg384384 - Author: Mieczysław Torchała (mieczyslaw.torchala) Date: 2021-01-05 11:28
tempfile mkstemp() documentation says: "Unlike TemporaryFile(), the user of mkstemp() is responsible for deleting the temporary file when done with it."

mkstemp() returns a tuple:

file_descriptor, file_path = mkstemp()

Calling only 

os.unlink(file_path) 

removes the file but leaks the file descriptor; once the number of temporary files created exceeds `ulimit -n`, the process crashes (watch /proc/$pid/fd in real time until the crash).

The solution I found is to also close the descriptor:

os.close(file_descriptor)

but the documentation doesn't mention this (i.e. releasing the file descriptor in addition to removing the temporary file).

For many users this doesn't matter: they create only a few files, and the leaked file descriptors are released when the process exits.

However, when many files are created during execution, the process will eventually crash (unless `ulimit -n` is set very high).

If this is not a bug, the documentation should at least mention that both the temporary file must be removed and the file descriptor released. However, that means calling two functions when only one was used to create the temporary file. Therefore, perhaps the solution is to add a function to the tempfile module that fully removes a file without leaking its descriptor.
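In other words, full cleanup today takes both calls (a minimal sketch):

```python
import os
import tempfile

# mkstemp() returns an already-open OS-level descriptor plus a path;
# each must be released separately.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"some data")
finally:
    os.close(fd)     # release the descriptor (otherwise it leaks)
    os.unlink(path)  # remove the file from the filesystem
```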
msg385850 - Author: Isaac Young (StillSubjectToChange) Date: 2021-01-28 11:03
Perhaps the documentation should be more explicit, but I wouldn't call this a bug. Both mkstemp and mkdtemp are low-level functions intended to offer exactly this kind of flexibility.

os.unlink, and its equivalent os.remove, follow POSIX semantics: they delete the name from the filesystem, but the underlying file persists as long as open file descriptors reference it. So calling os.close(file_descriptor) is exactly how you are expected to use this API.
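On POSIX systems this is easy to see directly (a small sketch): the data stays readable through the open descriptor even after the name has been unlinked:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"still here")
os.unlink(path)               # the name is gone from the filesystem...
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 10)        # ...but the data is still readable
os.close(fd)                  # only now is the file truly released
```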

Is there any reason you don't want to use [Named]TemporaryFile? They are high-level interfaces that handle the cleanup.
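For example, the high-level wrapper handles both steps in one go:

```python
import os
import tempfile

# NamedTemporaryFile closes the descriptor and (by default) deletes
# the file when the context manager exits.
with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"some data")
    tmp.flush()
    name = tmp.name
# Here both the descriptor and the file are gone.
```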
msg385898 - Author: Mieczysław Torchała (mieczyslaw.torchala) Date: 2021-01-29 07:54
Yes, it would be valuable to state this explicitly in the docs (that cleanup means not only removing the file with unlink/remove, with a warning about running out of file descriptors), and/or to provide a remove function in tempfile that both unlinks and closes.
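Such a helper might look like this (hypothetical name `remove_tempfile`; it is not part of the tempfile module):

```python
import os

def remove_tempfile(fd, path):
    """Fully dispose of a file created by tempfile.mkstemp():
    close the descriptor, then unlink the path.
    (Hypothetical helper; not part of the stdlib.)
    """
    try:
        os.close(fd)
    finally:
        os.unlink(path)
```

Usage would mirror mkstemp itself: `fd, path = tempfile.mkstemp()` followed by `remove_tempfile(fd, path)`.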

The use case is that when the process crashes during some scientific computations, I still want to keep the file to investigate. Of course, one can argue that try/except could copy the file somewhere else, but having it still in /tmp (and pointing to this file in the log) is pretty useful.
msg397180 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-09 02:02
It seems reasonable to add a paragraph like:

    If a large number of file descriptors are created, your program may run into a per-process limit. You can avoid this issue by closing file descriptors with :func:`os.close`.

I also noticed that the `mkstemp` doc refers to 'file handles' but links to `os.open`, which uses 'file descriptors'. It may be good if, along with this change, the existing text were updated to say 'file descriptors'.
msg397185 - Author: Srinivas Reddy Thatiparthy (శ్రీనివాస్ రెడ్డి తాటిపర్తి) (thatiparthy) * Date: 2021-07-09 05:13
I have raised a docs PR.
msg397209 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-09 19:13
Can someone with access to a Linux system confirm that running out of FDs raises an OSError?

Something like this:

import resource, tempfile
soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
for _ in range(soft + 5):
    tempfile.mkstemp()
msg397218 - Author: Mieczysław Torchała (mieczyslaw.torchala) Date: 2021-07-09 20:42
Hi Andrei! 

In my case it was:

OSError: [Errno 24] Too many open files
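That errno is EMFILE. A self-contained reproduction (POSIX-only sketch; it temporarily lowers the soft descriptor limit so the demo stays fast, then restores it and closes the leaked descriptors):

```python
import errno
import os
import resource
import tempfile

# Lower the soft limit so exhaustion happens after a few dozen files.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (64, hard))

fds = []
try:
    while True:
        fd, path = tempfile.mkstemp()
        fds.append(fd)
        os.unlink(path)   # removes the name, but the descriptor leaks
except OSError as e:
    caught = e.errno      # expected: errno.EMFILE (Errno 24)
finally:
    for f in fds:
        os.close(f)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```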
msg397221 - Author: Andrei Kulakov (andrei.avk) * (Python triager) Date: 2021-07-09 20:49
Mieczysław: perfect, thanks!
History
Date User Action Args
2021-07-09 20:49:19 | andrei.avk | set | messages: + msg397221
2021-07-09 20:42:14 | mieczyslaw.torchala | set | messages: + msg397218
2021-07-09 19:13:09 | andrei.avk | set | messages: + msg397209
2021-07-09 05:13:47 | thatiparthy | set | messages: + msg397185
2021-07-09 05:12:52 | thatiparthy | set | keywords: + patch; nosy: + thatiparthy; pull_requests: + pull_request25625; stage: patch review
2021-07-09 02:02:18 | andrei.avk | set | nosy: + andrei.avk; messages: + msg397180
2021-01-29 07:54:41 | mieczyslaw.torchala | set | messages: + msg385898
2021-01-28 11:03:11 | StillSubjectToChange | set | nosy: + StillSubjectToChange; messages: + msg385850
2021-01-05 11:28:47 | mieczyslaw.torchala | create |