classification
Title: Python 3.4: tempfile.close attribute does not work
Type: behavior Stage: needs patch
Components: Library (Lib) Versions: Python 3.6, Python 3.4, Python 3.5
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: Eduardo.Seabra, berker.peksag, eryksun, georg.brandl, ncoghlan, pitrou, r.david.murray, serhiy.storchaka, socketpair, vstinner
Priority: normal Keywords: easy, patch

Created on 2014-05-26 05:47 by socketpair, last changed 2015-12-31 06:11 by eryksun. This issue is now closed.

Files
File name Uploaded Description Edit
issue21579.patch Eduardo.Seabra, 2014-06-17 00:10 review
test_tempfile.py socketpair, 2015-11-16 12:56 example of real usage
Messages (21)
msg219132 - (view) Author: Марк Коренберг (socketpair) * Date: 2014-05-26 05:47
Suppose code:
=====================================
import os
import tempfile

want_to_replace = 'zxc.dat'
tmpdir=os.path.dirname(os.path.realpath(want_to_replace))
with tempfile.NamedTemporaryFile(dir=tmpdir) as fff:
    # do somewhat with fff here... and then:
    fff.flush()
    os.fdatasync(fff)
    os.rename(fff.name, want_to_replace)
    fff.delete = False
=====================================
In python 3.3 and lower that works FINE. In Python 3.4 the fff._closer attribute was introduced, so fff.close=False stopped to work. I think this is major loss of functionality. The "close" attribute was not marked as private, so may be used in past.
msg219137 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-05-26 06:16
See issue 18879 for more information about the change.
msg219146 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-05-26 08:46
In any case this trick didn't work on Windows. And it can't work on Linux too when use new O_TMPFILE flag (issue21515).
msg219147 - (view) Author: Марк Коренберг (socketpair) * Date: 2014-05-26 08:57
Yes, but O_TMPFILE should be set ONLY  when used with TemporaryFile, not with NamedTemporaryFile. My problem refer only NamedTemporaryFile.
msg219269 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-05-28 14:19
OK, O_TMPFILE is not related. But Windows is related.
msg219273 - (view) Author: Марк Коренберг (socketpair) * Date: 2014-05-28 15:12
So, maybe API change? like delete=True|False|Maybe ? don't think that this is good decisions.

My approach is based on ext4 behaviour about delayed allocation and atomic file replacements.
In my case, either old file (with contents) or new file appear.
In any case, corrupted data cannot appear.
This is required behaviour for my application.


https://www.kernel.org/doc/Documentation/filesystems/ext4.txt
http://lwn.net/Articles/322823/

=======================================================================
auto_da_alloc(*)	Many broken applications don't use fsync() when 
noauto_da_alloc		replacing existing files via patterns such as
			fd = open("foo.new")/write(fd,..)/close(fd)/
			rename("foo.new", "foo"), or worse yet,
			fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
			If auto_da_alloc is enabled, ext4 will detect
			the replace-via-rename and replace-via-truncate
			patterns and force that any delayed allocation
			blocks are allocated such that at the next
			journal commit, in the default data=ordered
			mode, the data blocks of the new file are forced
			to disk before the rename() operation is
			committed.  This provides roughly the same level
			of guarantees as ext3, and avoids the
			"zero-length" problem that can happen when a
			system crashes before the delayed allocation
			blocks are forced to disk.
=======================================================================

So, this is essential functionality.
msg219274 - (view) Author: Марк Коренберг (socketpair) * Date: 2014-05-28 15:15
why not to make tmpfileobj.delete as property that really sets self._closer.delete ? this will fix my problem.
msg219289 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-05-28 16:22
Yes, it would be easy to restore this behavior on Posix.
msg220783 - (view) Author: Eduardo Seabra (Eduardo.Seabra) * Date: 2014-06-17 00:10
I've attached a patch with @mmarkk proposal.
msg230738 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-06 13:15
If it is possible to cancel the effect of the O_TEMPORARY flag, we can use it to implement this feature on all platforms. But if it is not possible, we have several options:

1. Just close this issue and do nothing more. This was undocumented and non-portable feature.

2. Implement delete as readonly property. This is similar to 1 but makes the failure loud.

3. Implement this feature on non-Windows, document that it is non-Windows only feature, and raise an exception in delete setter on Windows.

4. Same as 3, but emit deprecation warning in delete setter on non-Windows. In future this should be replaced with solution 2.
msg230740 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-06 13:42
See also issue14243.
msg254731 - (view) Author: Марк Коренберг (socketpair) * Date: 2015-11-16 12:56
Please merge patch, proposed by Eduardo Seabra (Eduardo.Seabra). And document that this is UNIX-only solution.
msg255509 - (view) Author: Марк Коренберг (socketpair) * Date: 2015-11-27 23:23
No activity last week ? Why not to merge ?
msg256771 - (view) Author: Марк Коренберг (socketpair) * Date: 2015-12-20 16:17
ping. THis is essential in our software, so we use private field in order to work-around this bug.
msg256775 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-12-20 17:10
I think it is a reasonable backward-compatibility fix to make it work as badly as it previously did ;).  I would not document it, and I do not consider it a high priority (as Serhiy said, this usage reached into object internals and thus was dangerous from the start).

Unless I'm missing something, there is no need to add any new features to support your use case, Марк, or for us to apply a backward compatibility fix for you to get your code working agian.  Your use case is already supported:

    fd, name = tempfile.mkstemp(...)
    with io.open(fd, ...) as f:
        ...

This has the added advantage that it will work on all python versions.  You can easy wrap it into a function for use in your application if you don't like needing two lines.

To extend support for this to Windows, we can add a feature to mkstmp to not use O_TEMPORARY.  Also, it would probably be worth adding the above as an example to the docs.
msg256805 - (view) Author: Марк Коренберг (socketpair) * Date: 2015-12-21 19:13
So, I'm required to re-implement NamedTemporaryFile by hand like that:


    fd, name = tempfile.mkstemp(...)
    try:
        with io.open(fd, ...) as fff:
            fff.write('hello')
            fff.flush()
            os.fdatasync(fff)
            os.rename(name, ...)
    except:
        os.unlink(name)
        raise


This is sad to see...
msg256806 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-12-21 20:14
You'd have to do that anyway if we implemented a delete=False constructor argument, since you want it deleted if there are any errors, and that's not what a delete=False API would do.  

If it were me, I'd write it (untested)

  @contextlib.contextmanager
  def open_for_atomic_replace(fn):
    try:
        fd, name = tempfile.mkstemp()
        with io.open(fd) as fff:
            yield fff
        fff.flush()
        os.fdatasync(fff)
        os.rename(name, fn)
    except BaseException:
        os.unlink(name)
        raise

which would make your code simpler than it is now.

Naming it 'open_for_atomic_replace' reminded me of issue 8604, which is what you really want, not delete=False.
msg256816 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-12-21 23:54
> To extend support for this to Windows, we can add a 
> feature to mkstmp to not use O_TEMPORARY

O_TEMPORARY is only used for NamedTemporaryFile, not mkstemp. 

Regarding NamedTemporaryFile, that can be worked around to keep Windows from deleting the file. Just open a second handle with DELETE (0x00010000) access before closing the first one. Then close the first handle. Finally, use the second handle to call SetFileInformationByHandle [1] to remove the delete disposition. 

Since using O_TEMPORARY opens the file with DELETE access, you have to open the new handle with delete sharing. Unfortunately the CRT only provides O_TEMPORARY for opening a file with delete access and delete sharing, so just call CreateFile [2] directly instead. For example:

    >>> import os, tempfile, msvcrt, ctypes
    >>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    >>> f1 = tempfile.NamedTemporaryFile()
    >>> f1.write(b'spam')
    4
    >>> h = kernel32.CreateFileW(f1.name, 0xC0010000, 7,
    ...                          None, 3, 0x80, None)
    >>> f1.close()
    >>> os.path.exists(f1.name) # really, it does exist
    False
    >>> info = (ctypes.c_int * 1)(0)
    >>> kernel32.SetFileInformationByHandle(h, 4, info, 4)
    1
    >>> os.path.exists(f1.name)
    True
    >>> f2 = os.fdopen(msvcrt.open_osfhandle(h, os.O_RDWR), 'r+')
    >>> f2.read()
    'spam'

Notice that right after f1 is closed, os.path.exists(f1.name) lies, saying the file no longer exists. The problem is you can't stat the file at this point because you can't open a *new* handle to a file that's marked for deletion. But you can use the existing handle to remove the delete disposition, after which the file is resurrected.

Also note that the FILE_DISPOSITION_INFO [3] docs say "t]his member has no effect if the handle was opened with FILE_FLAG_DELETE_ON_CLOSE". That's referring literally to the handle passed to SetFileInformationByHandle. 

Microsoft publishes the source for the FAT filesystem, so I could provide links for how this is implemented. Basically every kernel file object has a context control block (CCB) that points at an underlying kernel filesystem structure such as a file control block (FCB) or NTFS link/stream control block (LCB / SCB). When the file object is closed, the delete-on-close flag in the CCB gets transferred over to the underlying control structure and the delete disposition is set. You can't remove the CCB flag from the kernel file object (as the documentation correctly notes), but as long as you have an open handle to a file object whose CCB does not have this flag, you can use this other file handle to remove the delete disposition and make the file permanent. Just make sure to first close all file objects that do have the CCB delete-on-close flag because otherwise they'll just set the delete disposition again when closed.

[1]: https://msdn.microsoft.com/en-us/library/aa365539
[2]: https://msdn.microsoft.com/en-us/library/aa363858
[3]: https://msdn.microsoft.com/en-us/library/aa364221
msg256818 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-12-22 00:20
Oh, you are right of course.  I thought I was looking at _mkstemp_inner but in fact my edit window was over NamedTemporaryFile...I just wasn't paying attention.

I have no opinion myself as to whether it is worth the effort/code complexity to implement this behavior cross platform.
msg256853 - (view) Author: Марк Коренберг (socketpair) * Date: 2015-12-22 18:52
David Murray, in your code, if temporary file cannot be created, it attempts to unlink unknown name :).

But, I already have almost the same solution in production. I mention it sources in issue8604.
msg257249 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-12-31 06:11
For anyone interested, this issue is solvable on Windows by working around how O_TEMPORARY is implemented. To do this the _winapi module would need a wrapper for SetFileInformationByHandle (available in Vista+), which would need to support at least FileDispositionInfo. 

That said, the use case of replacing a file is better supported by calling the Windows API function ReplaceFile anyway. It isn't atomic, however. I don't think that's possible in general on Windows. The problem is that, unlike on Unix, an open file on Windows can't be anonymous. So if the replaced file is already open, it has to first be renamed before the replacement file can take its place. That creates a small window for things to go wrong.
History
Date User Action Args
2015-12-31 06:11:50eryksunsetmessages: + msg257249
2015-12-30 17:53:49socketpairsetstatus: open -> closed
resolution: wont fix
2015-12-22 18:52:08socketpairsetmessages: + msg256853
2015-12-22 00:20:09r.david.murraysetmessages: + msg256818
2015-12-21 23:54:18eryksunsetnosy: + eryksun
messages: + msg256816
2015-12-21 20:14:01r.david.murraysetmessages: + msg256806
2015-12-21 19:13:10socketpairsetmessages: + msg256805
2015-12-20 17:10:50r.david.murraysetnosy: + r.david.murray
messages: + msg256775
2015-12-20 16:17:53socketpairsetmessages: + msg256771
2015-11-27 23:23:27socketpairsetmessages: + msg255509
2015-11-16 12:56:36socketpairsetfiles: + test_tempfile.py

messages: + msg254731
versions: + Python 3.5, Python 3.6
2014-11-06 13:42:22serhiy.storchakasetmessages: + msg230740
2014-11-06 13:15:25serhiy.storchakasetmessages: + msg230738
2014-06-17 00:10:46Eduardo.Seabrasetfiles: + issue21579.patch

nosy: + Eduardo.Seabra
messages: + msg220783

keywords: + patch
2014-05-28 16:22:06serhiy.storchakasetkeywords: + easy

messages: + msg219289
stage: needs patch
2014-05-28 15:15:09socketpairsetmessages: + msg219274
2014-05-28 15:12:49socketpairsetmessages: + msg219273
2014-05-28 14:19:32serhiy.storchakasetmessages: + msg219269
2014-05-26 08:57:46socketpairsetmessages: + msg219147
2014-05-26 08:47:12serhiy.storchakasetnosy: + vstinner
2014-05-26 08:46:22serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg219146
2014-05-26 08:22:40pitrousetnosy: + georg.brandl, ncoghlan
2014-05-26 06:16:01berker.peksagsetnosy: + berker.peksag, pitrou
messages: + msg219137
2014-05-26 05:47:59socketpaircreate