Issue14243
Created on 2012-03-10 02:14 by dabrahams, last changed 2020-12-14 09:43 by ev2geny.
Files
File name | Uploaded
---|---
ntempfile.py | dlenski, 2012-06-30 05:45
share.py | sbt, 2012-07-02 13:34
Pull Requests
URL | Status | Linked
---|---|---
PR 22431 | open | ev2geny, 2020-12-14 09:43
Messages (57)
msg155278 | Author: Dave Abrahams (dabrahams) | Date: 2012-03-10 02:14 |
NamedTemporaryFile is too hard to use portably when you need to open the file by name after writing it. To do that, you need to close the file first (on Windows), which means you have to pass delete=False, which in turn means that you get no help in cleaning up the actual file resource, which as you can see from the code in tempfile.py is devilishly hard to do correctly. The fact that it's different on posix (you can open the file for reading by name without closing it first) makes this problem worse. What we really need for this use-case is a way to say, "delete on __del__ but not on close()." |
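The workaround described in this report can be sketched as follows. This is a minimal illustration rather than code from the thread: the helper name `read_back` is hypothetical, and it simply pairs `delete=False` with an explicit `os.unlink()`, which is the boilerplate the reporter would like to avoid.

```python
import os
import tempfile


def read_back(data: bytes) -> bytes:
    """Write data to a named temporary file, then reopen it by name."""
    # delete=False is needed so that close() does not remove the file.
    f = tempfile.NamedTemporaryFile(delete=False)
    try:
        f.write(data)
        f.close()  # required on Windows before the file can be reopened by name
        with open(f.name, "rb") as reader:
            return reader.read()
    finally:
        f.close()          # harmless if the file is already closed
        os.unlink(f.name)  # cleanup is now the caller's responsibility
```

With `delete=True` the same code would fail at the `open()` call, because `close()` would already have removed the file.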
|||
msg155309 | Author: Antoine Pitrou (pitrou) | Date: 2012-03-10 13:44 |
This is quite silly indeed, and is due to the use of O_TEMPORARY in the file creation flags. |
|||
msg155316 | Author: Nick Coghlan (ncoghlan) | Date: 2012-03-10 15:03 |
What's the proposal here? If delete is True, close() must delete the file. It is not acceptable for close() and __del__() to behave differently. OTOH, if the proposal is merely to change the way the file is opened on Windows so that it can be opened again without closing it first, that sounds fine. |
|||
msg155317 | Author: Antoine Pitrou (pitrou) | Date: 2012-03-10 15:17 |
> OTOH, if the proposal is merely to change the way the file is opened
> on Windows so that it can be opened again without closing it first,
> that sounds fine.

That would be my proposal. It probably needs getting rid of O_TEMPORARY, exposing CreateFile and _open_osfhandle, and using the FILE_SHARE_DELETE open mode. |
|||
msg155333 | Author: Dave Abrahams (dabrahams) | Date: 2012-03-10 18:20 |
I disagree that it's unacceptable for close() and __del__() to behave differently. The acceptable difference would be that __del__() closes (if necessary) /and/ deletes the file on disk, while close() merely closes the file. If you can in fact "change the way the file is opened on Windows so that it can be opened again without closing it first," that would be fine with me. It isn't clear to me that Windows supports that option, but I'm not an expert. Another possibility, of course, is something like what's implemented in: https://github.com/dabrahams/zeroinstall/commit/d76de038ef51bd1dae36280f8743e06c7154b44a#L3R44 (an optional argument to close() that prevents deletion). |
|||
msg155365 | Author: Nick Coghlan (ncoghlan) | Date: 2012-03-11 02:21 |
The whole point of close() methods is to offer deterministic resource management to applications that need it. Pointing out to applications when they're relying on CPython's refcounting for prompt resource cleanup is why many of the standard types now trigger ResourceWarning for any application that relies on the GC to clean up such external resources in __del__. So, no, we're not going to back away from the explicit guarantee in the NamedTemporaryFile docs: "If delete is true (the default), the file is deleted as soon as it is closed." (Especially since doing so would also breach backward compatibility guarantees) However, you're right that the exclusive read lock in the current implementation makes the default behaviour of NamedTemporaryFile significantly less useful on Windows than it is on POSIX systems, so the implementation should be changed to behave more like the POSIX variant. |
|||
msg155374 | Author: Dave Abrahams (dabrahams) | Date: 2012-03-11 03:30 |
If file.close() "offers deterministic resource management," then you have to consider the file's open/closed state to be a resource separate from its existence. A NamedTemporaryFile whose close() deterministically managed the open/closed state but not the existence of the file would be consistent with file. That said, I understand the move toward deprecating (in the informal sense) cleanups that rely on GC. I'm not suggesting breaking backward compatibility, either. I'm suggesting that it might make sense to allow an explicit close-without-delete as an /extension/ of the current interface. Given the move away from GC-cleanups, you'd probably want an explicit unlink() method as well in that case. |
|||
msg155375 | Author: Nick Coghlan (ncoghlan) | Date: 2012-03-11 03:45 |
Dave, decoupling the lifecycle of the created file from the object that created it is exactly what delete=False already covers. The complicated dance in NamedTemporaryFile is only to make *__del__* work a bit more reliably during process shutdown (due to some messy internal problems with what CPython is doing at that point). If you're doing deterministic cleanup (even via atexit), you don't need any of that - you can just use os.unlink(). |
|||
msg155457 | Author: Dave Abrahams (dabrahams) | Date: 2012-03-12 17:59 |
Nick, not to belabor this, but I guess you don't understand the use-case in question very well, or you'd see that delete=False doesn't cover it. The use case is this: I have to write a test for a function that takes a filename as a parameter and opens and reads from the file with that name. The test should conjure up an appropriate file, call the function, check the results, and clean up the file afterwards. It doesn't matter when the file gets cleaned up, as long as it is cleaned up "eventually." Having to explicitly delete the file is exactly the kind of boilerplate one wants to avoid in situations like this. Even if Windows allows a file to be opened for reading (in some circumstances) when it is already open for writing, it isn't hard to imagine that Python might someday have to support an OS that didn't allow it under any circumstances. It is also a bit perverse to have to keep the file open for writing after you're definitively done writing it, just to prevent it from being deleted prematurely. I can understand most of the arguments you make against close-without-delete, except those (like the above) that seem to come from a "you shouldn't want that; it's just wrong" stance. |
|||
msg157639 | Author: R. David Murray (r.david.murray) | Date: 2012-04-06 02:55 |
See issue 14514 for an alternate proposal to solve this. I did search before I opened that issue, but search is currently somewhat broken and I did not find this issue. I'm not marking it as a dup because my proposal is really a new feature. |
|||
msg157925 | Author: Nick Coghlan (ncoghlan) | Date: 2012-04-10 01:29 |
I agree we need to add something here to better support the idiom where the "close" and "delete" operations on a NamedTemporaryFile are decoupled without the delete becoming a completely independent call to os.unlink().

I agree with RDM's proposal in issue 14514 that the replacement should be "delete on __exit__ but not on close". As with generator context managers, I'd also add in the "last ditch" cleanup behaviour in __del__.

Converting the issue to a feature request for 3.3 - there's no bug here, just an interaction with Windows that makes the existing behavioural options inconvenient.

After all, you can currently get deterministic cleanup (with a __del__ fallback) via:

    @contextmanager
    def named_temp(name):
        f = NamedTemporaryFile(name, delete=False)
        try:
            yield f
        finally:
            try:
                os.unlink(name)
            except OSError:
                pass

You need to be careful to make sure you keep the CM alive (or it will delete the file behind your back), but the idiom RDM described in the other issues handles that for you:

    with named_temp(fname) as f:
        data = "Data\n"
        f.write(data)
        f.close()  # Windows compatibility
        with open(fname) as f:
            self.assertEqual(f.read(), data)

As far as the API goes, I'm inclined to make a CM with the above behaviour available as a new class method on NamedTemporaryFile:

    with NamedTemporaryFile.delete_after(fname) as f:
        # As per the workaround
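A runnable variant of the `named_temp` workaround, offered as a sketch rather than as the code Nick posted. It assumes one change: `NamedTemporaryFile` takes `mode` as its first positional parameter, not a file name, so this version lets `tempfile` pick the name and reads it back from `f.name`. The `mode` parameter on `named_temp` is an addition for illustration.

```python
import os
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def named_temp(mode="w+"):
    # delete=False: close() leaves the file in place; it is unlinked on exit.
    f = NamedTemporaryFile(mode, delete=False)
    try:
        yield f
    finally:
        f.close()
        try:
            os.unlink(f.name)
        except OSError:
            pass  # already removed, or still held open elsewhere


data = "Data\n"
with named_temp() as f:
    f.write(data)
    f.close()  # Windows compatibility
    with open(f.name) as reader:
        assert reader.read() == data
    path = f.name
```

Once the `with` block exits, the file at `path` has been removed regardless of whether the body closed it.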
|||
msg157927 | Author: Nick Coghlan (ncoghlan) | Date: 2012-04-10 01:33 |
Although, for the stdlib version, I wouldn't suppress the OSError (I'd follow what we currently do for TemporaryDirectory) |
|||
msg157946 | Author: R. David Murray (r.david.murray) | Date: 2012-04-10 12:18 |
"delete_after" what? I know it is somewhat implicit in the fact that it is a context manager call, but that is not the only context the method name will be seen in. (eg: 'dir' list of methods, doc index, etc). Even as a context manager my first thought in reading it was "delete after what?", and then I went, "oh, right". How about "delete_on_exit"? |
|||
msg157947 | Author: R. David Murray (r.david.murray) | Date: 2012-04-10 12:29 |
By the way, I still think it would be nicer just to have the context manager work as expected with delete=True (ie: doesn't delete until the end of the context manager, whether the file is closed or not). I'm OK with being voted down on that, though. |
|||
msg157948 | Author: Antoine Pitrou (pitrou) | Date: 2012-04-10 12:31 |
> By the way, I still think it would be nicer just to have the context
> manager work as expected with delete=True (ie: doesn't delete until
> the end of the context manager, whether the file is closed or not).
> I'm OK with being voted down on that, though.

Indeed, the current behaviour under Windows seems to be kind of a nuisance, and having to call a separate method doesn't sound very user-friendly. |
|||
msg157949 | Author: Jason R. Coombs (jaraco) | Date: 2012-04-10 13:08 |
I agree. If the primary usage of the class does not work well on Windows, developers will continue to write code using the primary usage because it works on their unix system, and it will continue to cause failures when run on windows. Because Python should run cross-platform, I consider this a bug in the implementation and would prefer it be adapted such that the primary use case works well on all major platforms. If there is a separate class method for different behavior, it should be for the specialized behavior, not for the preferred, portable behavior. I recognize there are backward-compatibility issues here, so maybe it's necessary to deprecate NamedTemporaryFile in favor of a replacement. |
|||
msg157952 | Author: R. David Murray (r.david.murray) | Date: 2012-04-10 13:42 |
Well, fixing NamedTemporaryFile in either of the ways we've discussed isn't going to fix people writing non-portable code. A unix coder isn't necessarily going to close the file before reading it. However, it would at least significantly increase the odds that the code would be portable, while the current situation *ensures* that the code is not portable. |
|||
msg164358 | Author: Tim Golden (tim.golden) | Date: 2012-06-29 22:01 |
Daniel. If you have any interest in this issue, would you mind summarising the state of affairs, please? I have no direct interest in the result but I'm happy to commit a patch or even to work one up if someone can come up with a single, concrete suggestion. |
|||
msg164369 | Author: Daniel Lenski (dlenski) | Date: 2012-06-30 05:45 |
Tim Golden,

My preferred solution would be to replace the binary delete argument of the current NamedTemporaryFile implementation with finer-grained options:

    delete=False                       # don't delete
    delete=True                        # delete after file closed, current behavior
    delete=AFTER_CLOSE                 # delete after file closed
    delete=AFTER_CM_EXIT               # delete after context manager exits
    delete=AFTER_CM_EXIT_NO_EXCEPTION  # delete after CM exit, unless this is due to an exception

I have implemented a Windows-friendly solution to the latter case using Nick Coghlan's code. My version does not delete the file until the context manager exits, and *if* the context manager exits due to an exception it leaves the file in place and reports its location, to aid me in debugging.
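The last option in that list (delete on context-manager exit unless an exception occurred) might look roughly like the sketch below. This is not the attached ntempfile.py, whose contents are not shown in the thread; the name `temp_kept_on_error` and the choice to report the path on stderr are assumptions made for illustration.

```python
import os
import sys
from contextlib import contextmanager
from tempfile import NamedTemporaryFile


@contextmanager
def temp_kept_on_error(mode="w+b"):
    # delete=False so that closing the file never removes it.
    f = NamedTemporaryFile(mode, delete=False)
    try:
        yield f
    except BaseException:
        # The body failed: keep the file and report where it is.
        f.close()
        print("temporary file kept at", f.name, file=sys.stderr)
        raise
    else:
        # Clean exit: close (if still open) and remove the file.
        f.close()
        os.unlink(f.name)
```

Because the deletion policy is fixed when the file is created, the code inside the `with` block can close and reopen the file freely on any platform.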
|||
msg164375 | Author: Davide Rizzo (davide.rizzo) | Date: 2012-06-30 08:46 |
Daniel, Nick, shouldn't the context manager yield f within a with block? |
|||
msg164392 | Author: Richard Oudkerk (sbt) | Date: 2012-06-30 16:07 |
Rather than add a NamedTemporaryFile.delete_after() classmethod, would it not be simpler to just add a close_without_unlink() method to NamedTemporaryFile?

    with NamedTemporaryFile() as f:
        <write to f>
        f.close_without_unlink()
        with open(f.name, 'rb') as f:
            <read from f>
|||
msg164433 | Author: Daniel Lenski (dlenski) | Date: 2012-06-30 23:01 |
Davide, the @contextlib.contextmanager decorator effectively wraps the yield statement in the necessary glue so that everything prior to the yield statement occurs in the __enter__() method of the contextmanager, while everything subsequent occurs in the __exit__() method. |
|||
msg164487 | Author: Daniel Lenski (dlenski) | Date: 2012-07-01 17:19 |
Richard, I think the problem with this is that it spreads the non-portable or OS-dependent parts of the code over several places rather than concentrating them all in one place. After close_without_unlink(), what would happen when the context manager exits or when the object is garbage collected? Would it then get unlinked? My preference would be to specify the behavior of close/__exit__/GC operations at the time of the NamedTemporaryFile creation, so that the rest of the code can be left unchanged. |
|||
msg164495 | Author: Richard Oudkerk (sbt) | Date: 2012-07-01 20:30 |
The webpage http://msdn.microsoft.com/en-us/library/aa273350(v=vs.60).aspx describes the sopen() function which is like open() but has an extra shflag parameter for specifying the sharing allowed. If sopen() and the associated constants SH_DENYRD, SH_DENYWR, SH_DENYRW and SH_DENYNO were exposed in the os module, then maybe tempfile could use os.sopen() on Windows instead of os.open() to allow the file to be reopened without closing. |
|||
msg164496 | Author: Antoine Pitrou (pitrou) | Date: 2012-07-01 20:37 |
> If sopen() and the associated constants SH_DENYRD, SH_DENYWR, SH_DENYRW
> and SH_DENYNO were exposed in the os module, then maybe tempfile could
> use os.sopen() on Windows instead of os.open() to allow the file to be
> reopened without closing.

Sounds like a good way forward. |
|||
msg164497 | Author: Tim Golden (tim.golden) | Date: 2012-07-01 20:38 |
On 01/07/2012 21:37, Antoine Pitrou wrote:
>> If sopen() and the associated constants SH_DENYRD, SH_DENYWR, SH_DENYRW
>> and SH_DENYNO were exposed in the os module, then maybe tempfile could
>> use os.sopen() on Windows instead of os.open() to allow the file to be
>> reopened without closing.
>
> Sounds like a good way forward.

Agreed. Richard: do you have time to put something together? I'm happy to try if you don't. |
|||
msg164503 | Author: Richard Oudkerk (sbt) | Date: 2012-07-01 22:46 |
> Agreed. Richard: do you have time to put something together?
> I'm happy to try if you don't.

I'm looking into it.

Unfortunately, it seems that you need to use non-default flags when reopening a shared file. Eg, if the file is currently opened with SH_DENYNO and O_TEMPORARY, then you must reopen it using SH_DENYNO and O_TEMPORARY.

However, I have an initial implementation of os.sopen() which makes the following work:

    import os, tempfile

    FNAME = "foo.txt"
    DATA = "hello bob"

    def opener(name, flag, mode=0o777):
        return os.sopen(name, flag | os.O_TEMPORARY, os.SH_DENYNO, mode)

    with open(FNAME, "w", opener=opener) as f:
        f.write(DATA)
        f.flush()
        with open(FNAME, "r", opener=opener) as f:
            assert f.read() == DATA

    assert not os.path.exists(FNAME)

BTW, Maybe it would be better to add a keyword-only shareflag argument to os.open() rather than add os.sopen().
|||
msg164504 | Author: Richard Oudkerk (sbt) | Date: 2012-07-01 23:45 |
I checked the source in

    c:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/crt/src/open.c

and it seems that on Windows open() is more or less implemented as a wrapper of sopen(..., ..., SH_DENYNO, ...).

So the only reason that trying to reopen a NamedTemporaryFile fails on Windows is because when we reopen we need to use O_TEMPORARY.

The following works for unmodified python:

    import os, tempfile

    DATA = b"hello bob"

    def temp_opener(name, flag, mode=0o777):
        return os.open(name, flag | os.O_TEMPORARY, mode)

    with tempfile.NamedTemporaryFile() as f:
        f.write(DATA)
        f.flush()
        with open(f.name, "rb", opener=temp_opener) as f:
            assert f.read() == DATA

    assert not os.path.exists(f.name)

So maybe we should just define tempfile.opener().
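The snippet in this message uses `os.O_TEMPORARY`, which exists only on Windows, so it raises AttributeError elsewhere. A portable variant, sketched here under the assumption that a no-op fallback is acceptable on POSIX (where reopening works without any special flag), could look like this:

```python
import os
import tempfile

# os.O_TEMPORARY is Windows-only; fall back to 0 (no extra flag) elsewhere.
_O_TEMPORARY = getattr(os, "O_TEMPORARY", 0)


def temp_opener(name, flags, mode=0o777):
    """Opener for open() that adds O_TEMPORARY where the platform defines it."""
    return os.open(name, flags | _O_TEMPORARY, mode)


DATA = b"hello bob"
with tempfile.NamedTemporaryFile() as f:
    f.write(DATA)
    f.flush()  # make the bytes visible to the second handle
    with open(f.name, "rb", opener=temp_opener) as reader:
        result = reader.read()
    name = f.name
```

Here the reader gets its own variable name instead of rebinding `f`, which keeps the original file object reachable after the inner block.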
|||
msg164505 | Author: Nick Coghlan (ncoghlan) | Date: 2012-07-02 00:45 |
Alternatively, perhaps it would make sense to have a "reopen()" method on file objects that covers the necessary dance to reopen with the correct flags? That would solve more problems than just this one (possibly including making it possible to "reopen" StringIO and BytesIO objects). |
|||
msg164509 | Author: Tim Golden (tim.golden) | Date: 2012-07-02 08:53 |
On 30/06/2012 06:45, Daniel Lenski wrote:
> My preferred solution would be to replace the binary delete argument of the current NamedTemporaryFile implementation with finer-grained options:
> delete=False # don't delete
> delete=True # delete after file closed, current behavior
> delete=AFTER_CLOSE # delete after file closed
> delete=AFTER_CM_EXIT # delete after context manager exits
> delete=AFTER_CM_EXIT_NO_EXCEPTION # delete after CM exit, unless this is due to an exception

I'm aware that Richard & others are fleshing out alternatives. But having asked you to propose something, I wanted to come back on this particular suggestion.

I think it's just too complex an API. Not least because, on Windows, we're making use of a filesystem feature which will delete on closure regardless (so the implementation on Windows skips the context-based delete). I'm not sure what we'll end up with but I'm more inclined towards the sort of method-based closer/reopener which is more explicit. |
|||
msg164514 | Author: Richard Oudkerk (sbt) | Date: 2012-07-02 13:34 |
I wrote in an earlier message that a file opened with O_TEMPORARY must be reopened with O_TEMPORARY. This is not quite accurate.

Using O_TEMPORARY causes the FILE_SHARE_DELETE sharing mode to be used, and a file currently opened with FILE_SHARE_DELETE can only be reopened with FILE_SHARE_DELETE.

Unfortunately using O_TEMPORARY is the only way allowed by msvcrt to get FILE_SHARE_DELETE, even though it also has the orthogonal effect of unlinking the file when all handles are closed.

The nice thing about FILE_SHARE_DELETE is that it gives Unix-like behaviour: the file can be renamed or deleted while you have an open handle, and you can still continue to use the handle.

Attached is a mostly untested attempt at writing replacements for open() and os.open() which use the FILE_SHARE_DELETE sharing mode. Among other things, these can be used for reopening temporary files.

Even if tempfile does not make use of this, I think something similar would be useful in the stdlib. |
|||
msg164617 | Author: Richard Oudkerk (sbt) | Date: 2012-07-03 19:22 |
I have opened Issue #15244 with a patch to add a share module to the stdlib. After monkey patching builtins.open(), io.open() and os.open() to be equivalents using FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, the regression test suite still runs successfully. |
|||
msg184053 | Author: Piotr Dobrogost (piotr.dobrogost) | Date: 2013-03-12 22:02 |
@sbt
> (...) and it seems that on Windows open() is more or less implemented
> as a wrapper of sopen(..., ..., SH_DENYNO, ...).
> So the only reason that trying to reopen a NamedTemporaryFile fails on
> Windows is because when we reopen we need to use O_TEMPORARY.

Could you elaborate on this? What's the relation between SH_DENYNO argument to sopen() and O_TEMPORARY flag? |
|||
msg184056 | Author: Richard Oudkerk (sbt) | Date: 2013-03-12 22:49 |
Sorry, I was not very clear.

If you use the O_TEMPORARY flag with open() to get a file handle, then the share mode used with the underlying CreateFile() function is

    X = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE

whereas, if you don't use O_TEMPORARY then the share mode is

    Y = FILE_SHARE_READ | FILE_SHARE_WRITE

While a handle is open with share mode X, you can only reopen the file if you also use share mode X. Therefore (using the msvcrt) you can only reopen it using O_TEMPORARY.*

* sopen() does give some extra control over the share mode, but you still can't use it to get share mode X without also using O_TEMPORARY. |
|||
msg184088 | Author: Piotr Dobrogost (piotr.dobrogost) | Date: 2013-03-13 15:40 |
@sbt Thanks for the info. Also, you mentioned looking at c:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/crt/src/open.c. In which version of Visual Studio/SDK is this file available? Also I'd like to point out that this problem came up at Stack Overflow in question "How to create a temporary file that can be read by a subprocess?" (http://stackoverflow.com/q/15169101/95735) |
|||
msg228371 | Author: Mark Lawrence (BreamoreBoy) | Date: 2014-10-03 20:36 |
This is one of several Windows file related issues, see also #12939, #21859, #15244 and possibly others. How can we take this forward? If it's of any use I can help with testing as I've a new Windows 8.1 HP laptop with 8G of memory. |
|||
msg244762 | Author: Carl Osterwisch (Carl Osterwisch) | Date: 2015-06-03 15:29 |
I need to have an external DLL open and read the named file but this current NamedTemporaryFile implementation prevents that from working on Windows. |
|||
msg288473 | Author: John Florian (John Florian) | Date: 2017-02-23 19:44 |
I just stumbled onto this though I'm not writing for Windows. Instead, I'm on Fedora 25 with Python 3.5.2 and I went nearly crazy tracing down what seemed to be inconsistent behavior. My use case has Python using NamedTemporaryFile(delete=True) in a CM to produce content fed into a subprocess. The code had been reliably working and then it just didn't. The only thing changing was the content being written, a rendered Jinja2 template. I believe the fate is determined by the content length. In debugging another problem, I'd been trivializing the template and once it got down to about 3k (rendered) the subprocess began seeing a file whose length was 0 bytes. Make the template bigger and all works again. Calling close() resolves the issue, but of course requires delete=False, which removes much of the value of this object. Preliminary testing looks like flush() may also resolve the issue. Have I just been naive and getting lucky all along because this is expected or is there something else fishy here worth investigation? |
|||
msg288504 | Author: Nick Coghlan (ncoghlan) | Date: 2017-02-24 07:04 |
John, your problem sounds different - if you're opening the files in binary mode, then you'll be getting a default buffer that's probably 4k or 8k in size, so if you're writing less content than that, the subprocess won't see anything until you explicitly flush() the buffer to disk (and even if you're writing more than that, the subprocess may see a truncated version without an explicit flush()).

By contrast, the issue here relates to the fact that on Windows it's currently necessary to do the following in order to get multiple handles to a NamedTemporaryFile:

1. Open the file in the current process
2. Write content to the file
3*. Close the file in the current process
4. Open the file by name in another process (or the current process)
5. Read content from the file
6. Close the second file handle
7*. Delete the file

*On POSIX systems, you can just skip step 3 and leave closing the original file handle until step 7, and that's the design that the current NamedTemporaryFile is built around. Most of the discussion above is about figuring out how to make that same approach "just work" on Windows (perhaps with some additional flags used in step 4), rather than providing a third option beyond the current delete-on-close and delete-manually.
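The buffering effect described at the start of this message can be observed directly. The sketch below is only an illustration: it assumes the default buffered binary mode of `NamedTemporaryFile` and a write much smaller than the buffer, and uses `os.path.getsize()` as a stand-in for what a second reader of the file would see.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * 100)                    # far smaller than the default buffer
    size_before = os.path.getsize(f.name)  # size as seen through the file name
    f.flush()                              # hand the buffered bytes to the OS
    size_after = os.path.getsize(f.name)

print(size_before, size_after)
```

Before the `flush()` the data is still sitting in the Python-level buffer, so a subprocess opening the file by name at that point would read nothing.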
|||
msg288520 | Author: John Florian (John Florian) | Date: 2017-02-24 12:21 |
Okay Nick. Thanks for the detailed info. I suspected buffering was a factor, but wasn't certain. Would it be worthwhile pursuing a note in the docs or would that constitute clutter over what should be a standard assumption? I was thrown off course for all the prior uses without issues, but in hindsight I don't know offhand how many involved a subprocess. |
|||
msg288521 | Author: Nick Coghlan (ncoghlan) | Date: 2017-02-24 12:28 |
John: I don't think it would be clutter to have an explicit reminder about that point in the NamedTemporaryFile documentation, so feel free to file a separate enhancement issue for it. |
|||
msg288538 | Author: Eryk Sun (eryksun) | Date: 2017-02-25 04:49 |
Nick wrote:
> 1. Open the file in the current process
> 2. Write content to the file
> 3*. Close the file in the current process

In step 1, do you mean calling NamedTemporaryFile with delete=False? In that case there's no immediate problem with opening the file again in Windows. If you mean calling NamedTemporaryFile with delete=True, then step 3 deletes the file.

Adding support for Windows share modes would be useful in general and would help within the current process. However, users may also need to open the temporary file in another process. Most programs don't open their files with shared delete access. There's a workaround to allow the file to be opened normally, but it involves setting the delete disposition and then clearing it in a pointless dance.

It would be better to implement an option such as delete=AFTER_CM_EXIT, to try to remove the file without relying on O_TEMPORARY. The downside is that the file won't be deleted if the interpreter crashes, gets terminated or if another process has the file open without delete sharing. |
|||
msg288540 | Author: Eryk Sun (eryksun) | Date: 2017-02-25 05:18 |
Richard wrote:
> while a handle is open with share mode X, you can only reopen
> the file if you also use share mode X

To clarify, the share mode is not a property of a handle. It's a property of a File object. A handle is a generic reference to any kind of kernel object, not just a File.

The following is a brief discussion about access sharing and the way file deletion works in Windows. Both of these tend to frustrate Unix programmers who end up supporting Windows. In this discussion, a "File", with an uppercase 'F', is the Windows kernel object type that references an open device or file-system directory, file, or stream. A "file", with a lowercase 'f', is a data file.

Shared access is implemented for File objects [1] and tracked in a SHARE_ACCESS record. When opening a file or directory, its shared access state is updated by the kernel function IoCheckShareAccess [2]. Discretionary shared access is primarily a concern for file systems, but volume devices and disk devices (e.g. \\.\C: and \\.\PhysicalDrive0) also use it. Devices that are flagged for mandatory exclusive access [3] (e.g. \\.\COM1) generally ignore the share mode. Some non-exclusive devices also ignore the share mode (e.g. \\.\NUL and \\.\CON).

If it's not ignored, the share mode affects delete, write, read, and execute access. The following File access rights are affected: DELETE, FILE_WRITE_DATA, FILE_APPEND_DATA, FILE_READ_DATA, and FILE_EXECUTE. The share mode thus affects any combination of generic access -- GENERIC_ALL, GENERIC_WRITE, GENERIC_READ, GENERIC_EXECUTE.

A File object's requested sharing is stored in its SharedDelete, SharedWrite, and SharedRead members. The granted access that's relevant to the share mode is stored in the DeleteAccess, WriteAccess (write/append), and ReadAccess (read/execute) members. Given these values, checking for a sharing violation and updating the shared access counts uses the following logic:

    RequireSharedDelete = DeleteAccessCount > 0;
    RequireSharedWrite = WriteAccessCount > 0;
    RequireSharedRead = ReadAccessCount > 0;

    DenyDeleteAccess = SharedDeleteCount < OpenCount;
    DenyWriteAccess = SharedWriteCount < OpenCount;
    DenyReadAccess = SharedReadCount < OpenCount;

    if (RequireSharedDelete && !SharedDelete ||
        RequireSharedWrite && !SharedWrite ||
        RequireSharedRead && !SharedRead ||
        DenyDeleteAccess && DeleteAccess ||
        DenyWriteAccess && WriteAccess ||
        DenyReadAccess && ReadAccess)
    {
        return STATUS_SHARING_VIOLATION;
    }

    OpenCount++;
    DeleteAccessCount += DeleteAccess;
    WriteAccessCount += WriteAccess;
    ReadAccessCount += ReadAccess;
    SharedDeleteCount += SharedDelete;
    SharedWriteCount += SharedWrite;
    SharedReadCount += SharedRead;

For example, to be granted delete access, all existing File object references must share delete access. However, if a file is opened with delete sharing but delete access hasn't been granted, then it can be opened again without delete sharing.

The SHARE_ACCESS structure is usually stored in a file (or stream) control block (FCB/SCB), which is a structure that coordinates access to a file or directory across multiple File objects. The FsContext member of a File object points at the FCB. A file system stores its private state for a File in a context control block (CCB), to which the File's FsContext2 member points. The CCB is where a file system tracks, for example, whether the file should be deleted when the object is closed.

Deleting a file sets a delete disposition in the FCB/SCB (or LCB if hard links are supported). The file can't be unlinked until all referencing File objects have been closed and the underlying FCB/SCB/LCB is closing. Until then a 'deleted' file is still linked in the parent directory and prevents the directory from being deleted.

A deleted but still referenced file is in a semi-zombie state. Windows file systems don't allow opening such a file for any access, but it can still be accessed via existing objects. If one of these File objects has delete access, it can be used to unset the delete disposition (e.g. via SetFileInformationByHandle) to make the file accessible again.

[1]: https://msdn.microsoft.com/en-us/library/ff545834
[2]: https://msdn.microsoft.com/en-us/library/ff548341
[3]: https://msdn.microsoft.com/en-us/library/ff563827
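The sharing check in this message can be modelled in a few lines of Python to make the two examples concrete. This is a toy model, not Windows code: the class `ShareAccess`, the method `try_open`, and the use of string sets for access and share flags are all inventions for illustration. It also assumes, in line with Richard's earlier description of share mode X, that an O_TEMPORARY open both requests delete access and shares it.

```python
from dataclasses import dataclass


@dataclass
class ShareAccess:
    """Toy model of the per-file counters described in the message."""
    open_count: int = 0
    delete_access: int = 0
    write_access: int = 0
    read_access: int = 0
    shared_delete: int = 0
    shared_write: int = 0
    shared_read: int = 0

    def try_open(self, access: set, share: set) -> bool:
        """Update the counters and return True, or return False on a sharing violation."""
        checks = [
            ("delete", self.delete_access, self.shared_delete),
            ("write", self.write_access, self.shared_write),
            ("read", self.read_access, self.shared_read),
        ]
        for kind, access_count, shared_count in checks:
            # An existing open holds this access, so the new open must share it.
            if access_count > 0 and kind not in share:
                return False
            # Some existing open does not share this access, so it cannot be granted.
            if shared_count < self.open_count and kind in access:
                return False
        self.open_count += 1
        self.delete_access += "delete" in access
        self.write_access += "write" in access
        self.read_access += "read" in access
        self.shared_delete += "delete" in share
        self.shared_write += "write" in share
        self.shared_read += "read" in share
        return True


ALL = {"read", "write", "delete"}

temp = ShareAccess()
first = temp.try_open(ALL, ALL)                       # O_TEMPORARY-style open (assumed)
plain = temp.try_open({"read"}, {"read", "write"})    # ordinary reopen, no delete sharing
shared = temp.try_open({"read"}, ALL)                 # reopen that shares delete
```

In this model the ordinary reopen is refused because the first open holds delete access, while the reopen that shares delete is accepted, which matches the behaviour reported throughout the thread.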
|||
msg376593 | Author: Chary Chary (chary314) | Date: 2020-09-08 20:33 |
Dear all, are there any plans to move this quite old issue forward? I stumbled across this issue because I found that at the moment there is no out-of-the-box solution for using tempfile.NamedTemporaryFile on Windows in the following scenario (which is common in unit testing):

* In the test module:
  1) create and open a temporary file
  2) write data to it
  3) pass the name of the temporary file to the operational code
* In the operational code being tested:
  1) open the file, using the name of the temporary file
  2) read data from this temporary file
|||
msg376645 | Author: Steve Dower (steve.dower) | Date: 2020-09-09 15:57 |
In general, if a bug here appears to be inactive, it's probably waiting on someone to volunteer to move it forward. Often merely posting to a thread is enough. For this case, I think the best thing we can probably do is change the default share mode for _all_ opens to include FILE_SHARE_DELETE. This would also help a number of other situations, as well as bringing the default Windows behaviour slightly more in line with how POSIX likes to do things. As far as I'm aware this would only be harmful in cases where people are trying to implicitly lock files on Windows by keeping an open handle, and are using a different code path on other platforms where that won't work. |
|||
msg376649 - (view) | Author: Chary Chary (chary314) | Date: 2020-09-09 17:04 | |
Steve Dower, thanks for looking at this. After reading the thread, from my amateur point of view I kind of liked the suggestion of Daniel Lenski to replace the binary delete argument of the current NamedTemporaryFile implementation with finer-grained options https://bugs.python.org/issue14243#msg164369 This would also take care of the comment from Dave Abrahams, that <<Even if Windows allows a file to be opened for reading (in some circumstances) when it is already open for writing, it isn't hard to imagine that Python might someday have to support an OS that didn't allow it under any circumstances. It is also a bit perverse to have to keep the file open for writing after you're definitively done writing it, just to prevent it from being deleted prematurely.>> https://bugs.python.org/issue14243#msg155457 As for your comment about including FILE_SHARE_DELETE: if the decision is taken to go down this path, shouldn't we also include FILE_SHARE_READ and FILE_SHARE_WRITE? https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea |
|||
msg376656 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-09-09 18:36 | |
> For this case, I think the best thing we can probably do is change the
> default share mode for _all_ opens to include FILE_SHARE_DELETE.

The C runtime doesn't provide a way to share delete access, except for the O_TEMPORARY flag, so Python would have to re-implement open(). Also, this doesn't help with re-opening a temporary file in another process since most programs do not share file delete access.

There could be an option to enable a context-manager delete that's independent of closing the file. For example, if delete=True and delete_on_close=False, then _TemporaryFileCloser.close doesn't delete the file. Instead the file would be deleted via os.unlink in _TemporaryFileWrapper.__exit__. The default would be delete=True and delete_on_close=True, which would use the O_TEMPORARY flag in Windows. Combining delete=False with delete_on_close=True would raise a ValueError.

> bringing the default Windows behaviour slightly more in line with
> how POSIX likes to do things.

In Windows 10, using FILE_SHARE_DELETE is even closer to POSIX behavior when the filesystem is NTFS, which supports POSIX delete semantics by renaming the file to a hidden system directory ("\$Extend\$Deleted") and setting its delete disposition. WinAPI DeleteFileW has been updated to use POSIX semantics if the filesystem supports it:

    >>> f = tempfile.NamedTemporaryFile()
    >>> h = msvcrt.get_osfhandle(f.fileno())
    >>> os.unlink(f.name)
    >>> info = GetFileInformationByHandleEx(h, FileStandardInfo)
    >>> info['DeletePending']
    True
    >>> GetFinalPathNameByHandle(h, 0)
    '\\\\?\\C:\\$Extend\\$Deleted\\001800000002C4F4301F419F' |
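The delete / delete_on_close combinations proposed in this message can be approximated in pure Python. The sketch below is an assumption-laden illustration, not the tempfile implementation: `named_temp` is a hypothetical helper, and it follows this message's proposal in rejecting delete=False together with delete_on_close=True.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def named_temp(mode="w+b", *, delete=True, delete_on_close=True):
    """Hypothetical wrapper modelling the proposed option matrix."""
    if not delete and delete_on_close:
        raise ValueError("delete_on_close=True requires delete=True")

    # Let NamedTemporaryFile delete on close() only when both flags are set;
    # otherwise this helper decides what happens at the end of the block.
    f = tempfile.NamedTemporaryFile(mode, delete=delete and delete_on_close)
    try:
        yield f
    finally:
        f.close()
        if delete and not delete_on_close:
            # Context-manager delete, independent of close().
            with contextlib.suppress(FileNotFoundError):
                os.unlink(f.name)
```

With `delete_on_close=False` the file survives an explicit `f.close()` inside the block and is removed when the block exits. As the thread notes, this is less robust than O_TEMPORARY, because nothing removes the file if the process is terminated abruptly.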
|||
msg376659 - (view) | Author: Chary Chary (chary314) | Date: 2020-09-09 20:02 | |
Why do we need to use this O_TEMPORARY flag at all? I understand that we are using OS functionality available on Windows rather than implementing it in Python. But why do this, if we already do it in Python for non-nt systems anyway? Doesn't it just complicate the code? https://github.com/python/cpython/blob/fa8c9e70104b0aef966a518eb3a80a4881906ae0/Lib/tempfile.py#L423 |
|||
msg376662 - (view) | Author: Chary Chary (chary314) | Date: 2020-09-09 20:24 | |
I am not sure this is the correct place to ask this "educational" question, but I will ask anyway: where is the O_TEMPORARY flag defined? If I look at [tempfile.py](https://github.com/python/cpython/blob/3ff6975e2c0af0399467f234b2e307cc76efcfa9/Lib/tempfile.py#L539), it appears to be defined in the os.py module. But I can't find it in [os.py](https://github.com/python/cpython/blob/3ff6975e2c0af0399467f234b2e307cc76efcfa9/Lib/os.py). |
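As a side note on the question above: os.py does not spell out constants such as O_TEMPORARY itself. It re-exports the names of the platform's built-in module (nt on Windows), so the attribute only exists on Windows. The sketch below shows one way to check for it; `temp_open_flags` is a hypothetical helper that loosely mirrors how such a flag might be combined with the usual creation flags, not a copy of tempfile's code.

```python
import os

# O_TEMPORARY is only present on Windows; fall back to 0 elsewhere so the
# flag can be OR-ed in unconditionally.
O_TEMPORARY = getattr(os, "O_TEMPORARY", 0)


def temp_open_flags(delete=True):
    """Hypothetical helper: flags for creating a new temporary file."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    if os.name == "nt" and delete:
        flags |= O_TEMPORARY
    return flags


print("os.O_TEMPORARY available:", hasattr(os, "O_TEMPORARY"))
```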
|||
msg376663 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-09-09 20:47 | |
> Why do we need to use this O_TEMPORARY flag at all?

Using the O_TEMPORARY flag isn't necessary, but it's usually preferred because it ensures the file gets deleted even if the process is terminated abruptly. However, it opens the file with delete access, and most Windows programs don't share delete access for normal file opens. This can be worked around in Python code by using an opener that calls CreateFileW with delete-access sharing. But it can't be worked around in general.

I prefer to provide a way to omit O_TEMPORARY, but still use it by default. When it's omitted, I'd also like to be able to close the file within the context block without deleting it, for which one use case is to reopen the file in a program that doesn't share read or write access. A new delete_on_close option would support this case, in addition to providing a way to omit the O_TEMPORARY flag. For example:

    with tempfile.NamedTemporaryFile(delete_on_close=False) as f:
        f.close()
        subprocess.run([cmd, f.name])

The file will still be deleted by the context manager, but the tradeoff is that it's not as reliable as the default delete-on-close behavior that uses the O_TEMPORARY flag. |
|||
msg376664 - (view) | Author: Steve Dower (steve.dower) * | Date: 2020-09-09 20:58 | |
Issue41490 can also be fixed by using FILE_SHARE_DELETE on all opened files (and that's a release blocker, so we need to fix it somehow), and if DeleteFile has been updated as you suggest then it might even help with the "pip replacing its own script executable" issue. Nothing preventing someone from contributing the flag on open as well. There's definitely value there, but I think it's a workaround when we can make things Just Work more transparently. |
|||
msg376665 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-09-09 21:22 | |
> Nothing preventing someone from contributing the flag on open as well. To be clear, supporting delete-access sharing would require re-implementing C _wopen in terms of CreateFileW, _open_osfhandle, etc. It could be implemented as _Py_wopen in Python/fileutils.c. |
|||
msg376666 - (view) | Author: Steve Dower (steve.dower) * | Date: 2020-09-09 21:27 | |
The comment you quoted was referring to the NamedTemporaryFile(do_not_delete) flag. Yes, we'd have to reimplement the UCRT function using the system API. Ultimately, it's not a great compatibility layer if you want to match POSIX semantics and not just the C specification, which is why we do it so often :) |
|||
msg376684 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-09-10 13:22 | |
> we'd have to reimplement the UCRT function using the system API. Could the implementation drop support for os.O_TEXT? I think Python 3 should have removed both os.O_TEXT and os.O_BINARY. 3.x has no need for the C runtime's ANSI text mode, with its Ctrl+Z behavior inherited from MS-DOS. I'd prefer that os.open always used _O_BINARY and raised a ValueError if passed any of the C runtime's text modes, including _O_TEXT, _O_WTEXT, _O_U16TEXT, and _O_U8TEXT. If _O_TEXT is supported, then we have to copy the C runtime's behavior, which truncates a Ctrl+Z from the end of the file if it's opened with read-write access. If Unicode text modes are supported, then we have to read the BOM, which can involve opening the file twice if the caller doesn't request read access. |
|||
msg376697 - (view) | Author: Steve Dower (steve.dower) * | Date: 2020-09-10 22:58 | |
We'd CreateFile the file and then immediately pass it to _open_osfhandle, which would keep the semantics the same apart from the share flags. I'm not entirely against getting rid of O_TEXT support, but haven't taken the time to think through the implications. |
|||
msg376702 - (view) | Author: Eryk Sun (eryksun) * | Date: 2020-09-11 07:57 | |
> We'd CreateFile the file and then immediately pass it to
> _open_osfhandle

Unlike _wopen, _open_osfhandle doesn't truncate Ctrl+Z (0x1A) from the last byte when the flags value contains _O_TEXT | _O_RDWR. _wopen implements this to allow appending data in text mode. The implementation is based on GetFileType (skip pipes and character devices), _lseeki64, _read, and _chsize[_s].

O_TEXT (ANSI text mode) has to be supported for now, but it doesn't properly fit in Python 3. The io module opens files using the CRT's binary mode. It doesn't implement newline translation for bytes I/O. And io.TextIOWrapper doesn't support Ctrl+Z as a logical EOF marker.

As long as it's supported, O_TEXT should be made the default in os.open (but not in msvcrt.open_osfhandle), independent of the CRT default fmode (i.e. _get_fmode and _set_fmode). Many callers already assume that's the case. For example, tempfile.mkstemp with text=True uses tempfile._text_openflags, which doesn't include os.O_TEXT. That assumption is currently wrong if _set_fmode(_O_BINARY) is called.

Thankfully, Python has never documented support for the _O_WTEXT, _O_U16TEXT, and _O_U8TEXT Unicode text modes in os.open. To my knowledge, there is no reasonable way to reimplement these modes. The C runtime doesn't expose a public interface to modify a file's internal text and Unicode modes, and _open_osfhandle only supports ANSI text mode. If _Py_wopen is implemented, it will have to fail the Unicode (UTF-16 or UTF-8) modes with EINVAL.

Even without _Py_wopen, I'd prefer to modify os.open to fail them because wrapping a Unicode-mode fd with io.FileIO doesn't function reliably. FileIO doesn't guarantee wchar_t aligned reads and writes, which the CRT requires in Unicode mode. |
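To make the Ctrl+Z step concrete, here is a rough Python analogue of the seek / read / truncate sequence described in this message. It is a sketch of the idea only, written against ordinary Python file objects rather than the CRT functions named above, and `strip_trailing_ctrl_z` is a hypothetical helper name.

```python
import os

CTRL_Z = b"\x1a"


def strip_trailing_ctrl_z(path):
    """Remove one trailing Ctrl+Z byte from a regular file.

    Returns True if the file was truncated, False otherwise.
    """
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)  # analogous to seeking to the end
        if size == 0:
            return False
        f.seek(size - 1)
        if f.read(1) != CTRL_Z:  # inspect the last byte
            return False
        f.truncate(size - 1)  # drop the marker so appends start cleanly
        return True
```

The point of trimming the marker is the one given in the message: data appended in text mode would otherwise land after a byte that text-mode readers treat as end of file.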
|||
msg377580 - (view) | Author: Evgeny (ev2geny) * | Date: 2020-09-27 21:23 | |
Hello, this is to let you know, that I have created a pull request for this issue https://github.com/python/cpython/pull/22431 I am not really an experienced programmer, but I will give it a try |
History | |||
---|---|---|---|
Date | User | Action | Args |
2020-12-14 09:43:08 | ev2geny | set | keywords:
+ patch stage: needs patch -> patch review pull_requests: + pull_request22618 |
2020-09-27 21:23:16 | ev2geny | set | nosy:
+ ev2geny messages: + msg377580 |
2020-09-11 07:57:19 | eryksun | set | messages: + msg376702 |
2020-09-10 22:58:48 | steve.dower | set | messages: + msg376697 |
2020-09-10 13:22:00 | eryksun | set | messages: + msg376684 |
2020-09-09 21:27:13 | steve.dower | set | messages: + msg376666 |
2020-09-09 21:22:03 | eryksun | set | messages: + msg376665 |
2020-09-09 20:58:15 | steve.dower | set | messages: + msg376664 |
2020-09-09 20:47:11 | eryksun | set | messages: + msg376663 |
2020-09-09 20:24:42 | chary314 | set | messages: + msg376662 |
2020-09-09 20:02:44 | chary314 | set | messages: + msg376659 |
2020-09-09 18:36:45 | eryksun | set | messages: + msg376656 |
2020-09-09 17:04:01 | chary314 | set | messages: + msg376649 |
2020-09-09 15:57:24 | steve.dower | set | messages:
+ msg376645 versions: + Python 3.10, - Python 3.7 |
2020-09-08 20:33:31 | chary314 | set | nosy:
+ chary314 messages: + msg376593 |
2018-05-16 11:35:29 | ethan smith | set | nosy:
+ ethan smith |
2017-08-01 08:21:18 | njs | set | nosy:
+ njs |
2017-02-25 05:18:30 | eryksun | set | messages: + msg288540 |
2017-02-25 04:49:40 | eryksun | set | nosy:
+ eryksun messages: + msg288538 versions: + Python 3.7, - Python 3.5 |
2017-02-24 12:28:05 | ncoghlan | set | messages: + msg288521 |
2017-02-24 12:21:20 | John Florian | set | messages: + msg288520 |
2017-02-24 07:04:09 | ncoghlan | set | messages: + msg288504 |
2017-02-24 03:07:26 | BreamoreBoy | set | nosy:
- BreamoreBoy |
2017-02-23 22:12:10 | jwilk | set | nosy:
+ jwilk |
2017-02-23 19:44:46 | John Florian | set | nosy:
+ John Florian messages: + msg288473 |
2015-06-03 15:29:03 | Carl Osterwisch | set | nosy:
+ Carl Osterwisch messages: + msg244762 |
2015-03-30 13:54:54 | paul.moore | set | nosy:
+ paul.moore |
2014-10-04 23:58:25 | martin.panter | set | nosy:
+ martin.panter |
2014-10-03 20:36:43 | BreamoreBoy | set | nosy:
+ BreamoreBoy, zach.ware, steve.dower, - brian.curtin messages: + msg228371 versions: + Python 3.5, - Python 3.4 |
2013-08-13 14:39:33 | Gabi.Davar | set | nosy:
+ Gabi.Davar |
2013-03-13 15:40:26 | piotr.dobrogost | set | messages: + msg184088 |
2013-03-12 22:49:11 | sbt | set | messages: + msg184056 |
2013-03-12 22:02:32 | piotr.dobrogost | set | messages: + msg184053 |
2013-03-05 22:07:52 | piotr.dobrogost | set | nosy:
+ piotr.dobrogost |
2012-07-03 19:22:32 | sbt | set | messages: + msg164617 |
2012-07-02 13:34:36 | sbt | set | files:
+ share.py messages: + msg164514 |
2012-07-02 08:53:38 | tim.golden | set | messages:
+ msg164509 title: tempfile.NamedTemporaryFile not particularly useful on Windows -> tempfile.NamedTemporaryFile not particularly useful on Windows |
2012-07-02 00:45:18 | ncoghlan | set | messages: + msg164505 |
2012-07-01 23:45:48 | sbt | set | messages: + msg164504 |
2012-07-01 22:46:14 | sbt | set | messages: + msg164503 |
2012-07-01 20:38:59 | tim.golden | set | messages: + msg164497 |
2012-07-01 20:37:10 | pitrou | set | stage: needs patch messages: + msg164496 components: + Windows versions: + Python 3.4, - Python 3.3 |
2012-07-01 20:30:37 | sbt | set | messages: + msg164495 |
2012-07-01 17:19:35 | dlenski | set | messages: + msg164487 |
2012-06-30 23:01:52 | dlenski | set | messages:
+ msg164433 title: tempfile.NamedTemporaryFile not particularly useful on Windows -> tempfile.NamedTemporaryFile not particularly useful on Windows |
2012-06-30 16:07:30 | sbt | set | nosy:
+ sbt messages: + msg164392 |
2012-06-30 08:46:28 | davide.rizzo | set | nosy:
+ davide.rizzo messages: + msg164375 |
2012-06-30 05:45:37 | dlenski | set | files:
+ ntempfile.py messages: + msg164369 |
2012-06-29 22:01:11 | tim.golden | set | messages: + msg164358 |
2012-06-29 21:28:52 | dlenski | set | nosy:
+ dlenski |
2012-04-10 13:42:46 | r.david.murray | set | messages: + msg157952 |
2012-04-10 13:08:58 | jaraco | set | messages:
+ msg157949 title: tempfile.NamedTemporaryFile not particularly useful on Windows -> tempfile.NamedTemporaryFile not particularly useful on Windows |
2012-04-10 12:31:30 | pitrou | set | messages: + msg157948 |
2012-04-10 12:29:46 | r.david.murray | set | messages: + msg157947 |
2012-04-10 12:18:29 | r.david.murray | set | messages: + msg157946 |
2012-04-10 01:33:23 | ncoghlan | set | messages: + msg157927 |
2012-04-10 01:31:35 | ncoghlan | link | issue14514 superseder |
2012-04-10 01:29:28 | ncoghlan | set | type: behavior -> enhancement title: NamedTemporaryFile unusable under Windows -> tempfile.NamedTemporaryFile not particularly useful on Windows messages: + msg157925 versions: - Python 2.7, Python 3.2 |
2012-04-06 02:55:39 | r.david.murray | set | nosy:
+ r.david.murray messages: + msg157639 |
2012-03-23 15:51:18 | jaraco | set | nosy:
+ jaraco |
2012-03-12 17:59:58 | eric.araujo | set | nosy:
+ eric.araujo |
2012-03-12 17:59:52 | dabrahams | set | messages: + msg155457 |
2012-03-11 03:45:33 | ncoghlan | set | messages: + msg155375 |
2012-03-11 03:30:15 | dabrahams | set | messages: + msg155374 |
2012-03-11 02:21:04 | ncoghlan | set | messages: + msg155365 |
2012-03-10 18:20:17 | dabrahams | set | messages: + msg155333 |
2012-03-10 15:20:07 | pitrou | set | nosy:
+ tim.golden, brian.curtin |
2012-03-10 15:17:31 | pitrou | set | messages: + msg155317 |
2012-03-10 15:03:00 | ncoghlan | set | messages: + msg155316 |
2012-03-10 13:44:25 | pitrou | set | nosy:
+ ncoghlan, pitrou title: NamedTemporaryFile usability request -> NamedTemporaryFile unusable under Windows messages: + msg155309 versions: + Python 3.2, Python 3.3 type: behavior |
2012-03-10 04:53:46 | eric.smith | set | nosy:
+ eric.smith |
2012-03-10 02:14:08 | dabrahams | create |