This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: tempfile.TemporaryFile: name of file descriptor cannot be reused in consecutive initialization
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.9, Python 3.8
process
Status: open Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: tjollans, zhongxiang117
Priority: normal Keywords:

Created on 2021-05-02 22:33 by zhongxiang117, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
xtempfile.py zhongxiang117, 2021-05-02 22:33 codes reproduce
new-xtempfile.py zhongxiang117, 2021-06-05 05:46 bug reproduce: new testing and explanations
Messages (5)
msg392746 - (view) Author: Xiang Zhong (zhongxiang117) * Date: 2021-05-02 22:33
The variable of instance cannot be reused in two consecutive codes chunk combinations. Please check the difference in attached file, xtempfile.py, function: test_3 & test_4.

However, surprisingly, the problem can be fixed in test_5 and test_6.

Which may be helpful for debug: why does the value of "id(fd.name)" keep constant all the time inside those combinations?
msg395089 - (view) Author: Thomas Jollans (tjollans) Date: 2021-06-04 15:16
I think I know what's going on here.

The way you're using tempfile.TemporaryFile() is, shall we say, unusual. TemporaryFile() already gives you an open file, there's really no reason why you'd call open() again. In practice you'd usually want to write something like

with tempfile.TemporaryFile() as f:
    # do stuff

Back to the issue. Your code boils down to:

fd = tempfile.TemporaryFile()
with open(fd.fileno()) as f:
    pass
fd = tempfile.TemporaryFile()
fd.seek(0)

You had fd.name rather than fd.fileno(). On *nix, these are the same for a TemporaryFile, which has no name. (On Windows, open(fd.name) fails)

What happens? open(fd.fileno()) creates a new file object for the same low-level file descriptor. At the end of the with block, this file is closed. This is fine, but the object returned by TemporaryFile doesn't know this happened.

You then call tempfile.TemporaryFile() again, which opens a new file. The OS uses the next available file descriptor, which happens to be the one you just closed. THEN, the old TemporaryFile object gets deleted. It doesn't know you already closed its file, so it calls close(). On the FD that has just been reused.

This has nothing to do with reusing the same name, it's just about what order things happen in. This achieves the same effect:

tmp1 = tempfile.TemporaryFile()
os.close(tmp1.fileno())
tmp2 = tempfile.TemporaryFile()
del tmp1
tmp2.seek(0)

Deleting the first file object before creating the second one (like in your test_5) solves this problem. I'm not sure why your test_6 works.

As for why id(fd.name) was the same for you? It's because fd.name is just fd.fileno() in this case, which is an integer.

TL;DR:
 - having two file objects for the same file descriptor is asking for trouble
 - I'd say "not a bug"
msg395154 - (view) Author: Xiang Zhong (zhongxiang117) * Date: 2021-06-05 05:46
Dear Mr. Jollans, thanks for your comments and information.

I know my usage of tempfile.TemporaryFile() is unusual, technically I “open”ed it twice. The reason I coded them in this way, as a simple illustration, is I want to test/debug some codes of my early work which directly takes “real-file-names” as the input.

Good to know “fd.fileno()” is more advance than “fd.name”, thank you for your information.

For your writing, if my understanding is correct, you mean that os.close() and tempfile.TemporaryFile() are using the same low-level file descriptor sequences, which means they only keep track of their own file descriptor, and assign new file descriptor to the “just” next one.

For example, assume file descriptor integer numbers: 1,2,3,4… as the low-level sequences.

Firstly, call tempfile.TemporaryFile() will return 1 as the file descriptor.

Secondly, execute “with open..” will use 2 as the new one and finally it will be closed.

Third, call tempfile.TemporaryFile() again, problem in here is, it will sequentially take 2 (rather than 3) as the new file descriptor, because it only keeps its own orders.

Since the file descriptor 2 is already used and closed inside second step in with operation, so the OSError will happen, causing the file descriptor 2 cannot be reused.

Is my understanding correct?

However, I do not agree with that. To boil down my example codes, 


fd = tempfile.TemporaryFile()
with open(fd.fileno()) as f:
    pass             # I know fd is closed after with is done

fd = tempfile.TemporaryFile() # since tempfile.TemporaryFile() is 
                              # called again, it should return a 
                              # valid file descriptor, and the above 
                              # operations in “with” chunk should
                              # not influence its result

fd.seek(0)   # this should work, however, in reality, it does not


Thanks for your example codes, but they are different with my examples. I even found a new problem with it, please check my additional testing file: “new-xtempfile.py” (which is attached).

I think those testings can be somehow explained your writing, however, it should be a bug.

Finally, I would like to make some revisions,

TL;DR:
 - having two file objects for the same file descriptor is asking for trouble
 - __del__ method should not close its “just” next file descriptor, if its current input is closed.
 - I'd say they are "bug"(s)
msg395156 - (view) Author: Thomas Jollans (tjollans) Date: 2021-06-05 07:06
Hello Xiang Zhong,

You're almost correct, but not quite.

File descriptors/file numbers are just integers assigned by the operating system, and Python has little to no control over them. The OS keeps a numbered list of open files for each process, and Python can make system calls like "read 10 bytes from file 5" or "write these 20 bytes to file 1".

Also good to know: 0 is stdin, 1 is stdout, 2 is stderr, 3+ are other files.

Now, what happens:

# Python asks the OS to open a new file. The OS puts the new file on
# the list and gives it the lowest number that is not in use: 3.
fd = tempfile.TemporaryFile()
# fd is an file object for file no 3
with open(fd.fileno()) as f:
    # f is another object for file no 3
    pass             # I know fd is closed after with is done
# the with statement closes file no 3
# fd does not know that file no 3 is closed. (and has no way of knowing!)

# Python asks the OS to open a new file. The OS puts the new file on
# the list and gives it the lowest number that is not in use: 3.
_tmp = tempfile.TemporaryFile()
# A new temporary file object is created for file no 3
fd = _tmp
# The old fd is finalized. It still thinks it has control of file
# no 3, so it closes that. The new temporary file object is given
# the name fd.


Aside: f.fileno() is not more "advanced" than f.name. It's just that, in the case of tempfile.TemporaryFile, the file is immediately deleted and ends up having no name, so f.name falls back on returning the file number. I just preferred it here to be explicit about what's going on. Normally f.name would be a string (and it is for a TemporaryFile on Windows!); here, it's an int.
msg395291 - (view) Author: Xiang Zhong (zhongxiang117) * Date: 2021-06-07 22:11
Dear Mr. Jollans,

Thanks for your additions, I think I know well about the SHELL programming language, now, every thing about the low-level file descriptors is linked together. I guess now I completely know your writing, thanks again for your detail explanations!

The potential “bug” of your consideration is the post operation caused by OS.close() in the delay/conflict of tempfile.Temporary(), as a result, fd does not know whether its linked file descriptor is valid or not due to “with” method.

However, if you have a look on “test_4” in attached “xtempfile.py”, you will see that in the first fd generated by tempfile.Temporary(), it is assumed to be handled by the first “with”, no more other operations are executed. Even after first “with” it becomes invalid, there are no further executions performed.

On the second fd returned by tempfile.TemporaryFile(), it should be a valid file descriptor no matter how the first one is dealt with. From the coding perspective, these two variables are just “happen to” be the same chars.

Besides, from my testing, changing any one of them to the different variable name, then the problem is gone. Furthermore, if you have a look on “test_3”, “test_5” and “test_6” (xtempfile.py), especially on “test_5”, they all prove that it should be a bug.

To go deeper, I attached another test file “new-xtempfile.py”, on “test_11”, you will see that it should have some problems on "__del__" method if its linking file descriptor is closed by os.close().

They two may or may not be the same problem.


Thanks again for your illustration of “fd.fileno()” and “fd.name”.
History
Date User Action Args
2022-04-11 14:59:45adminsetgithub: 88179
2021-06-07 22:11:27zhongxiang117setmessages: + msg395291
2021-06-05 07:06:53tjollanssetmessages: + msg395156
2021-06-05 05:46:14zhongxiang117setfiles: + new-xtempfile.py
resolution: wont fix
messages: + msg395154
2021-06-04 15:16:14tjollanssetnosy: + tjollans
messages: + msg395089
2021-05-02 22:33:35zhongxiang117create