msg359888 - (view) |
Author: Robert Xiao (nneonneo) * |
Date: 2020-01-13 04:56 |
tempfile.NamedTemporaryFile creates its wrapper like so:
try:
file = _io.open(fd, mode, buffering=buffering,
newline=newline, encoding=encoding, errors=errors)
return _TemporaryFileWrapper(file, name, delete)
except BaseException:
_os.unlink(name)
_os.close(fd)
raise
If _TemporaryFileWrapper throws any kind of exception (even KeyboardInterrupt), this closes `fd` but leaks a valid `file` pointing to that fd. The `file` will later attempt to close the `fd` when it is collected, which can lead to subtle bugs. (This particular issue contributed to this bug: https://nedbatchelder.com/blog/202001/bug_915_please_help.html)
This should probably be rewritten as:
try:
file = _io.open(fd, mode, buffering=buffering,
newline=newline, encoding=encoding, errors=errors)
except:
_os.unlink(name)
_os.close(fd)
raise
try:
return _TemporaryFileWrapper(file, name, delete)
except BaseException:
_os.unlink(name)
file.close()
raise
or perhaps use nested try blocks to avoid the _os.unlink duplication.
|
msg359892 - (view) |
Author: Albert Zeyer (Albert.Zeyer) * |
Date: 2020-01-13 09:19 |
Instead of `except:` and `except BaseException:`, I think better use `except Exception:`.
For further discussion and reference, also see the discussion here: https://news.ycombinator.com/item?id=22028581
|
msg359935 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2020-01-13 21:14 |
This issue is more complex. I am working on it.
|
msg359957 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2020-01-14 09:07 |
Problems:
1. In general, it is hard to avoid leaks because an exception like KeyboardInterrupt or MemoryError can be raised virtually anywhere, even before we save a file descriptor. We may rewrite the code so that it will use few simple bytecode instructions and disable interruption before these instructions. This may solve the problem in CPython, but alternate implementation will need to handle this theirself.
2. It is not safe to close a file descriptor if io.open() is failed, because it can already be closed in io.open(), and we do not know where it was closed or no. It can be solved by using the opener argument, but we will need to patch the name attribute of the file returned by io.open(), and it is not easy, because the type of the result of io.open() depends on the mode and buffering arguments, and only FileIO has writable name attribute. This will need additional complex code. We can also change the behavior of io.open(), make it either always closing fd or never closing it in case of error. But this will break working code, or introduce leaks in working code, so such change cannot be backported. In any case we should revisit all other cases of using io.open() with an open file descriptor in the stdlib (including TemporaryFile).
3. I afraid that removing a file while the file descriptor is open may not work on Windows. Seems this case is not well tested.
As for using `except Exception` instead of `except BaseException`, it is better to use the later, or even the bare `except`.
I worked on this issue yesterday, but then found new problems, it will take several days or weeks to solve them. As a workaround I suggest to skip temporary the test which monkeypatches _TemporaryFileWrapper.
|
msg359959 - (view) |
Author: Albert Zeyer (Albert.Zeyer) * |
Date: 2020-01-14 09:13 |
Why is `except BaseException` better than `except Exception` here? With `except Exception`, you will never run into the problem of possibly closing the fd twice. This is the main important thing which we want to fix here. This is more important than missing maybe to close it at all, or unlink it.
|
msg360040 - (view) |
Author: Robert Xiao (nneonneo) * |
Date: 2020-01-15 10:30 |
Could this be solvable if `closefd` were a writable attribute? Then we could do
file = None
try:
file = io.open(fd, ..., closefd=False)
file.closefd = True
except:
if file and not file.closefd:
os.close(fd)
raise
I believe this should be bulletproof - a KeyboardInterrupt can happen anywhere in the `try` and we will not leak or double-close. Either file.closefd is set, in which case `file` owns the fd and will close it eventually, or file.closefd is not set in which case the fd needs to be manually closed, or file doesn't exist (exception thrown in io.open or while assigning file), in which case the fd still needs to be manually closed.
Of course, this can leak if a KeyboardInterrupt lands in the `except` - but that's not something we can meaningfully deal with. The important thing is that this shouldn't double-close if I analyzed it correctly.
This is a somewhat annoying pattern, though. I wonder if there's a nicer way to implement it so we don't end up with this kind of boilerplate everywhere.
|
msg360041 - (view) |
Author: Albert Zeyer (Albert.Zeyer) * |
Date: 2020-01-15 10:37 |
If you anyway accept that KeyboardInterrupt can potentially leak, by just using `except Exception`, it would also be solved here.
|
msg360071 - (view) |
Author: Paul Ollis (paul_ollis) * |
Date: 2020-01-15 19:58 |
I think it is worth pointing out that the semantics of
f = ``open(fd, closefd=True)``
are broken (IMHO) because an exception can result in an unreferenced file
object that has taken over reponsibility for closing the fd, but it can
also fail without creating the file object.
Therefore it should be considered a bad idea to use open in this way. So
NamedTemporaryFile should take responsibility for closing the fd; i.e. it
should used closefd=False.
I would suggest that NamedTemporaryFile's code should be:
try:
file = _io.open(fd, mode, buffering=buffering, closefd=False,
newline=newline, encoding=encoding, errors=errors)
return _TemporaryFileWrapper(file, name, delete)
except BaseException:
_os.close(fd)
try:
_os.unlink(name)
except OSError:
pass # On windows the file will already be deleted.
raise
And '_os.close(self.file.fileno())' should be added after each of the two calls
to 'self.file.close()' in _TemporaryFileCloser.
Some notes on the design choices here.
1. The exception handling performs the close *before* the unlink because:
- That is the normal order that people expect.
- It is most important that the file is closed. We cannot rule out the
possibility of the unlink raising an exception, which could leak the fd.
2. BaseException is caught because we should not leak a file descriptor for
KeyboardInterrupt or any other exception.
3. It is generally good practice to protect os.unlink with a try ... except
OSError. So I think this is an appropriate way to prevent the Windows
specific error.
It will also suppress some other, rare but possible, errors. I think that
this is legitimate behaviour, but it should be documented.
|
msg360084 - (view) |
Author: Albert Zeyer (Albert.Zeyer) * |
Date: 2020-01-15 22:54 |
> I think it is worth pointing out that the semantics of
>
> f = ``open(fd, closefd=True)``
>
> are broken (IMHO) because an exception can result in an unreferenced file
> object that has taken over reponsibility for closing the fd, but it can
> also fail without creating the file object.
I thought that if this raises a (normal) exception, it always means that it did not have overtaken the `fd`, i.e. never results in an unreferenced file object which has taken ownership of `fd`.
It this is not true right now, you are right that this is problematic. But this should be simple to fix on the CPython side, right? I.e. to make sure whenever some exception is raised here, even if some temporary file object already was constructed, it will not close `fd`. It should be consistent in this behavior, otherwise indeed, the semantics are broken.
|
msg360104 - (view) |
Author: Paul Ollis (paul_ollis) * |
Date: 2020-01-16 09:27 |
> I thought that if this raises a (normal) exception, it always means that it
> did not have overtaken the `fd`, i.e. never results in an unreferenced file
> object which has taken ownership of `fd`.
The current CPython implementation does not guard against this happening. Some
incorrect combinations of arguments will definitely cause the problem. It is
also possible that it could occur under other circumstances.
> It this is not true right now, you are right that this is problematic. But
> this should be simple to fix on the CPython side, right? I.e. to make sure
> whenever some exception is raised here, even if some temporary file object
> already was constructed, it will not close `fd`. It should be consistent in
> this behavior, otherwise indeed, the semantics are broken.
I agree. I think it should be fairly simple to fix for CPython.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:25 | admin | set | github: 83499 |
2021-03-28 03:37:24 | eryksun | set | versions:
+ Python 3.10, - Python 3.7 |
2021-03-28 03:36:06 | eryksun | set | messages:
- msg359981 |
2020-02-07 02:35:54 | jstasiak | set | nosy:
+ jstasiak
|
2020-01-16 09:27:07 | paul_ollis | set | messages:
+ msg360104 |
2020-01-15 22:54:13 | Albert.Zeyer | set | messages:
+ msg360084 |
2020-01-15 22:40:00 | rekcartgubnohtyp | set | nosy:
+ rekcartgubnohtyp
|
2020-01-15 19:58:22 | paul_ollis | set | nosy:
+ paul_ollis messages:
+ msg360071
|
2020-01-15 10:37:44 | Albert.Zeyer | set | messages:
+ msg360041 |
2020-01-15 10:30:40 | nneonneo | set | messages:
+ msg360040 |
2020-01-14 16:58:51 | eryksun | set | nosy:
+ eryksun messages:
+ msg359981
|
2020-01-14 09:13:00 | Albert.Zeyer | set | messages:
+ msg359959 |
2020-01-14 09:07:54 | serhiy.storchaka | set | messages:
+ msg359957 |
2020-01-14 06:36:07 | hitbox | set | nosy:
+ hitbox
|
2020-01-13 22:08:47 | joernheissler | set | nosy:
+ joernheissler
|
2020-01-13 21:14:12 | serhiy.storchaka | set | messages:
+ msg359935 |
2020-01-13 13:34:20 | nedbat | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request17389 |
2020-01-13 11:40:11 | nedbat | set | nosy:
+ nedbat
|
2020-01-13 10:25:36 | serhiy.storchaka | set | assignee: serhiy.storchaka |
2020-01-13 10:15:38 | sanjioh | set | nosy:
+ sanjioh
|
2020-01-13 09:19:41 | Albert.Zeyer | set | nosy:
+ Albert.Zeyer messages:
+ msg359892
|
2020-01-13 08:31:10 | Seth.Troisi | set | nosy:
+ Seth.Troisi
|
2020-01-13 08:10:01 | xtreak | set | nosy:
+ serhiy.storchaka
versions:
- Python 3.5, Python 3.6 |
2020-01-13 04:56:47 | nneonneo | create | |