Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_pyio.BufferedReader and _pyio.TextIOWrapper destructor don't emit ResourceWarning if the file is not closed #64028

Closed
vstinner opened this issue Nov 29, 2013 · 20 comments
Labels
performance Performance or resource usage topic-IO

Comments

@vstinner
Copy link
Member

BPO 19829
Nosy @pitrou, @vstinner, @benjaminp, @4kir4, @vadmium, @serhiy-storchaka
Files
  • del-detach.patch: Wrapper’s del() calls detach()
  • del-flush.patch: IOBase.del() only calls flush()
  • pyio_res_warn-2.patch
  • pyio_res_warn-3.patch
  • res_warn.py
  • iobase_finalizer.patch
  • bytesio_stringio.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-21.22:13:48.812>
    created_at = <Date 2013-11-29.10:37:01.896>
    labels = ['expert-IO', 'performance']
    title = "_pyio.BufferedReader and _pyio.TextIOWrapper destructor don't emit ResourceWarning if the file is not closed"
    updated_at = <Date 2021-09-21.22:13:48.811>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2021-09-21.22:13:48.811>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-21.22:13:48.812>
    closer = 'vstinner'
    components = ['IO']
    creation = <Date 2013-11-29.10:37:01.896>
    creator = 'vstinner'
    dependencies = []
    files = ['38134', '38136', '42272', '42273', '42274', '42291', '42292']
    hgrepos = []
    issue_num = 19829
    keywords = ['patch']
    message_count = 20.0
    messages = ['204714', '204729', '224129', '235925', '235943', '235947', '262344', '262345', '262347', '262348', '262350', '262351', '262362', '262438', '262439', '262440', '262454', '262470', '262487', '402385']
    nosy_count = 9.0
    nosy_names = ['pitrou', 'vstinner', 'benjamin.peterson', 'stutzbach', 'Arfrever', 'akira', 'martin.panter', 'piotr.dobrogost', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue19829'
    versions = ['Python 3.5', 'Python 3.6']

    @vstinner
    Copy link
    Member Author

    $ ./python
    Python 3.4.0b1 (default:acabd3f035fe, Nov 28 2013, 15:04:09) 
    [GCC 4.8.2 20131017 (Red Hat 4.8.2-1)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import _pyio
    >>> f=_pyio.open("/etc/issue"); f=None
    >>> f=_pyio.open("/etc/issue", "rb"); f=None
    >>> f=_pyio.open("/etc/issue", "rb", 0); f=None
    __main__:1: ResourceWarning: unclosed file <_io.FileIO name='/etc/issue' mode='rb'>
    >>> import io
    >>> f=io.open("/etc/issue"); f=None
    __main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/etc/issue' mode='r' encoding='UTF-8'>
    >>> f=io.open("/etc/issue", "rb"); f=None
    __main__:1: ResourceWarning: unclosed file <_io.BufferedReader name='/etc/issue'>
    >>> f=io.open("/etc/issue", "rb", 0); f=None
    __main__:1: ResourceWarning: unclosed file <_io.FileIO name='/etc/issue' mode='rb'>

    I expect the same behaviour when I use _pyio or io module.

    @serhiy-storchaka
    Copy link
    Member

    I think it will be good.

    @serhiy-storchaka serhiy-storchaka added topic-IO performance Performance or resource usage labels Nov 29, 2013
    @4kir4
    Copy link
    Mannequin

    4kir4 mannequin commented Jul 27, 2014

    Related bpo-21859 "Add Python implementation of FileIO"

    @vadmium
    Copy link
    Member

    vadmium commented Feb 13, 2015

    It looks like the C _io close() implementations delegate to the wrapped object’s undocumented _dealloc_warn() method to emit the warning if “self->finalizing” is set. For wrapped objects like BytesIO that do not have it, I guess the error due to the missing method is ignored.

    There is no Python implementation of FileIO yet (see bpo-21859), so the Python file wrapper classes still use the C FileIO implementation, which emits a resource warning. However for the wrapper classes, _pyio.IOBase.__del__() explicitly invokes the wrapper’s close() method, which will invoke FileIO.close() method, which bypasses the warning.

    Similarly, any object inheriting from the C implmentation’s _io.IOBase will also have close() explicitly invoked when being finalized, bypassing any warnings of the underlying wrapped object. The SocketIO instances returned by socket.makefile() are an example of this.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 14, 2015

    One option would be for any wrapper-type class (e.g. BufferedReader, SocketIO) to override __del__(). Instead of calling close(), it should call detach() or equivalent, and delete the returned reference to the underlying wrapped object without explicitly closing it, to allow a warning to be generated. Maybe a WrappedIO mixin class could help with this.

    I am posting del-detach.patch which roughly implements the above option. It only seems to break the test_override_destructor() tests for the TextIOWrapperTest and CommonBufferedTests classes, which expect close() to be called.

    Another more radical option might be to make IOBase.__del__() only call flush(), not close(), except for the actual implementations that really should emit a warning, like FileIO. I will have a go at doing this option in another patch.

    Both these options would change the behaviour in some cases that rely on the garbage collector to close objects. They would defer the automatic closing until all references of the underlying object are garbage collected, instead of just when the wrapped object is garbage collected. A third option would be to add explicit __del__() methods emitting a warning in each class, along with any exception supression needed due to Python interpreter finalization quirks.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 14, 2015

    Posting del-flush.patch, which only calls flush() instead of close() from __del__(), except for the FileIO class. Quick analysis of resulting test failures:

    These tests fail because they are obviously written to test that __del__ calls close():

    • IOTest.test_IOBase_destructor
    • IOTest.test_RawIOBase_destructor
    • IOTest.test_BufferedIOBase_destructor
    • IOTest.test_TextIOBase_destructor
    • test_override_destructor as mentioned above

    CBufferedReader/Writer/RandomTest.test_garbage_collection() fail because they do not free the reference to the wrapped raw file object. Similarly, CTextIOWrapperTest.test_garbage_collection() holds references to both levels of wrapped object.

    TextIOWrapperTest.test_destructor() fails because it assumes the wrapped close() will be called. The data is still written to the wrapped file though.

    MiscIOTest.test_warn_on_dealloc/_fd() fail because the warning message only refers to the FileIO object rather than the wrapper object.

    @vstinner
    Copy link
    Member Author

    Attached patch modifies _pyio to mimick better the reference io module:

    • Add IOBase._finalizing
    • IOBase.__del__() sets _finalizing to True
    • Add FileIO._dealloc_warn()
    • Remove FileIO.__del__()
    • FileIO.close(), _BufferedIOMixin.close() and TextIOWrapper.close() now calls _dealloc_warn() if _finalizing is true
    • Override closed() method in FileIO: needed because FileIO.close() now calls super().close() before logging the ResourceWarning, and the file is expected to see open in the warning
    • FileIO.close() now calls super().close() *before* closing the file descriptor

    I added hasattr(.., '_dealloc_warn') sanity checks because _dealloc_warn() are added to concrete classes, not to base classes.

    @vstinner
    Copy link
    Member Author

    (oops, there was a typo in my first patch)

    @vstinner
    Copy link
    Member Author

    Oh, the io module is more complex than what I expected :-) test_io fails with pyio_res_warn-2.patch.

    test_io now pass with pyio_res_warn-3.patch. I fixed FileIO.close() to always call super().close(), and set self._fd to -1 even if logging the ResourceWarning raised an exception.

    @vstinner
    Copy link
    Member Author

    Try attached res_warn.py script to test manually all ResourceWarning warnings.

    Ouput of python3.6 -Wd res_warn.py:
    --------------------------
    text
    <_pyio.TextIOWrapper name='/etc/issue' mode='r' encoding='UTF-8'>
    res_warn.py:9: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f = None

    buffered
    <_pyio.BufferedReader name='/etc/issue'>
    res_warn.py:15: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None

    raw
    <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    res_warn.py:21: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None

    fileio
    <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    res_warn.py:27: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None

    --------------------------

    Ouput of python3.6 -X tracemalloc=2 -Wd res_warn.py:
    --------------------------
    text
    <_pyio.TextIOWrapper name='/etc/issue' mode='r' encoding='UTF-8'>
    res_warn.py:9: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f = None
    Object allocated at (most recent call first):
    File "/home/haypo/prog/python/default/Lib/_pyio.py", lineno 209
    closefd, opener=opener)
    File "res_warn.py", lineno 7
    f = _pyio.open(fn, "r")

    buffered
    <_pyio.BufferedReader name='/etc/issue'>
    res_warn.py:15: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None
    Object allocated at (most recent call first):
    File "/home/haypo/prog/python/default/Lib/_pyio.py", lineno 209
    closefd, opener=opener)
    File "res_warn.py", lineno 13
    f = _pyio.open(fn, "rb")

    raw
    <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    res_warn.py:21: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None
    Object allocated at (most recent call first):
    File "/home/haypo/prog/python/default/Lib/_pyio.py", lineno 209
    closefd, opener=opener)
    File "res_warn.py", lineno 19
    f = _pyio.open(fn, "rb", 0)

    fileio
    <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    res_warn.py:27: ResourceWarning: unclosed file <_pyio.FileIO name='/etc/issue' mode='rb' closefd=True>
    f=None
    Object allocated at (most recent call first):
    File "res_warn.py", lineno 25
    f = _pyio.FileIO(fn, "rb")

    --------------------------

    @vstinner
    Copy link
    Member Author

    del-detach.patch: I don't understand the change in _pyio.py. I think that the change on socket.py is no more need since the change 46329eec5515 (issue bpo-26590).

    del-flush.patch:

    • change on IOBase.__del__() change the behaviour compared to the io module, I don't think that it's correct. I may also break backward compatibility :-/
    • change on fileio.c implements a finalizer for FileIO: io.IOBase already implements a finalizer which calls close(), I'm not sure that this change is needed?
    • change on iobase.c removes io.IOBase finalizer. I don't understand this. It breaks ResourceWarning on io.Buffered* and io.TextIOWrapper classes, no?

    Well, I guess that these two patches were written before Serhiy reimplemented the FileIO in pure Python, and this change outdated your two patches.

    @serhiy-storchaka
    Copy link
    Member

    I don't like _dealloc_warn() at all. It looks as a dirty hack and doesn't work with custom file-like objects.

    I think we should use one of Martin's option. If there are any issues with garbage collecting, we should fix the garbage collector.

    @vstinner
    Copy link
    Member Author

    I don't like _dealloc_warn() at all. It looks as a dirty hack and doesn't work with custom file-like objects.

    I tried to support emitting a ResourceWarning when the close() method is overriden in a subclass, but then I saw that io doesn't support this use case neither:
    ---

    import io
    import os
    
    class MyFile(io.FileIO):
        def close(self):
            os.close(self.fileno())
    
    f = MyFile('/etc/issue')
    f = None

    => no warning

    Ok, now I have to read again Martin's patches since I didn't understand them at the first read :-)

    @vstinner
    Copy link
    Member Author

    I think we should use one of Martin's option. If there are any issues with garbage collecting, we should fix the garbage collector.

    Ok, here is a new simpler proposition: remove all _dealloc_warn() method, and simply emit the ResourceWarning in IOBase finalizer.

    With this change, all objects which inherit from io.IOBase (or _pyio.IOBase) now emits a ResourceWarning if they are not closed explicitly. You have to override the __del__() method to prevent this warning.

    A lot of new objects start to log ResourceWarning: io.SocketIO, _pyio.TextIOWrapper, io.BytesIO, http.client.HTTPResponse, etc.

    For io.BytesIO, I fixed the code to inherit correctly IOBase finalizer.

    --

    A lot of tests start to fail because they emit a lot of ResourceWarning warnings. I don't know yet if it's a feature or a bug :-)

    --

    With the patch, Python starts to logs warnings like:

    sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
    sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>

    You can use my patch of the issue bpo-26642 to fix these warnings.

    ResourceWarning is emited even if FileIO.closefd is false.

    Note: It looks like the ResourceWarning is not always displayed. Yet another bug?

    @vstinner
    Copy link
    Member Author

    For io.BytesIO, I fixed the code to inherit correctly IOBase finalizer.

    Oh, I forgot to include it in my patch. But it's maybe better to have it in a different patch: see attached bytesio_stringio.patch which changes also io.StringIO to inherit the IOBase finalizer.

    Emitting a ResourceWarning in io.BytesIO and io.StringIO adds a *lot* of warnings when running the Python test suite. Since these objects only contain memory, no very limited resource like file descriptors, it's probably better to *not* log ResourceWarning for them.

    @vstinner
    Copy link
    Member Author

    Ok, I now understand the problem better. They are two kinds of io objects:

    (1) object which directly or indirectly owns a limited resource like file descriptor => must emit a ResourceWarning
    (2) object which don't own a limited resource => no ResourceWarning must be logged

    Examples of (1): FileIO, BufferedReader(FileIO), TextIOWrapper(BufferedReader(FileIO))

    Examples of (2): BytesIO, BuffereadReader(BytesIO), TextIOWrapper(BuffereadReader(BytesIO))

    The tricky part is to decide if an object owns a limited resource or not. Currently, the io module uses the _dealloc_warn() trick. A close() method tries to call _dealloc_warn(), but it ignores any exception when calling _dealloc_warn(). BufferedReader calls raw._dealloc_warn(). TextIOWrapper calls buffer._dealloc_warn().

    For case (1), BufferedReader(FileIO).close() calls FileIO._dealloc_warn() => ResourceWarning is logged

    For case (2), BufferedReader(BytesIO).close() calls BytesIO._dealloc_warn() raises an AttributeError => no warning is logged

    Well, we can call this a hack, but it works :-) pyio_res_warn-3.patch implements the same hack in _pyio.

    @serhiy-storchaka
    Copy link
    Member

    1. What if only object which directly owns a limited resource will emit a ResourceWarning? This can break some tests, but may be these tests are too strong? Current GC may be better than when io classes were implemented.

    2. An object which indirectly owns a limited resource don't know about this if use only public API. It can known about this only if the wrapped object provides _dealloc_warn(). Thus _dealloc_warn() extends the API. If allows an object which indirectly owns a limited resource emit a resource warning, it may be worth to expose _dealloc_warn() or something equivalent as public API.

    @vadmium
    Copy link
    Member

    vadmium commented Mar 25, 2016

    In the long term, I prefer not calling close() in __del__(), like del-flush.patch, although I admit it is an API change and should probably be limited to 3.6+. If people want to improve things in 3.5, privately implementing _dealloc_warn() like Victor’s pyio_res_warn-3.patch seems the best option.

    Serhiy: why did you add 2.7 to this bug? For 2.7, I don’t think anything should be done. There is no ResourceWarning in 2.7.

    In 3.5, _dealloc_warn() could also be implemented in SocketIO (possibly also HTTPResponse, GzipFile, etc). But it should not become a public API, and I don’t think it is important to make this sort of change to 3.5 anyway.

    In 3.6, if we stopped __del__() from calling close() like del-flush.patch, we would be changing the documented behaviour: <https://docs.python.org/3.6/library/io.html#io.IOBase.\_\_del__\> says “IOBase . . . calls the instance’s close() method.” But the end result seems cleaner to me. I think changing or adding an API (del, _dealloc_warn, etc) is necessary for a general solution to the problem.

    The effect of del-flush.patch will be that a wrapped FileIO or similar object will not be closed until all references are deleted.

    >>> file = open(os.devnull, "wb", 0)
    >>> print(file)
    <_io.FileIO name='/dev/null' mode='wb' closefd=True>
    >>> wrapper = BufferedWriter(file)

    In 3.5, deleting the wrapper produces a warning and closes the underlying file:

    >>> del wrapper
    __main__:1: ResourceWarning: unclosed file <_io.BufferedWriter name='/dev/null'>
    >>> print(file)
    <_io.FileIO [closed]>
    >>> file.write(b"more data")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: I/O operation on closed file

    I propose that in 3.6, deleting a wrapper should not automatically close or warn about the wrapped object until all references are deleted:

    >>> del wrapper  # No problem; we still have a reference to "file"
    >>> file.write(b"more data")  # File is still open
    9
    >>> del file  # Finally closes the file and triggers the warning
    ResourceWarning: unclosed file <FileIO>

    @serhiy-storchaka
    Copy link
    Member

    Serhiy: why did you add 2.7 to this bug? For 2.7, I don’t think anything should be done. There is no ResourceWarning in 2.7.

    Just for the case.

    Not calling close() in __del__() is one option, and it looks attractive. But there are possible hidden catches. There is no guarantee that flush() writes all buffered data, some stateful encoder or compressor can left some data in the buffer unless explicitly closed. close() can write required footer or close connection. If not call close() in __del__(), GzipFile will produce incorrect file.

    May be this is appropriate. This is why explicit close() should be called or context manager should be used.

    Other option is to use some API through all closable objects. _dealloc_warn() is not the only option.

    1. Add an optional parameter to close() methods to denote that it is called from __del__. This is backward incompatible option, it is too late to do this.

    2. __del__() calls special method instead of close() if exists.

    3. The _dealloc_warn() option (the name should be changed). The wrapper delegates emitting a warning to wrapped object by calling the _dealloc_warn() method if exists. Unlike to previous option this method is called in addition to close(), not instead.

    4. The wrapper checks special attribute (or calls a method) of wrapped object. If it exists and is true, the wrapper emits a warning.

    @vstinner
    Copy link
    Member Author

    While it would be great to also emit ResourceWarning, in the last 8 years, nobody managed to implement the feature. Also, this issue has no activity. I prefer to close the issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage topic-IO
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants