
Built-in module _io can lose data from buffered files in reference cycles #62052

Closed

arigo opened this issue Apr 27, 2013 · 75 comments

Labels: 3.7 (EOL), topic-IO, type-bug

@arigo commented Apr 27, 2013

BPO 17852
Nosy @tim-one, @nascheme, @pitrou, @vstinner, @njsmith, @methane, @vadmium, @serhiy-storchaka, @jstasiak, @xgid, @nitishch, @Arusekk, @miss-islington, @Volker-Weissmann
PRs
  • bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit. #1908
  • Revert "bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit." #3337
  • bpo-17852: Maintain a list of BufferedWriter objects. Flush them on exit. #3372
  • bpo-17852: Revert incorrect fix based on misunderstanding of _Py_PyAtExit() semantics #4826
  • bpo-17852: Fix PR #3372, flush BufferedWriter objects on exit. #4847
  • bpo-17852: Fixed the documentation about closing files #23135
  • [3.9] bpo-17852: Doc: Fix the tutorial about closing files (GH-23135) #23527
  • [3.8] bpo-17852: Doc: Fix the tutorial about closing files (GH-23135) #23528
Files
  • y.py
  • gcio.py
  • pyio.diff
  • buffer_crash.py
  • buffer_not_flushed.py
  • buffer_register_flush.txt
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.



    In Python 2, a buffered file opened for writing is flushed by the C library when the process exits. In Python 3, the _pyio and _io modules don't do this reliably. They rely on __del__ being called, which is not necessarily the case. The attached example ends with the file 'foo' empty (tested with Python 3.3 and the current trunk). The same example in Python 2 using the built-in open() cannot end with 'foo' empty.
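    The attachment y.py is not reproduced on this page. A minimal sketch of the kind of script being described (an assumption, not the actual attachment): a write-mode file caught in a reference cycle, whose flush therefore depends entirely on __del__ running at shutdown.

        # sketch, assuming the cycle is what keeps the file alive at shutdown
        class A:
            pass

        a = A()
        a.self = a            # reference cycle: a -> a
        f = open('foo', 'w')
        f.write('bar')        # 'bar' sits in the write buffer; nothing flushed yet
        a.f = f               # the cycle now keeps f alive
        # no f.close(): whether 'foo' ends up containing 'bar' depends on
        # whether the interpreter manages to finalize f at shutdown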

    @arigo arigo mannequin added topic-IO type-bug An unexpected behavior, bug, or error labels Apr 27, 2013
    @pitrou commented Apr 27, 2013

    There are actually several issues here:

    • collection of globals at shutdown is wonky: you should add an explicit "del a,f; gc.collect()" at the end of the script

    • order of tp_clear calls, or another issue with TextIOWrapper: if you open the file in binary mode ("f = open('...', 'wb'); f.write(b'bar')"), the data gets flushed during the GC run
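    Applied to a script like the attached one (a sketch; a and f are assumed to be the script's module-level names), the first point amounts to ending the script with:

        import gc

        # ... script body that creates the file f and the cycle through a ...

        del a, f        # drop the module-level references explicitly
        gc.collect()    # break the cycle while the interpreter is still fully alive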

    @neologix commented Apr 27, 2013

    Code relying on a garbage-collection/shutdown hook to flush files is borked:

    • it has never been guaranteed by the Python specification (neither do Java/C#...)
    • even with an implementation based on C stdio streams, streams won't get flushed in case of _exit()/abort()/an asynchronous signal

    Furthermore, I think it's inherently unsafe: in a multi-threaded program, flushing requires locking, and doing this at shutdown is a recipe for deadlock/arbitrary delay (imagine you have a daemon thread currently holding the lock, e.g. for stdout).

    In short, that's an application bug to me.

    @arigo commented Apr 27, 2013

    It used to be a consistently reliable behavior in Python 2 (and we made it so in PyPy too), provided of course that the process exits normally; but it no longer is in Python 3. Well I can see the reasons for not flushing files, if it's clearly documented somewhere as a change of behavior from Python 2.

    However I'm complaining about the current behavior: files are flushed *most of the time*. That's a behavior that is clearly misleading, or so I would think. I'm rather sure that there are many small scripts and large programs out there relying on automatic flushing, and then one day they'll hit a case where the file is not flushed and get the worst kind of error: a file unexpectedly truncated at 99% of its length, in a way that cannot be reproduced by small examples.

    Feel free to close anyway as not-a-bug; I won't fight the Python 3 behavior, because Python 2 works as expected.

    @neologix commented Apr 28, 2013

    > It used to be a consistently reliable behavior in Python 2 (and we made it so in PyPy too), provided of course that the process exits normally; but it no longer is in Python 3. Well I can see the reasons for not flushing files, if it's clearly documented somewhere as a change of behavior from Python 2.

    When you say Python 2, I assume you mean CPython 2, right? Because - AFAICT - files got flushed only by accident, not by design. For example, I suspect that Jython doesn't flush files on exit (since the JVM doesn't), and I guess IronPython doesn't either.

    > However I'm complaining about the current behavior: files are flushed *most of the time*.

    That's the problem with implementation-defined behavior ;-)

    @vstinner commented:

    "In Python 2, a buffered file opened for writing is flushed by the C library when the process exit."

    "When you say Python 2, I assume you mean CPython 2, right?
    Because - AFAICT - files got flushed only by accident, not by design."

    It looks to be a feature of the standard C library, at least the GNU libc. Its libio library installs an exit handler flushing all open files. You can see it if you set a breaking on write() using gdb:

    Breakpoint 5, write () at ../sysdeps/unix/syscall-template.S:82
    82 ../sysdeps/unix/syscall-template.S: Aucun fichier ou dossier de ce type.

    (gdb) where
    #0 write () at ../sysdeps/unix/syscall-template.S:82
    #1 0xb7e33ba5 in _IO_new_file_write (f=0x8254ef0, data=0x81f9798, n=3) at fileops.c:1289
    #2 0xb7e33a84 in new_do_write (fp=0x8254ef0, data=0x81f9798 "bar\267\300\207\366\267\220\227\037\b\220\227\037\b\004d", to_do=3) at fileops.c:543
    #3 0xb7e350fe in _IO_new_do_write (fp=0x8254ef0, data=0x81f9798 "bar\267\300\207\366\267\220\227\037\b\220\227\037\b\004d", to_do=3) at fileops.c:516
    #4 0xb7e354b5 in _IO_new_file_overflow (f=0x8254ef0, ch=-1) at fileops.c:894
    #5 0xb7e36c40 in _IO_flush_all_lockp (do_lock=0) at genops.c:849
    #6 0xb7e36d8e in _IO_cleanup () at genops.c:1010
    #7 0xb7df5f92 in __run_exit_handlers (status=0, listp=0xb7f683e4, run_list_atexit=true) at exit.c:91
    #8 0xb7df5fdd in __GI_exit (status=0) at exit.c:100
    #9 0xb7ddc4db in __libc_start_main (main=0x8058f90 <main>, argc=2, ubp_av=0xbffff1b4, init=0x8156960 <__libc_csu_init>, fini=0x81569d0 <__libc_csu_fini>, rtld_fini=0xb7fed280 <_dl_fini>, stack_end=0xbffff1ac)
    at libc-start.c:258
    #10 0x08058fd1 in _start ()

    Source of _IO_flush_all_lockp() in the GNU libc:

    http://sourceware.org/git/?p=glibc.git;a=blob;f=libio/genops.c;h=390d8d24b5fb04f6a35b8fec27e700b9a9d641d4;hb=HEAD#l816

    So the glibc maintains a list of open "FILE*" files, _IO_list_all, which is protected by the list_all_lock lock.

    @neologix commented Apr 29, 2013

    "When you say Python 2, I assume you mean CPython 2, right?
    Because - AFAICT - files got flushed only by accident, not by design."

    It looks to be a feature of the standard C library, at least the GNU libc. Its libio library installs an exit handler flushing all open files. You can see it if you set a breaking on write() using gdb:

    Yes, it's guaranteed by POSIX/ANSI (see man exit).
    I was refering to the fact that the automatic flushing of files upon
    exit is a mere side effect of the implementation based atop stdio
    stream in cpython 2. It's no guaranteed by any Python spec (and I
    can't really think of any platform other than C that makes such
    guarantee).

    @vstinner commented:

    > It looks to be a feature of the standard C library, at least the GNU libc.
    > Yes, it's guaranteed by POSIX/ANSI (see man exit).

    Hum, POSIX (2004) is not so strict:
    "Whether open streams are flushed or closed, or temporary files are
    removed is implementation-defined."
    http://pubs.opengroup.org/onlinepubs/009695399/functions/exit.html


    @neologix commented Apr 30, 2013

    > Hum, POSIX (2004) is not so strict:
    > "Whether open streams are flushed or closed, or temporary files are
    > removed is implementation-defined."
    > http://pubs.opengroup.org/onlinepubs/009695399/functions/exit.html

    You're looking at the wrong section:

    "The exit() function shall then flush all open streams with unwritten
    buffered data, close all open streams"

    Then, a couple of lines below:

    "The _Exit() [CX] and _exit() functions shall not call functions
    registered with atexit() nor any registered signal handlers. Whether
    open streams are flushed or closed, or temporary files are removed is
    implementation-defined."

    It's guaranteed for exit(), not _Exit()/_exit().

    @vstinner commented:

    > It's guaranteed for exit(), not _Exit()/_exit().

    Oops ok, thanks.


    @serhiy-storchaka commented:

    In 3.4+ the example script always writes the string "bar" to the file "foo". Tested by running it in a loop hundreds of times. Cleaning up at shutdown was enhanced in 3.4.

    @vstinner commented:

    Is anyone interested in working on the "maintain a list of open file objects" idea?

    I consider that Python 3 does its best to flush data at exit, but it's not good practice to rely on destructors to flush data. I mean, there is no guarantee that files will be written in the correct order and that files will stay consistent. It's much safer to call the close() method explicitly.

    By the way, Python 3 now emits ResourceWarning warnings when files are destroyed without being explicitly closed. I enhanced the code handling warnings during Python shutdown to show these warnings in more use cases, but it is still not perfect. I opened the issue bpo-21788 as a reminder that the code can still be enhanced.

    For the "maintain a list of objects" idea, I expect race conditions with threads, signals and processes (fork). We already have known issues with locks + fork. I expect even more issues if we maintain a list of open files.

    I propose to close the issue as "wontfix". Python 3 does its best, but please call the close() method explicitly!

    Maybe the undefined behaviour should be documented (with a warning?) in the buffered writer of the io module, and maybe also in the open() function?

    @serhiy-storchaka commented:

    I'm +1 on closing this. Agree with Charles-François that it's never been guaranteed by the Python specification. The Pythonic way to work with files is to use the "with" statement or, if you need a long-lived file stream, to carefully close the file in a "finally" block or in the __exit__ method of some class used as a context manager. Non-observance of this rule can be considered a bug (see bpo-22831).
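    For reference, the two patterns described above (plain idiomatic Python, nothing specific to this bug):

        # Preferred: the context manager flushes and closes on every exit path.
        with open('foo', 'w') as f:
            f.write('bar')

        # For a long-lived stream: close explicitly in a finally block.
        f = open('foo', 'w')
        try:
            f.write('bar')
        finally:
            f.close()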

    @arigo commented Nov 20, 2014

    Victor: there is the GIL, you don't need any locking.

    @nikratio commented Dec 4, 2014

    Serhiy, I believe this still happens in Python 3.4, but it is harder to reproduce. I couldn't get Armin's script to produce the problem either, but I'm pretty sure that this is what causes e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=771452#60.

    @neologix commented Dec 4, 2014

    > Serhiy, I believe this still happens in Python 3.4, but it is harder to reproduce. I couldn't get Armin's script to produce the problem either, but I'm pretty sure that this is what causes e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=771452#60.

    Maybe, as it's been said several times, it's an application bug: files should be closed explicitly (or better, with a context manager, of course).

    @arigo commented Dec 4, 2014

    I hate to repeat myself, but if the C standard says "files are flushed at exit"; if Python 2 follows this standard; and if Python 3 follows it most of the time but *not always*... then it seems to me that something is very, very buggy in the worst possible way (rare data loss).

    @serhiy-storchaka commented:

    Do you have an example that works in 3.4+?

    @arigo commented Dec 4, 2014

    If I understood correctly, Python 3.4 tries harder to find cycles and call destructors at the end of the program, but that's not a full guarantee. For example you can have a reference from a random C extension module.

    While trying to come up with an example, I found one that I don't fully understand, but the point is that it shows how easy it is to defeat it:

        import sys
        f = open('foo.txt', 'w')
        f.write('abc')
        def func(*args):
            return func
        sys.settrace(func)

    @arigo commented Dec 4, 2014

    (Ah, it's probably a reference from the trace function -> func_globals -> f).

    @pitrou commented Dec 4, 2014

    Here is an example.

    @pitrou commented Dec 4, 2014

    Note the example I posted doesn't involve the shutdown sequence. It calls gc.collect() explicitly.
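    The attachment itself is not shown on this page; a sketch of the kind of example being described (an assumption, not Antoine's actual script):

        import gc

        f = open('foo', 'w')     # TextIOWrapper -> BufferedWriter -> FileIO
        f.write('barXXX ')       # data sits in the Python-level buffers
        f.cycle = f              # put the TextIOWrapper into a reference cycle
        del f
        gc.collect()             # all three objects become garbage together;
                                 # their finalization order is arbitrary, so the
                                 # buffered data may or may not reach 'foo'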

    @arigo commented Dec 4, 2014

    To add to the confusion: Antoine's example produces an empty file on the current trunk cd282dd0cfe8. When I first tried it on a slightly older trunk (157 changesets ago), it correctly emitted a file with "barXXX ", but only if "gc.collect()" was present. Without "gc.collect()" it didn't. This seems at odds with the explanation that "gc.collect()" should be what occurs at shutdown now.

    @pitrou commented Dec 4, 2014

    The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle (same thing, of course, with tp_clear). So depending on the exact layout of the garbage list, the TextIOWrapper could be collected before the BufferedWriter and the FileIO (good), or after (bad).

    I don't see an easy way to solve this. Either _io should provide hints to the GC (which kind of hints?), or the tp_finalize should be customized to somehow call a dependent wrapper's tp_finalize (how? by keeping a weakref?).

    @arigo commented Dec 4, 2014

    Maybe we should accept the fact that relying on finalizers is a bad idea here?

    @pitrou commented Dec 4, 2014

    You mean encourage people to explicitly use "with" or call close()? Yes, that's the simplest answer. It's why we added ResourceWarning.

    @vstinner commented Dec 4, 2014

    > The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle

    If we want to guarantee that all files are properly flushed at exit, the best option is to maintain a list of open files.

    I'm not interested in implementing that; it sounds too complex. For example, open("text.txt", "w") creates 3 objects: FileIO, BufferedWriter, TextIOWrapper. You have to register these 3 objects and flush them in the right order. Python is dynamic; you are free to modify objects, for example using the detach() method or by modifying the buffer attribute of the TextIOWrapper.

    How do you decide which object should be flushed? In which order? We have to take care of signals, threads and forks, stay portable, etc.
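    The three-layer structure is easy to see from Python code (these are the standard io attributes, nothing assumed here):

        f = open('text.txt', 'w')
        print(type(f))             # <class '_io.TextIOWrapper'>
        print(type(f.buffer))      # <class '_io.BufferedWriter'>
        print(type(f.buffer.raw))  # <class '_io.FileIO'>

        buf = f.detach()           # f is now unusable; buf is the BufferedWriter
        buf.close()                # flushes buf and closes the underlying FileIO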

    @pitrou commented Dec 4, 2014

    On 04/12/2014 12:05, STINNER Victor wrote:

    >> The problem is that the order of tp_finalize calls is arbitrary when there is a reference cycle
    >
    > If we want to guarantee that all files are properly flushed at exit, the best option is to maintain a list of open files.

    I am not talking about flushing files at exit.

    @pitrou pitrou reopened this Dec 12, 2017
    @vstinner commented:

    New changeset 317def9 by Victor Stinner (Antoine Pitrou) in branch 'master':
    bpo-17852: Revert incorrect fix based on misunderstanding of _Py_PyAtExit() semantics (#4826)
    317def9

    @nascheme commented:

    I created a new PR which uses the atexit module instead of using _Py_PyAtExit. I think registering in io.py is okay.

    I see that atexit is now implemented in C. Rather than registering in io.py, we could create a C API to register callbacks (i.e. atexit_register). That would perhaps be a bit cleaner.

    @nascheme commented:

    Attached is a script that triggers the non-flushing behaviour for me. I don't think it is reliable, since it depends on the order in which the FileIO and BufferedWriter are finalized when the GC finds them in a reference cycle.

    BTW, it is arguable that the root cause of this bug is that we no longer do topological ordering when calling finalizers. Originally, the cyclic GC would refuse to call __del__ methods, because once objects are part of a cycle there is no defined topological ordering. So, we linked that garbage to gc.garbage rather than calling __del__ and getting potentially undefined results. Now the GC has been changed to call __del__ anyhow. I haven't studied how this was changed, but this non-flushing bug is a result. Having the buffer added to gc.garbage would also result in the data not being flushed, but arguably it would be more understandable what's going on. I'm not arguing that we should go back to that, just that the current behaviour can be subtle and confusing.

    @pitrou commented Dec 14, 2017

    The reason Python 2 did well here is simply that Python 2 had a single Python object (the file object) for each actual file. Python 3 has several of them (the raw IO object, the buffered IO object, possibly the text IO wrapper), and so suddenly the finalization order matters.

    @nascheme commented:

    In the process of trying to write a test for this, I now realize that PR 4847 is not really a fix. If the underlying file gets closed during an earlier gc.collect() and not during shutdown, the extra flushing step is not going to help. So, using atexit is not really a fix. Maybe there is something else we can do, need to think about it more.

    @serhiy-storchaka commented:

    I'm not well versed in the GC, but if it were possible to call destructors in reverse chronological order, this could help with this issue.

    @nascheme commented:

    Using reverse chronological order would probably work in 99% of cases, but then you would have that rare case where it didn't. So I'm not too keen on that approach.

    I think this is a classic problem with finalization and GC; I probably should do some actual reading of the literature. One idea that occurs to me is that the underlying file object could keep a weakref to the higher-level (e.g. buffer) objects. Then, if the underlying file gets closed, it could make sure to call flush on the higher-level objects first. The weakref would ensure that a reference cycle is not created.
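    A rough pure-Python sketch of that idea (hypothetical names; the real change would be in _io's FileIO):

        import weakref

        class RawFile:
            """Stands in for FileIO; flushes dependent buffers before closing."""

            def __init__(self):
                # weak references only, so no reference cycle is created
                self._dependents = weakref.WeakSet()

            def register(self, buffer):
                self._dependents.add(buffer)

            def close(self):
                for buf in list(self._dependents):
                    buf.flush()   # push buffered data down before the fd goes away
                # ... then really close the file descriptor ...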

    @nikratio commented Dec 15, 2017

    Given the apparent difficulties getting this right, how about not attempting to implicitly flush buffers in the finalizer at all? This means scripts relying on this will break, but in contrast to the current behavior they will break consistently so it's easy to find and fix the problem.

    @nascheme commented:

    Welp, another day, another attempt. As mentioned in PR 4847, atexit is not the answer. If the raw/buffered file pair is part of a reference cycle and the GC cleans it up before atexit runs, the buffered data can get lost.

    I attempted to implement my weakref idea (i.e. raw file keeps a weakref to the buffered file, calls flush before the raw file gets closed). That doesn't work either because the GC clears the weakref before calling __del__.

    The attached patch "buffer_register_flush.txt" does seem to work. The downside is that it creates a reference cycle between the raw and buffered file objects. Perhaps that is not a problem since unless you call close() on the raw file, you will be leaking resources anyhow. In the patch, calling close() removes the reference cycle.

    I still feel like this is worth fixing, as ugly as the implementation is.

    @nascheme nascheme self-assigned this Dec 19, 2017
    @pitrou commented Dec 19, 2017

    I think it would be quite disruptive to create a reference cycle each time open() is called. It may also break user scripts.

    @nascheme commented:

    Yeah, I think you are correct. Currently files not part of reference cycles get properly flushed and closed by the reference counter. Implementing my "buffer_register_flush" patch would cause files to be closed only by the cyclic garbage collector (if not explicitly closed). That would mean a script that opens a large number of files could run out of file descriptors. Basically, we lose one of the main advantages of reference counting.

    Probably the GC should keep track of how many files are open and call collect() when a threshold is reached. Still, forcing every file to be collected only by the cyclic GC seems too ugly of a solution to this issue.

    @pitrou commented Dec 20, 2017

    Here's another idea: have a new type field (a tp_something) that exposes the GC priority for this type. It could be an integer between -8 and 7, for example (0 being the default). Then tp_finalize would be called in sorted priority order.

    The tricky part of course is to make this acceptable performance-wise (i.e. avoid walking the collectable list 16 times, once for each priority level). For example by creating bucket lists for each priority level (only for objects with a non-NULL tp_finalize).

    @Arusekk commented Jan 24, 2018

    Since the issue seems to have been active lately, may I suggest my view on solving it.

    One solution that comes to my mind is to keep a weak reference to the file, and to register an atexit function per file (and to unregister it when the file is closed). Example concept-illustrating python code for the _pyio module:

    import atexit, operator, weakref

    # ...

    class TextIOWrapper(TextIOBase):
        def __init__(self):
            # ...
            self._weakclose = operator.methodcaller('close')
            atexit.register(self._weakclose, weakref.proxy(self))

        # ...

        def close(self):
            atexit.unregister(self._weakclose)
            # and now actually close the file

    There is a possibility of a negative impact arising from the use of operator.methodcaller, because it may return something cached in future versions of python. There is also the issue of unregistering an atexit function during atexit processing. But the general idea seems to be simple and worth considering.

    @nascheme commented:

    Using atexit is not the solution because the data can be lost even while the program is running, not just at interpreter shutdown. The problem occurs if the buffered file object and the underlying file object are both part of a reference cycle. Then, when the cycle is destroyed by the GC module, if the file __del__ is called before the buffer __del__, data is lost.

    So far, the best solution I've come up with is as follows: split the buffered file object into two objects, a wrapper object that will be returned from open() and an underlying object that holds the actual buffer. Make the underlying file object keep references to all the buffers that are using the file. When the file's __del__ gets called, first flush all of the buffers and then close the file. Splitting the buffered file object is required so that we don't create reference cycles. A sketch follows.
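    A pure-Python sketch of that split (hypothetical class names, for illustration only):

        class _Buffer:
            """Holds the actual buffered bytes; never handed to the user."""
            def __init__(self):
                self.data = bytearray()

        class RawFile:
            def __init__(self):
                self._buffers = []           # strong refs go RawFile -> _Buffer only

            def attach(self, buffer):
                self._buffers.append(buffer)

            def __del__(self):
                for buf in self._buffers:    # flush every buffer before the fd dies
                    self._write_out(buf)
                # ... then close the descriptor ...

            def _write_out(self, buf):
                ...                          # write out buf.data, then clear it

        class BufferedWriter:
            """The wrapper open() would return. RawFile never references it,
            so no reference cycle is created."""
            def __init__(self, raw):
                self._raw = raw
                self._buf = _Buffer()
                raw.attach(self._buf)

            def write(self, b):
                self._buf.data += b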

    @njsmith commented Jan 25, 2018

    > I attempted to implement my weakref idea (i.e. raw file keeps a weakref to the buffered file, calls flush before the raw file gets closed). That doesn't work either because the GC clears the weakref before calling __del__.

    This may be a bit of a left-field or too-big-a-hammer suggestion, but as far as I can tell from this thread [1] it probably is technically possible to modify the GC to clear weakrefs after calling __del__. Nick wasn't a fan (he likes the invariant that weakrefs can't trigger a resurrection), but if all else fails it might be worth re-raising.

    You could add a secondary reference count on FileIO recording how many BufferedIO wrappers there are around it; then its __del__ would skip calling close() if there are still BufferedIO wrappers, and BufferedIO.__del__ would decrement the reference count and close the underlying file if it hits zero and FileIO.__del__ has already been called.

    [1] https://mail.python.org/pipermail/python-dev/2016-October/146747.html
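    A pure-Python sketch of the secondary-refcount idea (hypothetical classes; the real change would be in the C implementations of FileIO and the buffered objects):

        class Raw:
            def __init__(self):
                self._wrappers = 0           # secondary count of live wrappers
                self._del_called = False

            def _close_fd(self):
                ...                          # really close the descriptor

            def __del__(self):
                self._del_called = True
                if self._wrappers == 0:      # no wrapper left that might flush
                    self._close_fd()

        class Buffered:
            def __init__(self, raw):
                self.raw = raw
                raw._wrappers += 1

            def flush(self):
                ...                          # write buffered bytes to self.raw

            def __del__(self):
                self.flush()
                self.raw._wrappers -= 1
                if self.raw._wrappers == 0 and self.raw._del_called:
                    self.raw._close_fd()     # last wrapper finishes the close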

    @pppery pppery mannequin changed the title Built-in module _io can lose data from buffered files at exit Built-in module _io can lose data from buffered files in reference cycles Jan 27, 2018
    @vstinner commented:

    I marked bpo-38548 as duplicate of this issue. Copy of msg355062:

    Consider the following program:

    f = open("out.txt", "w")
    f.write("abc\n")
    exit(0)

    Please note the absence of f.close().
    The documentation
    https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
    says that you should use f.close() or with f = open(), but is not clear whether the program above without f.close() is guaranteed to write. The tutorial says:
    "If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times."
    For me this sounds like even without f.close() the file is guaranteed to be written. If it is not guaranteed to be written, you should fix the documentation, if it is guaranteed to be written, then I will open another issue because the following program does not write into out.txt on my machine:

    from sympy.core import AtomicExpr
    class MyWeirdClass(AtomicExpr):
    	def __init__(self):
    		pass
    f = open("out.txt", "w")
    f.write("abc\n")
    exit(0)

    Note: sys.version is: "3.7.3 (default, Oct 7 2019, 12:56:13) \n[GCC 8.3.0]"

    @methane commented Nov 27, 2020

    New changeset c8aaf71 by Volker-Weissmann in branch 'master':
    bpo-17852: Doc: Fix the tutorial about closing files (GH-23135)
    c8aaf71

    @miss-islington commented:

    New changeset 01fcde8 by Miss Islington (bot) in branch '3.8':
    bpo-17852: Doc: Fix the tutorial about closing files (GH-23135)
    01fcde8

    @Volker-Weissmann commented Nov 27, 2020

    The PR below fixed it:

    #23135

    @miss-islington commented:

    New changeset 4a44f53 by Miss Islington (bot) in branch '3.9':
    [3.9] bpo-17852: Doc: Fix the tutorial about closing files (GH-23135) (GH-23527)
    4a44f53

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @gpshead commented May 26, 2023

    Is there still an issue to be fixed here? If so, re-open and explain, or open a new issue. The documentation was updated to highlight that unclosed files can't be guaranteed to flush buffered data.

    @nascheme commented:

    I think there is still an issue here, even though it is a low-priority bug. It's good that we now document that unclosed files could lose buffered data. However, it is bad, IMHO, that it works 99%+ of the time but sometimes doesn't. I think that's worth fixing, if we can do it without too much pain.

    I created GH-105171 as a different approach to fix this.

    @vstinner commented Jun 1, 2023

    Over the last few years, I made many enhancements to be able to log ResourceWarning as late as possible. But Python finalization is not deterministic, so some warnings cannot be logged when they are emitted "too late".

    My latest failed attempt to make finalization more deterministic: #86837

    @serhiy-storchaka commented Jun 6, 2023

    The root cause is the GC algorithm for breaking reference cycles. If we have a cycle:

    flowchart LR;
        A==>B;
        B==>C;
        C==>A;
        C-->TextIOWrapper;
        TextIOWrapper-->BufferedWriter;
        BufferedWriter-->FileIO;
    

    then it will try to break it by breaking any reference, including the references C-->TextIOWrapper, TextIOWrapper-->BufferedWriter and BufferedWriter-->FileIO, even though breaking them does not help break the cycle. The only references whose removal advances reference-cycle elimination are A-->B, B-->C and C-->A. If the algorithm were smarter about this, it would solve most of these problems, not only with IO.
