Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io.FileIO calls flush() after file closed #49950

Closed
brianquinlan opened this issue Apr 5, 2009 · 22 comments
Closed

io.FileIO calls flush() after file closed #49950

brianquinlan opened this issue Apr 5, 2009 · 22 comments
Assignees
Labels
stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error

Comments

@brianquinlan
Copy link
Contributor

BPO 5700
Nosy @terryjreedy, @brianquinlan, @pitrou, @vadmium, @serhiy-storchaka
Files
  • remove_flush.diff: Patch to remove the call to flush by _io._IOBase
  • cooperation.diff: Use cooperative inheritance
  • fileio_flush_closed.patch
  • fileio_flush_closed_2.patch
  • fileio_flush_closed-2.7.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2015-02-20.22:37:46.140>
    created_at = <Date 2009-04-05.15:35:55.732>
    labels = ['type-bug', 'library', 'expert-IO']
    title = 'io.FileIO calls flush() after file closed'
    updated_at = <Date 2015-02-22.22:33:59.994>
    user = 'https://github.com/brianquinlan'

    bugs.python.org fields:

    activity = <Date 2015-02-22.22:33:59.994>
    actor = 'python-dev'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2015-02-20.22:37:46.140>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)', 'IO']
    creation = <Date 2009-04-05.15:35:55.732>
    creator = 'bquinlan'
    dependencies = []
    files = ['13620', '13677', '36889', '38147', '38148']
    hgrepos = []
    issue_num = 5700
    keywords = ['patch']
    message_count = 22.0
    messages = ['85521', '85616', '85890', '85894', '85895', '85900', '85901', '85902', '85913', '85914', '85915', '85917', '139205', '228234', '229143', '234880', '236039', '236040', '236074', '236337', '236339', '236417']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'bquinlan', 'pitrou', 'python-dev', 'martin.panter', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue5700'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @brianquinlan
    Copy link
    Contributor Author

    >>> import io
    >>> class MyIO(io.FileIO):
    ...     def flush(self):
    ...             print('closed:', self.closed)
    ...
    >>> f = MyIO('test.out', 'wb')
    >>> f.close()
    closed: True

    IMHO, calling flush() after the file has already been closed is
    incorrect behaviour. Search for "Possible py3k io wierdness" on
    python-dev for discussion.

    @brianquinlan brianquinlan added the stdlib Python modules in the Lib dir label Apr 5, 2009
    @brianquinlan
    Copy link
    Contributor Author

    Discussion about this bug is ongoing in python-dev.

    @brianquinlan
    Copy link
    Contributor Author

    cooperation.diff:

    • change the close method to call .flush() and then ._close()
    • only IOBase implements close() (though a subclass can override close
      without causing problems - so long as it calls super().close())
    • .flush() invokes super().flush()
    • ._close() invokes super()._close()
    • FileIO is implemented in Python in _pyio.py so that it can have the
      same base class as the other Python-implemented files classes
    • tests verify that .flush() is not called after the file is closed
    • tests verify that ._close()/.flush() calls move correctly are
      propagated correctly

    @pitrou
    Copy link
    Member

    pitrou commented Apr 12, 2009

    • FileIO is implemented in Python in _pyio.py so that it can have the
      same base class as the other Python-implemented files classes

    Is it really necessary (e.g. to pass the tests)?

    @brianquinlan
    Copy link
    Contributor Author

    > - FileIO is implemented in Python in _pyio.py so that it can have the
    > same base class as the other Python-implemented files classes

    Is it really necessary (e.g. to pass the tests)?

    It is necessary to make MI work. With out it the inheritance graph looks
    like this (using _pyio):

    io.IOBase _pyio.IOBase
    | |
    io.FileIO MyMixin
    | |
    \ /
    \ /
    \ /
    MyClass

    If you call MyClass.flush() with this hierarchy then it will propagate
    to io.IOBase. Since io.IOBase doesn't call super().flush() in its flush
    implementation (because it has no super class), flush propagation would
    stop and MyMixin.flush wouldn't be called.

    There are other ways you could solve this, of course, like getting
    io.IOBase to call super().flush and ignore any errors that it gets.

    But it seems like this is the right way to fix this problem anyway - as
    a user, I would expect isinstance(FileIO(...), IOBase) but that is not
    currently the case with _pyio.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 12, 2009

    It is necessary to make MI work. With out it the inheritance graph looks
    like this (using _pyio):

    io.IOBase _pyio.IOBase
    | |
    io.FileIO MyMixin
    | |
    \ /
    \ /
    \ /
    MyClass

    MyMixin doesn't need to inherit from IOBase, since precisely it is a
    mixin. It's just a piece of functionality that you drop into an already
    existing inheritance tree.

    But it seems like this is the right way to fix this problem anyway - as
    a user, I would expect isinstance(FileIO(...), IOBase) but that is not
    currently the case with _pyio.

    That's because the reference ABCs are defined in the io module, not in
    _pyio:

    True
    >>> issubclass(_pyio.FileIO, io.IOBase)
    True
    >>> issubclass(io.TextIOWrapper, io.IOBase)
    True
    >>> issubclass(_pyio.TextIOWrapper, io.IOBase)
    True

    _pyio.IOBase and friends are just concrete implementations of those
    ABCs:

    True
    >>> issubclass(_pyio.TextIOBase, io.TextIOBase)
    True

    I know it looks a bit non-intuitive at first, but the point is that the
    ABCs are unified, even there are two different implementations. I think
    it's better than having two different sets of ABCs depending on whether
    you import the pure-Python version or the C version.

    @brianquinlan
    Copy link
    Contributor Author

    Antoine Pitrou wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > It is necessary to make MI work. With out it the inheritance graph looks
    > like this (using _pyio):
    >
    >
    > io.IOBase _pyio.IOBase
    > | |
    > io.FileIO MyMixin
    > | |
    > \ /
    > \ /
    > \ /
    > MyClass

    MyMixin doesn't need to inherit from IOBase, since precisely it is a
    mixin. It's just a piece of functionality that you drop into an already
    existing inheritance tree.

    In my implementation, it does. Otherwise the linearization of the class
    hierarchy becomes:

    MyClass -> io.FileIO -> io.IOBase -> MyMixin (-> _pyio.IOBase)

    But io.IOBase doesn't make super calls so MyMixin method won't be called.

    I think that this linearization is probably more useful:

    MyClass -> io.FileIO -> MyMixin -> IOBase

    And you get that when io.FileIO and MyMixin share the same IOBase as a
    base class.

    I don't mind changing io.IOBase to make super calls in ._close() and
    .flush() and suppressing the attributes errors that will be generated if
    a mixin is not present - but it feels a bit untidy to me.

    [snipped]

    I know it looks a bit non-intuitive at first, but the point is that the
    ABCs are unified, even there are two different implementations. I think
    it's better than having two different sets of ABCs depending on whether
    you import the pure-Python version or the C version.

    I'm not trying to change the ABC unification at all - and if I did then
    there is a bug in my code. I just think that the immediate parent class
    of _pyio.FileIO should be _pyio.IOBase (just like _io.FileIO has
    _io.IOBase as an immediate parent). That will make the Python and C
    class hierarchies completely consistent within themselves.

    Cheers,
    Brian

    @pitrou
    Copy link
    Member

    pitrou commented Apr 12, 2009

    I think that this linearization is probably more useful:

    MyClass -> io.FileIO -> MyMixin -> IOBase

    But why not simply:

    MyClass -> MyMixin -> io.FileIO -> IOBase

    ?
    Is there something I'm missing that prevents you from doing this?

    I'm not trying to change the ABC unification at all - and if I did then
    there is a bug in my code. I just think that the immediate parent class
    of _pyio.FileIO should be _pyio.IOBase (just like _io.FileIO has
    _io.IOBase as an immediate parent). That will make the Python and C
    class hierarchies completely consistent within themselves.

    I understand, but that's at the price of an otherwise useless
    indirection layer, which will also make _pyio even slower that it
    already is :-)
    (I admit, however, that _pyio shouldn't be used in normal circumstances,
    so this is not a showstopper argument)

    @brianquinlan
    Copy link
    Contributor Author

    Antoine Pitrou wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > I think that this linearization is probably more useful:
    >
    > MyClass -> io.FileIO -> MyMixin -> IOBase

    But why not simply:

    MyClass -> MyMixin -> io.FileIO -> IOBase

    ?
    Is there something I'm missing that prevents you from doing this?

    No, doing this is trivial. But shouldn't it be up to the implementor of
    MyClass to decide whether MyMixin or io.FileIO methods are evaluated first?

    > I'm not trying to change the ABC unification at all - and if I did then
    > there is a bug in my code. I just think that the immediate parent class
    > of _pyio.FileIO should be _pyio.IOBase (just like _io.FileIO has
    > _io.IOBase as an immediate parent). That will make the Python and C
    > class hierarchies completely consistent within themselves.

    I understand, but that's at the price of an otherwise useless
    indirection layer, which will also make _pyio even slower that it
    already is :-)

    Maybe I misunderstand the purpose of _pyio. The Python 3.1 says that its
    purpose is for experimentation. For experimentation, having a Python
    implementation where you can add methods and change behavior (though
    perhaps not in as deep as way is if this class were completely written
    in Python) is useful. It is also useful for the behavior of the Python
    implementation to match that of the C implementation as closely as
    possible - this patch makes the inheritance graph for _pyio.FileIO more
    consistent.

    I, for example, want to add a sync() method to FileIO that will call
    fsync() on the file's file descriptor. With this change, I have a place
    to plug in that change in Python and I can write the C implementation
    when I have the Python implementation right.

    Cheers,
    Brian

    @pitrou
    Copy link
    Member

    pitrou commented Apr 12, 2009

    No, doing this is trivial. But shouldn't it be up to the implementor of
    MyClass to decide whether MyMixin or io.FileIO methods are evaluated first?

    Is there a concrete use case, though?

    By the way, what if _pyio.FileIO inherited from io.FileIO instead of
    delegating all methods?

    @brianquinlan
    Copy link
    Contributor Author

    Antoine Pitrou wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > No, doing this is trivial. But shouldn't it be up to the implementor of
    > MyClass to decide whether MyMixin or io.FileIO methods are evaluated first?

    Is there a concrete use case, though?

    I don't have a good one in mind - though

    By the way, what if _pyio.FileIO inherited from io.FileIO instead of
    delegating all methods?

    I don't think that would work - but reasoning about MRO hurts my brain ;-)

    You'd have to declare the class like this:

    class FileIO(io.FileIO, IOBase):
       pass

    to get the io.File methods to resolve first. But that means that the
    method invocation chain would be broken by io.IOBase, which doesn't make
    super calls.

    Cheers,
    Brian

    @brianquinlan
    Copy link
    Contributor Author

    Oops, I didn't finish my thought:

    > No, doing this is trivial. But shouldn't it be up to the implementor of
    > MyClass to decide whether MyMixin or io.FileIO methods are evaluated
    first?

    Is there a concrete use case, though?

    I don't have a good one in mind - though writing the unit test
    implementation required that the mixin class methods be called last.

    @terryjreedy
    Copy link
    Member

    The original snippet works the same on 3.2.0. Was their any conclusion as to whether or not a change should be made?

    @serhiy-storchaka
    Copy link
    Member

    There is complete implementation of FileIO in pure Python in bpo-21859. It is free from this bug.

    >>> import _pyio as io
    >>> class MyIO(io.FileIO):
    ...     def flush(self):
    ...             print('closed:', self.closed)
    ... 
    >>> f = MyIO('test.out', 'wb')
    >>> f.close()
    closed: False

    @serhiy-storchaka
    Copy link
    Member

    Yet one related bug is that flush() isn't called at all if the file was opened with closefd=False.

    >>> import io, os
    >>> class MyIO(io.FileIO):
    ...     def flush(self):
    ...         print('closed:', self.closed)
    ... 
    >>> fd = os.open('test.out', os.O_WRONLY|os.O_CREAT)
    >>> f = MyIO(fd, 'wb', closefd=False)
    >>> f.close()

    The proposed simple patch fixes both bugs.

    @serhiy-storchaka serhiy-storchaka self-assigned this Oct 12, 2014
    @serhiy-storchaka serhiy-storchaka added the type-bug An unexpected behavior, bug, or error label Oct 12, 2014
    @serhiy-storchaka
    Copy link
    Member

    Ping.

    @serhiy-storchaka
    Copy link
    Member

    Added comments as Antoine suggested.

    @serhiy-storchaka
    Copy link
    Member

    And here is a patch for 2.7.

    @vadmium
    Copy link
    Member

    vadmium commented Feb 15, 2015

    For the record, the python-dev thread referenced in the original post is <https://mail.python.org/pipermail/python-dev/2009-April/088212.html\>.

    The obvious fix to me is to have FileIO.close() call IOBase.close() to invoke the flush() before continuing with its own closing. This is what Serhiy’s patches appear to do, so I agree with them in general.

    I do wonder what the point of duplicating test code in test_io and test_fileio is. The only difference seems to be calling open() versus calling FileIO() directly. Wouldn’t it be better to merge the code together, or maybe just assert open() returns a FileIO instance?

    @serhiy-storchaka
    Copy link
    Member

    Agree, the test in test_fileio is redundant.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 20, 2015

    New changeset 7052206ad381 by Serhiy Storchaka in branch '2.7':
    Issue bpo-5700: io.FileIO() called flush() after closing the file.
    https://hg.python.org/cpython/rev/7052206ad381

    New changeset 36f5c36b7704 by Serhiy Storchaka in branch '3.4':
    Issue bpo-5700: io.FileIO() called flush() after closing the file.
    https://hg.python.org/cpython/rev/36f5c36b7704

    New changeset e1f08f5b6b62 by Serhiy Storchaka in branch 'default':
    Issue bpo-5700: io.FileIO() called flush() after closing the file.
    https://hg.python.org/cpython/rev/e1f08f5b6b62

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 22, 2015

    New changeset bf52f69d6948 by Serhiy Storchaka in branch '2.7':
    Broke reference loops in tests added in issue bpo-5700.
    https://hg.python.org/cpython/rev/bf52f69d6948

    New changeset 00bde0743690 by Serhiy Storchaka in branch '3.4':
    Broke reference loops in tests added in issue bpo-5700.
    https://hg.python.org/cpython/rev/00bde0743690

    New changeset a8df1ca60452 by Serhiy Storchaka in branch 'default':
    Broke reference loops in tests added in issue bpo-5700.
    https://hg.python.org/cpython/rev/a8df1ca60452

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants