Warn when files are not explicitly closed #54302

Closed
pitrou opened this issue Oct 13, 2010 · 37 comments
Labels
extension-modules (C modules in the Modules dir), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (A feature request or enhancement)

Comments

@pitrou
Member

pitrou commented Oct 13, 2010

BPO 10093
Nosy @malemburg, @brettcannon, @amauryfa, @pitrou, @giampaolo, @benjaminp, @alex, @bitdancer, @briancurtin
Files
  • deallocwarn.patch
  • deallocwarn2.patch
  • deallocwarn3.patch
  • deallocwarn4.patch
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2010-10-29.14:09:24.723>
    created_at = <Date 2010-10-13.20:16:28.182>
    labels = ['extension-modules', 'interpreter-core', 'type-feature']
    title = 'Warn when files are not explicitly closed'
    updated_at = <Date 2010-10-30.16:29:36.967>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2010-10-30.16:29:36.967>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-10-29.14:09:24.723>
    closer = 'pitrou'
    components = ['Extension Modules', 'Interpreter Core']
    creation = <Date 2010-10-13.20:16:28.182>
    creator = 'pitrou'
    dependencies = []
    files = ['19225', '19231', '19238', '19411']
    hgrepos = []
    issue_num = 10093
    keywords = ['patch']
    message_count = 37.0
    messages = ['118577', '118578', '118594', '118605', '118606', '118608', '118609', '118641', '118642', '118643', '118644', '118713', '118718', '118759', '118761', '119361', '119401', '119402', '119405', '119406', '119824', '119831', '119832', '119839', '119870', '119871', '119879', '119883', '119888', '119889', '119895', '119896', '119898', '119901', '119903', '119904', '120000']
    nosy_count = 10.0
    nosy_names = ['lemburg', 'brett.cannon', 'exarkun', 'amaury.forgeotdarc', 'pitrou', 'giampaolo.rodola', 'benjamin.peterson', 'alex', 'r.david.murray', 'brian.curtin']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue10093'
    versions = ['Python 3.2']

    @pitrou
    Member Author

    pitrou commented Oct 13, 2010

    This patch outputs a warning on file deallocation when the file hasn't been explicitly closed by the programmer.

    Printing the warning by default is probably not desirable, but the patch for "-X" options in bpo-10089 would allow switching it on when necessary.
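    As a rough illustration of the intended behaviour, here is a minimal sketch (the helper name is made up, and it uses the ResourceWarning category that this issue eventually settled on):

    import gc
    import warnings

    warnings.simplefilter("always", ResourceWarning)  # report every occurrence

    def leak_a_file():
        f = open(__file__)   # opened but never explicitly closed
        return f.read(10)

    leak_a_file()
    gc.collect()   # only needed on implementations without reference counting
    # -> ResourceWarning: unclosed file <_io.TextIOWrapper name='...' ...>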

    @pitrou pitrou added the extension-modules, interpreter-core, and type-feature labels Oct 13, 2010
    @pitrou
    Member Author

    pitrou commented Oct 13, 2010

    By the way, python-dev discussion was here:
    http://mail.python.org/pipermail/python-dev/2010-September/104247.html

    The rough consensus seems to be that there should be a command-line option to enable it, but that it should be enabled by default in pydebug builds.

    @exarkun
    Mannequin

    exarkun mannequin commented Oct 13, 2010

    If the warnings are emitted as usual with the warnings module, you can use -W to control this. -X isn't necessary.

    @pitrou
    Member Author

    pitrou commented Oct 13, 2010

    > If the warnings are emitted as usual with the warnings module, you can
    > use -W to control this.

    Right. Perhaps we can add a new warning category (e.g. ResourceWarning), or simply reuse RuntimeWarning.

    @benjaminp
    Contributor

    +1 for RuntimeError.

    @alex
    Member

    alex commented Oct 14, 2010

    RuntimeWarning you mean? I don't think anyone is suggesting making it an error.

    @benjaminp
    Contributor

    2010/10/13 Alex <report@bugs.python.org>:
    > Alex <alex.gaynor@gmail.com> added the comment:
    >
    > RuntimeWarning you mean?  I don't think anyone is suggesting making it an error.

    Indeed.

    @pitrou
    Member Author

    pitrou commented Oct 14, 2010

    There is a slight issue with warnings, and that's the filtering by module/location. Since these warnings will occur in destructors, they can appear at any point without being related to the code being executed. The "default" filtering method (print the first occurrence of matching warnings for each location where the warning is issued) is then not appropriate; therefore I suggest a separate warning category (ResourceWarning?) for which people can enable different rules.
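    As a sketch of the per-category control this would give (assuming the ResourceWarning name floated above):

    import warnings

    # The "default" action keeps only the first occurrence per source location,
    # which is of little use when the location is simply whatever code happened
    # to be running at destruction time.  A dedicated category lets users pick
    # a more suitable rule:
    warnings.filterwarnings("always", category=ResourceWarning)  # show every leak
    # ...or turn just this category off without touching other warnings:
    warnings.filterwarnings("ignore", category=ResourceWarning)

    The same control is available on the command line, e.g. -W always::ResourceWarning or -W ignore::ResourceWarning.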

    @giampaolo
    Contributor

    Does this work also for sockets?

    @pitrou
    Member Author

    pitrou commented Oct 14, 2010

    > Does this work also for sockets?

    Not with the current implementation. I guess this could be added, but then I would have to make the warning method ("_dealloc_warn") public on IO classes.

    @pitrou
    Member Author

    pitrou commented Oct 14, 2010

    Here is an updated patch using warnings (RuntimeWarning for now) and also warning about unclosed sockets.

    @pitrou
    Member Author

    pitrou commented Oct 14, 2010

    Here is a patch adding ResourceWarning, adding new tests, and fixing failures in existing tests.
    Feedback welcome.

    @pitrou
    Member Author

    pitrou commented Oct 14, 2010

    I committed the test fixes to py3k, so here is an updated patch.

    @amauryfa
    Member

    Please add a similar warning in PC/_subprocess.c::sp_handle_dealloc()
    I just got caught by this in PyPy because some pipe handle relies on reference counting to be closed.

    This ad-hoc fix would suppress the warning:
    http://codespeak.net/pipermail/pypy-svn/2010-October/043746.html

    @pitrou
    Member Author

    pitrou commented Oct 15, 2010

    > Please add a similar warning in PC/_subprocess.c::sp_handle_dealloc()
    > I just got caught by this in PyPy because some pipe handle relies on reference counting to be closed.

    Do you want to provide a patch?

    @brettcannon
    Member

    After thinking about what warning to go with, I take back my python-dev suggestion of ResourceWarning and switch to DebugWarning. It would in fact be harder to retrofit DebugWarning as a superclass of an existing ResourceWarning than the other way around, especially once people start writing custom filters and decide that DebugWarning should not be filtered as a whole but ResourceWarning should (or some such crazy thing).

    @pitrou
    Member Author

    pitrou commented Oct 22, 2010

    > After thinking about what warning to go with, I take back my python-dev
    > suggestion of ResourceWarning and switch to DebugWarning.

    So what is your advice now?
    I've thought about the DebugWarning name myself and I think it's a rather bad name, because many warnings are debug-related in practice. I'd just say stick with ResourceWarning.

    @brettcannon
    Member

    On Fri, Oct 22, 2010 at 12:49, Antoine Pitrou <report@bugs.python.org> wrote:
    > Antoine Pitrou <pitrou@free.fr> added the comment:
    >
    >> After thinking about what warning to go with, I take back my python-dev
    >> suggestion of ResourceWarning and switch to DebugWarning.
    >
    > So what is your advice now?
    > I've thought about the DebugWarning name myself and I think it's a rather bad name, because many warnings are debug-related in practice. I'd just say stick with ResourceWarning.

    That's true. And the warning hierarchy is rather shallow anyway. I'm
    fine with ResourceWarning then.

    @malemburg
    Member

    I wonder why you think a warning is needed if files aren't closed explicitly. The fact that they get closed on garbage collection is one of the nice features of Python and has made programming easy for years.

    Explicitly having to close files may have some benefits w/r to resource management, but in most cases is not needed. I disagree that we should discourage not explicitly closing files. After all, this is Python, not C.

    @brettcannon
    Member

    But this is meant to be an optional warning; users will never see it. Me as a developer, I would like to know when I leave a file open as that is a waste of resources, plus with no guarantee of everything being flushed to disk.

    Besides, the context manager for files makes the chance of leaving a file open a much more blaring mistake.
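    For reference, the context-manager spelling that avoids the situation entirely (the filename is arbitrary):

    with open("data.txt") as f:
        data = f.read()
    # f is guaranteed to be closed here, even if read() raised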

    @pitrou
    Member Author

    pitrou commented Oct 28, 2010

    I would need opinions on one more thing. The current patch warns when a socket has not been explicitly closed. But it does so even when the socket isn't bound at all. e.g.:

    $ ./python -c "import socket; socket.socket()"
    -c:1: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=1, proto=0>

    Perhaps we should be more discriminating and only warn when bind(), listen() or connect() had been called previously? What do you think?

    @brettcannon
    Member

    If a new, unbound socket uses some form of OS resource, then a warning is needed. Is there an equivalent limitation, like there is with file descriptors, as to how many an OS can possibly have open at once?

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    > If a new, unbound socket uses some form of OS resource, then a warning
    > is needed. Is there an equivalent limitation, like there is with file
    > descriptors, as to how many an OS can possibly have open at once?

    A socket *is* a file descriptor (under Unix), so, yes, there is a limit.

    @brettcannon
    Member

    That's what I thought. So if socket.socket() alone is enough to consume an fd then there should be a warning.
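    A quick sketch showing that it does:

    import socket

    s = socket.socket()
    print(s.fileno())   # a bare, unconnected socket already owns an OS-level fd
    s.close()

    Socket objects can also be used with the with statement (supported as of 3.2), which gives the same guarantee as closing files with a context manager.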

    @malemburg
    Member

    Brett Cannon wrote:
    > Brett Cannon <brett@python.org> added the comment:
    >
    > But this is meant to be an optional warning; users will never see it. Me as a developer, I would like to know when I leave a file open as that is a waste of resources, plus with no guarantee of everything being flushed to disk.

    That's what I'm referring to: most Python applications are
    written with the fact in mind, that garbage collection will
    close the files or socket.

    That's a perfectly fine way of writing Python applications,
    so why should the programmer get warned about this regular
    approach to Python programming ?

    > Besides, the context manager for files makes the chance of leaving a file open a much more blaring mistake.

    See above: context managers aren't required for working with files. And again:
    it's *not* a mistake to not explicitly close a file.

    The same applies for sockets.

    Think of the simple idiom:

    data = open(filename).read()

    This would always create a warning under the proposal.

    Same for:

    for line in open(filename):
       print line
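    For concreteness, the context-manager spellings of those two idioms, which would not trigger the proposed warning, look like this:

    with open(filename) as f:
        data = f.read()

    with open(filename) as f:
        for line in f:
            print(line)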

    Also note that files and sockets can be kept as references in data
    structures (other than local variables or on the stack) and there
    you usually have the same approach: you expect Python to close the
    files when garbage collecting the data structures and that's
    perfectly ok.

    If you want to monitor resource usage in your application it
    would be a lot more useful to provide access to the number of
    currently open FDs, than scattering warnings around the
    application which mostly trigger false positives.

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    > That's what I'm referring to: most Python applications are
    > written with the fact in mind, that garbage collection will
    > close the files or socket.
    >
    > That's a perfectly fine way of writing Python applications,

    Some people would disagree, especially Windows users who cannot timely
    delete files when some file descriptors still point to them.

    > so why should the programmer get warned about this regular
    > approach to Python programming ?

    Again: it is an *optional* warning. It is *disabled* by default, except
    when compiled --with-pydebug.

    > The same applies for sockets.

    It is *definitely* a mistake if the socket has been bound to a local
    address and/or connected to a remote endpoint.

    > Think of the simple idiom:
    >
    > data = open(filename).read()
    >
    > This would always create a warning under the proposal.

    We have had many Windows buildbot failures because of such coding style.

    > If you want to monitor resource usage in your application it
    > would be a lot more useful to provide access to the number of
    > currently open FDs

    Agreed it would be useful as well, but please tell that to operating
    system vendors. Python has no way to calculate such a statistic.

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    Here is an updated patch (also fixes a small refleak).

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    I've committed the patch in r85920; let's watch the buildbots. Also, you'll see many warnings in the test suite if compiled --with-pydebug.

    @malemburg
    Member

    Antoine Pitrou wrote:
    > Antoine Pitrou <pitrou@free.fr> added the comment:
    >
    >> That's what I'm referring to: most Python applications are
    >> written with the fact in mind, that garbage collection will
    >> close the files or socket.
    >>
    >> That's a perfectly fine way of writing Python applications,
    >
    > Some people would disagree, especially Windows users who cannot timely
    > delete files when some file descriptors still point to them.

    There is no difference here:

    Whether you write an application with automatic closing of
    the file/socket at garbage collection time in mind, or you explicitly
    close the file/socket, the timing is the same.

    The only difference is with Python implementations that don't
    use synchronous garbage collection, e.g. Jython, but not with
    CPython.

    >> so why should the programmer get warned about this regular
    >> approach to Python programming ?
    >
    > Again: it is an *optional* warning. It is *disabled* by default, except
    > when compiled --with-pydebug.

    Yes, I know. I still find the warning rather useless, since it
    warns about code that's perfectly ok.

    >> The same applies for sockets.
    >
    > It is *definitely* a mistake if the socket has been bound to a local
    > address and/or connected to a remote endpoint.

    I don't follow you. Where's the difference between writing:

    s.close()
    or
    s = None

    for an open socket s ?

    >> Think of the simple idiom:
    >>
    >> data = open(filename).read()
    >>
    >> This would always create a warning under the proposal.
    >
    > We have had many Windows buildbot failures because of such coding style.

    Again: where's the difference between writing:

    data = open(filename).read()
    and
    f = open(filename)
    data = f.read()
    f.close()

    ?

    If the above coding style causes problems, the reasons must be
    elsewhere, since there is no difference between those two
    styles (other than cluttering up your locals).

    The for-loop file iterator support was explicitly added to make
    writing:

    for line in open(filename):
        print line

    possible.

    >> If you want to monitor resource usage in your application it
    >> would be a lot more useful to provide access to the number of
    >> currently open FDs
    >
    > Agreed it would be useful as well, but please tell that to operating
    > system vendors. Python has no way to calculate such a statistic.

    At least for Linux, that's not hard and I doubt it is for other OSes.

    4

    On other Unixes, you can simply use fcntl() to scan all possible FDs
    for open FDs.

    On Windows you can use one of these functions for the same effect:
    http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    > Whether you write an application with automatic closing of
    > the file/socket at garbage collection time in mind, or you explicitly
    > close the file/socket, the timing is the same.

    No, because objects can be kept alive through tracebacks (or reference
    cycles).

    > I don't follow you. Where's the difference between writing:
    >
    > s.close()
    > or
    > s = None
    >
    > for an open socket s ?

    The difference is when s is still referenced elsewhere.
    Also, the intent of the former is clear while the latter is deliberately
    obscure (while not saving any significant amount of typing).
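    A sketch of that difference (the names are made up):

    import socket

    registry = []

    def make_socket():
        s = socket.socket()
        registry.append(s)   # a second reference, easy to overlook
        return s

    s = make_socket()
    s = None                 # does NOT release the fd: registry still holds it
    registry[0].close()      # an explicit close() releases it regardless

    # The same applies to references hidden in tracebacks: a stored exception
    # keeps its frames, and any open file or socket locals in them, alive.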

    > The for-loop file iterator support was explicitly added to make
    > writing:
    >
    > for line in open(filename):
    >     print line
    >
    > possible.

    So what?

    > At least for Linux, that's not hard and I doubt it is for other OSes.
    >
    > 4
    >
    > On other Unixes, you can simply use fcntl() to scan all possible FDs
    > for open FDs.
    >
    > On Windows you can use one of these functions for the same effect:
    > http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx

    Until you post some code I won't understand what you are talking about.

    @malemburg
    Member

    Antoine Pitrou wrote:
    > Antoine Pitrou <pitrou@free.fr> added the comment:
    >
    >> Whether you write an application with automatic closing of
    >> the file/socket at garbage collection time in mind, or you explicitly
    >> close the file/socket, the timing is the same.
    >
    > No, because objects can be kept alive through tracebacks (or reference
    > cycles).

    If you get a traceback, an explicit file.close() won't help
    you either; but that usually doesn't matter, since program
    execution will stop, the traceback will get GCed and the file
    closed. This is a very normal and reasonable expectation of a
    Python programmer.

    >> I don't follow you. Where's the difference between writing:
    >>
    >> s.close()
    >> or
    >> s = None
    >>
    >> for an open socket s ?
    >
    > The difference is when s is still referenced elsewhere.
    > Also, the intent of the former is clear while the latter is deliberately
    > obscure (while not saving any significant amount of typing).

    Sure, but that's not the point. It is not a mistake to write
    such code and neither is this obscure, otherwise we'd also
    require explicit garbage collection for other parts of Python.

    If the application keeps references to the objects in other
    places, the programmer would, of course, add an explicit .close().
    Ditto for the files.

    >> The for-loop file iterator support was explicitly added to make
    >> writing:
    >>
    >> for line in open(filename):
    >>     print line
    >>
    >> possible.
    >
    > So what?

    The warning will trigger without any reason.

    >> At least for Linux, that's not hard and I doubt it is for other OSes.
    >>
    >> 4
    >>
    >> On other Unixes, you can simply use fcntl() to scan all possible FDs
    >> for open FDs.
    >>
    >> On Windows you can use one of these functions for the same effect:
    >> http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx
    >
    > Until you post some code I won't understand what you are talking about.

    RoundUp's email interface ate the code. I'll post it again via the
    web interface.

    @malemburg
    Member

    Reposted via the web interface:

    >> If you want to monitor resource usage in your application it
    >> would be a lot more useful to provide access to the number of
    >> currently open FDs
    >
    > Agreed it would be useful as well, but please tell that to operating
    > system vendors. Python has no way to calculate such a statistic.

    At least for Linux, that's not hard and I doubt it is for other OSes.

    >>> import os
    >>> nfds = len(os.listdir('/proc/%i/fd' % os.getpid()))
    >>> print nfds
    4

    On other Unixes, you can simply use fcntl() to scan all possible FDs
    for open FDs.

    On Windows you can use one of these functions for the same effect:
    http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx
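    A sketch of the fcntl()-based scan mentioned for other Unixes (untested; it assumes the soft RLIMIT_NOFILE value is a reasonable upper bound):

    import errno
    import fcntl
    import resource

    def count_open_fds():
        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        count = 0
        for fd in range(soft):
            try:
                fcntl.fcntl(fd, fcntl.F_GETFD)   # raises EBADF for unused fds
            except (IOError, OSError) as exc:
                if exc.errno != errno.EBADF:
                    raise
            else:
                count += 1
        return count

    print(count_open_fds())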

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    > The warning will trigger without any reason.

    Well, by now you should have understood the reason, so I conclude that
    you are either 1) deaf 2) stupid 3) deliberately obnoxious.

    This is my last answer to you on this topic. Goodbye.

    @malemburg
    Member

    Antoine Pitrou wrote:
    > Antoine Pitrou <pitrou@free.fr> added the comment:
    >
    >> The warning will trigger without any reason.
    >
    > Well, by now you should have understood the reason, so I conclude that
    > you are either 1) deaf 2) stupid 3) deliberately obnoxious.
    >
    > This is my last answer to you on this topic. Goodbye.

    Ignoring your insults, I think the problem is that you fail to see
    the point or to address it:

    Warnings in Python are meant to tell programmers that something
    should be fixed. The warnings in the examples I gave do not point
    to cases that need to be fixed, in fact, they are very common
    idioms used in Python books, examples and tutorials.

    Note that you've just added warning filters to the test suite
    to ignore the warning again. I find that a good argument
    for not having it in this form in the first place.

    A truly useful ResourceWarning would trigger when a resource gets
    close to some limit, e.g. when the number of open file descriptors
    reaches a certain threshold. This could easily be tested when
    opening them, since they are plain integers.
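    A sketch of that alternative, reusing the Linux-only /proc counting shown earlier, with an arbitrary 90% threshold:

    import os
    import resource
    import warnings

    def warn_if_fd_pressure(threshold=0.9):
        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        open_fds = len(os.listdir('/proc/%d/fd' % os.getpid()))   # Linux only
        if open_fds >= threshold * soft:
            warnings.warn("close to the open file limit: %d of %d fds in use"
                          % (open_fds, soft), ResourceWarning)

    Such a check could run whenever a new descriptor is opened, as suggested above.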

    @pitrou
    Member Author

    pitrou commented Oct 29, 2010

    The buildbots showed no major issue, so this issue is now resolved. The warnings expose a lot of issues in the stdlib that deserve addressing, though ;)

    @pitrou pitrou closed this as completed Oct 29, 2010
    @benjaminp
    Contributor

    2010/10/29 Marc-Andre Lemburg <report@bugs.python.org>:
    >
    > Marc-Andre Lemburg <mal@egenix.com> added the comment:
    >
    > Antoine Pitrou wrote:
    >>
    >> Antoine Pitrou <pitrou@free.fr> added the comment:
    >>
    >>> The warning will trigger without any reason.
    >>
    >> Well, by now you should have understood the reason, so I conclude that
    >> you are either 1) deaf 2) stupid 3) deliberately obnoxious.
    >>
    >> This is my last answer to you on this topic. Goodbye.
    >
    > Ignoring your insults, I think the problem is that you fail to see
    > the point or to address it:
    >
    > Warnings in Python are meant to tell programmers that something
    > should be fixed. The warnings in the examples I gave do not point
    > to cases that need to be fixed, in fact, they are very common
    > idioms used in Python books, examples and tutorials.

    They're not particularly good idioms then. Leaving files open
    haphazardly leads to subtle bugs as our test suite has shown.

    @bitdancer
    Member

    MAL wrote:
    > Antoine wrote:
    >> MAL wrote:
    >>> I don't follow you. Where's the difference between writing:
    >>>
    >>> s.close()
    >>> or
    >>> s = None
    >>>
    >>> for an open socket s ?
    >> 
    >> The difference is when s is still referenced elsewhere.
    >> Also, the intent of the former is clear while the latter is deliberately
    >> obscure (while not saving any significant amount of typing).
    >
    >Sure, but that's not the point. It is not a mistake to write
    >such code and neither is this obscure, otherwise we'd also
    >require explicit garbage collection for other parts of Python.

    Yes it is a mistake:

    In an earlier message MAL wrote:

    > The only difference is with Python implementations that don't
    > use synchronous garbage collection, e.g. Jython, but not with
    > CPython.

    This by definition makes it non-equivalent and a bad *Python* idiom,
    since it depends on an acknowledged CPython *implementation detail*.
    As far as I know, there is no language requirement that mandates having
    garbage collection at all.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022