classification
Title: Warn when files are not explicitly closed
Type: enhancement Stage: committed/rejected
Components: Extension Modules, Interpreter Core Versions: Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: alex, amaury.forgeotdarc, benjamin.peterson, brett.cannon, brian.curtin, exarkun, giampaolo.rodola, lemburg, pitrou, r.david.murray
Priority: normal Keywords: patch

Created on 2010-10-13 20:16 by pitrou, last changed 2010-10-30 16:29 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description Edit
deallocwarn.patch pitrou, 2010-10-13 21:06
deallocwarn2.patch pitrou, 2010-10-14 10:29
deallocwarn3.patch pitrou, 2010-10-14 21:34
deallocwarn4.patch pitrou, 2010-10-29 10:15
Messages (37)
msg118577 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-13 20:16
This patch outputs a warning on file deallocation when it hasn't been explicitly closed by the programmer.

Printing the warning by default is probably not desirable, but using the patch for "-X" options in issue10089 would allow to switch on when necessary.
msg118578 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-13 20:22
By the way, python-dev discussion was here:
http://mail.python.org/pipermail/python-dev/2010-September/104247.html

The rough consensus seems to be that there should be a command-line option to enable it, but to enable it by default with pydebug.
msg118594 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2010-10-13 22:20
If the warnings are emitted as usual with the warnings module, you can use -W to control this.  -X isn't necessary.
msg118605 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-13 23:53
> If the warnings are emitted as usual with the warnings module, you can 
> use -W to control this.

Right. Perhaps we can add a new warning category (e.g. ResourceWarning), or simply reuse RuntimeWarning.
msg118606 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-10-14 00:09
+1 for RuntimeError.
msg118608 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2010-10-14 00:41
RuntimeWarning you mean?  I don't think anyone is suggesting making it an error.
msg118609 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-10-14 01:00
2010/10/13 Alex <report@bugs.python.org>:
>
> Alex <alex.gaynor@gmail.com> added the comment:
>
> RuntimeWarning you mean?  I don't think anyone is suggesting making it an error.

Indeed.
msg118641 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-14 09:03
There is a slight issue with warnings, and that's the filtering by module/location. Since these warnings will occur in destructors, they can appear at any point without being related to the code being executed. The "default" filtering method (print the first occurrence of matching warnings for each location where the warning is issued) then is not appropriate; therefore I suggest a separate warning category (ResourceWarning?) for which people can enable different rules.
msg118642 - (view) Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer) Date: 2010-10-14 09:17
Does this work also for sockets?
msg118643 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-14 09:20
> Does this work also for sockets?

Not with the current implementation. I guess this could be added, but then I would have to make the warning method ("_dealloc_warn") public on IO classes.
msg118644 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-14 10:29
Here is an updated patch using warnings (RuntimeWarning for now) and also warning about unclosed sockets.
msg118713 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-14 19:42
Here is a patch adding ResourceWarning, adding new tests, and fixing failures in existing tests.
Feedback welcome.
msg118718 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-14 21:34
I committed the test fixes to py3k, so here is an updated patch.
msg118759 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-10-15 11:49
Please add a similar warning in PC/_subprocess.c::sp_handle_dealloc()
I just got caught by this in PyPy because some pipe handle relies on reference counting to be closed.

This ad-hoc fix would suppress the warning:
   http://codespeak.net/pipermail/pypy-svn/2010-October/043746.html
msg118761 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-15 12:25
> Please add a similar warning in PC/_subprocess.c::sp_handle_dealloc()
> I just got caught by this in PyPy because some pipe handle relies on reference counting to be closed.

Do you want to provide a patch?
msg119361 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-10-22 03:04
After thinking about what warning to go with, I take back my python-dev suggestion of ResourceWarning and switch to DebugWarning. It is in fact harder to add in DebugWarning as a superclass to ResourceWarning than the other way around, especially once people start doing custom filters and decide that DebugWarning should not be filtered as a whole but ResourceWarning should (or some such crazy thing).
msg119401 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-22 19:49
> After thinking about what warning to go with, I take back my python-dev 
> suggestion of ResourceWarning and switch to DebugWarning.

So what is your advice now?
I've thought about the DebugWarning name myself and I think it's a rather bad name, because many warnings are debug-related in practice. I'd just say stick with ResourceWarning.
msg119402 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-10-22 20:01
On Fri, Oct 22, 2010 at 12:49, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
>> After thinking about what warning to go with, I take back my python-dev
>> suggestion of ResourceWarning and switch to DebugWarning.
>
> So what is your advice now?
> I've thought about the DebugWarning name myself and I think it's a rather bad name, because many warnings are debug-related in practice. I'd just say stick with ResourceWarning.

That's true. And the warning hierarchy is rather shallow anyway. I'm
fine with ResourceWarning then.
msg119405 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-22 22:13
I wonder why you think a warning is needed if files aren't closed explicitly. The fact that they get closed on garbage collection is one of the nice features of Python and has made programming easy for years. 

Explicitly having to close files may have some benefits w/r to resource management, but in most cases is not needed. I disagree that we should discourage not explicitly closing files. After all, this is Python, not C.
msg119406 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-10-22 22:17
But this is meant to be an optional warning; users will never see it. Me as a developer, I would like to know when I leave a file open as that is a waste of resources, plus with no guarantee of everything being flushed to disk.

Besides, the context manager for files makes the chance of leaving a file open a much more blaring mistake.
msg119824 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-28 23:15
I would need opinions on one more thing. The current patch warns when a socket has not been explicitly closed. But it does so even when the socket isn't bound at all. e.g.:

$ ./python -c "import socket; socket.socket()"
-c:1: ResourceWarning: unclosed <socket.socket object, fd=3, family=2, type=1, proto=0>

Perhaps we should be more discriminate and only warn when either bind(), listen() or connect() had been called previously? What do you think?
msg119831 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-10-29 00:26
If a new, unbound socket uses some form of OS resource, then a warning is needed. Is their an equivalent limitation like there is with file descriptors as to how many an OS can possibly have open at once?
msg119832 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 00:29
> If a new, unbound socket uses some form of OS resource, then a warning
> is needed. Is their an equivalent limitation like there is with file
> descriptors as to how many an OS can possibly have open at once?

A socket *is* a file descriptor (under Unix), so, yes, there is a limit.
msg119839 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2010-10-29 02:15
That's what I thought. So if socket.socket() alone is enough to consume an fd then there should be a warning.
msg119870 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-29 08:07
Brett Cannon wrote:
> 
> Brett Cannon <brett@python.org> added the comment:
> 
> But this is meant to be an optional warning; users will never see it. Me as a developer, I would like to know when I leave a file open as that is a waste of resources, plus with no guarantee of everything being flushed to disk.

That's what I'm referring to: most Python applications are
written with the fact in mind, that garbage collection will
close the files or socket.

That's a perfectly fine way of writing Python applications,
so why should the programmer get warned about this regular
approach to Python programming ?

> Besides, the context manager for files makes the chance of leaving a file open a much more blaring mistake.

See above: context aren't required for working with files. And again:
it's *not* a mistake to not explicitly close a file.

The same applies for sockets.

Think of the simple idiom:

data = open(filename).read()

This would always create a warning under the proposal.

Same for:

for line in open(filename):
   print line

Also note that files and sockets can be kept as references in data
structures (other than local variables or on the stack) and there
you usually have the same approach: you expect Python to close the
files when garbage collecting the data structures and that's
perfectly ok.

If you want to monitor resource usage in your application it
would be a lot more useful to provide access to the number of
currently open FDs, than scattering warnings around the
application which mostly trigger false positives.
msg119871 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 08:22
> That's what I'm referring to: most Python applications are
> written with the fact in mind, that garbage collection will
> close the files or socket.
> 
> That's a perfectly fine way of writing Python applications,

Some people would disagree, especially Windows users who cannot timely
delete files when some file descriptors still point to them.

> so why should the programmer get warned about this regular
> approach to Python programming ?

Again: it is an *optional* warning. It is *disabled* by default, except
when compiled --with-pydebug.

> The same applies for sockets.

It is *definitely* a mistake if the socket has been bound to a local
address and/or connected to a remote endpoint.

> Think of the simple idiom:
> 
> data = open(filename).read()
> 
> This would always create a warning under the proposal.

We have had many Windows buildbot failures because of such coding style.

> If you want to monitor resource usage in your application it
> would be a lot more useful to provide access to the number of
> currently open FDs

Agreed it would be useful as well, but please tell that to operating
system vendors. Python has no way to calculate such a statistic.
msg119879 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 10:15
Here is an updated patch (also fixes a small refleak).
msg119883 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 10:39
I've committed the patch in r85920; let's watch the buildbots. Also, you'll see many warnings in the test suite if compiled --with-pydebug.
msg119888 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-29 11:18
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> That's what I'm referring to: most Python applications are
>> written with the fact in mind, that garbage collection will
>> close the files or socket.
>>
>> That's a perfectly fine way of writing Python applications,
> 
> Some people would disagree, especially Windows users who cannot timely
> delete files when some file descriptors still point to them.

There is no difference here:

Whether you write an application with automatic closing of
the file/socket at garbage collection time in mind, or you explicitly
close the file/socket, the timing is the same.

The only difference is with Python implementations that don't
use synchronous garbage collection, e.g. Jython, but not with
CPython.

>> so why should the programmer get warned about this regular
>> approach to Python programming ?
> 
> Again: it is an *optional* warning. It is *disabled* by default, except
> when compiled --with-pydebug.

Yes, I know. I still find the warning rather useless, since it
warns about code that's perfectly ok.

>> The same applies for sockets.
> 
> It is *definitely* a mistake if the socket has been bound to a local
> address and/or connected to a remote endpoint.

I don't follow you. Where's the difference between writing:

s.close()
or
s = None

for an open socket s ?

>> Think of the simple idiom:
>>
>> data = open(filename).read()
>>
>> This would always create a warning under the proposal.
> 
> We have had many Windows buildbot failures because of such coding style.

Again: where's the difference between writing:

data = open(filename).read()
and
f = open(filename)
data = f.read()
f.close()

?

If the above coding style causes problems, the reasons must be
elsewhere, since there is no difference between those two
styles (other than cluttering up your locals).

The for-loop file iterator support was explicitly added to make
writing:

for line in open(filename):
    print line

possible.

>> If you want to monitor resource usage in your application it
>> would be a lot more useful to provide access to the number of
>> currently open FDs
> 
> Agreed it would be useful as well, but please tell that to operating
> system vendors. Python has no way to calculate such a statistic.

At least for Linux, that's not hard and I doubt it is for other OSes.

4

On other Unixes, you can simply use fcntl() to scan all possible FDs
for open FDs.

On Windows you can use one of these functions for the same effect:
http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx
msg119889 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 11:52
> Whether you write an application with automatic closing of
> the file/socket at garbage collection time in mind, or you explicitly
> close the file/socket, the timing is the same.

No, because objects can be kept alive through tracebacks (or reference
cycles).

> I don't follow you. Where's the difference between writing:
> 
> s.close()
> or
> s = None
> 
> for an open socket s ?

The difference is when s is still referenced elsewhere.
Also, the intent of the former is clear while the latter is deliberately
obscure (while not saving any significant amount of typing).

> The for-loop file iterator support was explicitly added to make
> writing:
> 
> for line in open(filename):
>     print line
> 
> possible.

So what?

> At least for Linux, that's not hard and I doubt it is for other OSes.
> 
> 4
> 
> On other Unixes, you can simply use fcntl() to scan all possible FDs
> for open FDs.
> 
> On Windows you can use one of these functions for the same effect:
> http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx

Until you post some code I won't understand what you are talking about.
msg119895 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-29 12:55
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> Whether you write an application with automatic closing of
>> the file/socket at garbage collection time in mind, or you explicitly
>> close the file/socket, the timing is the same.
> 
> No, because objects can be kept alive through tracebacks (or reference
> cycles).

If you get a traceback, the explicitly file.close() won't help
you either; but that usually doesn't matter, since program
execution will stop, the traceback will get GCed and the file
closed. This is a very normal and reasonable expectation of a
Python programmer.

>> I don't follow you. Where's the difference between writing:
>>
>> s.close()
>> or
>> s = None
>>
>> for an open socket s ?
> 
> The difference is when s is still referenced elsewhere.
> Also, the intent of the former is clear while the latter is deliberately
> obscure (while not saving any significant amount of typing).

Sure, but that's not the point. It is not a mistake to write
such code and neither is this obscure, otherwise we'd also
require explicit garbage collection for other parts of Python.

If the application keeps references to the objects in other
places, the programmer would, of course, add an explicit .close().
Ditto for the files.

>> The for-loop file iterator support was explicitly added to make
>> writing:
>>
>> for line in open(filename):
>>     print line
>>
>> possible.
> 
> So what?

The warning will trigger without any reason.

>> At least for Linux, that's not hard and I doubt it is for other OSes.
>>
>> 4
>>
>> On other Unixes, you can simply use fcntl() to scan all possible FDs
>> for open FDs.
>>
>> On Windows you can use one of these functions for the same effect:
>> http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx
> 
> Until you post some code I won't understand what you are talking about.

RoundUp's email interface ate the code. I'll post it again via the
web interface.
msg119896 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-29 12:57
Reposted via the web interface:

>> If you want to monitor resource usage in your application it
>> would be a lot more useful to provide access to the number of
>> currently open FDs
> Agreed it would be useful as well, but please tell that to operating
> system vendors. Python has no way to calculate such a statistic.

At least for Linux, that's not hard and I doubt it is for other OSes.

>>> import os
>>> nfds = len(os.listdir('/proc/%i/fd' % os.getpid()))
>>> print nfds
4

On other Unixes, you can simply use fcntl() to scan all possible FDs
for open FDs.

On Windows you can use one of these functions for the same effect:
http://msdn.microsoft.com/en-us/library/kdfaxaay(v=VS.90).aspx
msg119898 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 13:06
> The warning will trigger without any reason.

Well, by now you should have understood the reason, so I conclude that
you are either 1) deaf 2) stupid 3) deliberately obnoxious.

This is my last answer to you on this topic. Goodbye.
msg119901 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-10-29 13:25
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> The warning will trigger without any reason.
> 
> Well, by now you should have understood the reason, so I conclude that
> you are either 1) deaf 2) stupid 3) deliberately obnoxious.
> 
> This is my last answer to you on this topic. Goodbye.

Ignoring your insults, I think the problem is that you fail to see
point or address it:

Warnings in Python are meant to tell programmers that something
should be fixed. The warnings in the examples I gave do not point
to cases that need to be fixed, in fact, they are very common
idioms used in Python books, examples and tutorials.

Note that you've just added warning filters to the test suite
to ignore the warning again. I find that a good argument
for not having it in this form in the first place.

A truely useful ResourceWarning would trigger if resources gets
close to some limit, e.g. if the number of open file descriptors
reaches a certain limit. This could easily be tested when
opening them, since they are plain integers.
msg119903 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-29 14:09
The buildbots showed no major issue, so this issue is now resolved. The warnings expose a lot of issues in the stdlib that deserve addressing, though ;)
msg119904 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2010-10-29 14:09
2010/10/29 Marc-Andre Lemburg <report@bugs.python.org>:
>
> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>
> Antoine Pitrou wrote:
>>
>> Antoine Pitrou <pitrou@free.fr> added the comment:
>>
>>> The warning will trigger without any reason.
>>
>> Well, by now you should have understood the reason, so I conclude that
>> you are either 1) deaf 2) stupid 3) deliberately obnoxious.
>>
>> This is my last answer to you on this topic. Goodbye.
>
> Ignoring your insults, I think the problem is that you fail to see
> point or address it:
>
> Warnings in Python are meant to tell programmers that something
> should be fixed. The warnings in the examples I gave do not point
> to cases that need to be fixed, in fact, they are very common
> idioms used in Python books, examples and tutorials.

They're not particularly good idioms then. Leaving files open
haphazardly leads to subtle bugs as our test suite has shown.
msg120000 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-30 16:29
MAL wrote:
> Antoine wrote:
>> MAL wrote:
>>> I don't follow you. Where's the difference between writing:
>>>
>>> s.close()
>>> or
>>> s = None
>>>
>>> for an open socket s ?
>> 
>> The difference is when s is still referenced elsewhere.
>> Also, the intent of the former is clear while the latter is deliberately
>> obscure (while not saving any significant amount of typing).
>
>Sure, but that's not the point. It is not a mistake to write
>such code and neither is this obscure, otherwise we'd also
>require explicit garbage collection for other parts of Python.

Yes it is a mistake:

In an earlier message MAL wrote:
> The only difference is with Python implementations that don't
> use synchronous garbage collection, e.g. Jython, but not with
> CPython.

This by definition makes it non-equivalent and a bad *Python* idiom,
since it depends on an acknowledged CPython *implementation detail*.
As far as I know, there is no language requirement that mandates having
garbage collection at all.
History
Date User Action Args
2010-10-30 16:29:37r.david.murraysetnosy: + r.david.murray
messages: + msg120000
2010-10-29 14:09:30benjamin.petersonsetmessages: + msg119904
2010-10-29 14:09:24pitrousetstatus: open -> closed

messages: + msg119903
2010-10-29 13:25:59lemburgsetmessages: + msg119901
2010-10-29 13:06:58pitrousetmessages: + msg119898
2010-10-29 12:57:51lemburgsetmessages: + msg119896
2010-10-29 12:55:48lemburgsetmessages: + msg119895
2010-10-29 11:52:36pitrousetmessages: + msg119889
2010-10-29 11:18:34lemburgsetmessages: + msg119888
2010-10-29 10:39:14pitrousetresolution: fixed
messages: + msg119883
stage: committed/rejected
2010-10-29 10:15:11pitrousetfiles: + deallocwarn4.patch

messages: + msg119879
2010-10-29 08:22:35pitrousetmessages: + msg119871
2010-10-29 08:07:34lemburgsetmessages: + msg119870
2010-10-29 02:15:20brett.cannonsetmessages: + msg119839
2010-10-29 00:29:41pitrousetmessages: + msg119832
2010-10-29 00:26:47brett.cannonsetmessages: + msg119831
2010-10-28 23:15:32pitrousetmessages: + msg119824
2010-10-22 22:17:31brett.cannonsetmessages: + msg119406
2010-10-22 22:13:41lemburgsetnosy: + lemburg
messages: + msg119405
2010-10-22 20:01:12brett.cannonsetmessages: + msg119402
2010-10-22 19:49:56pitrousetmessages: + msg119401
2010-10-22 03:04:21brett.cannonsetnosy: + brett.cannon
messages: + msg119361
2010-10-15 12:25:46pitrousetmessages: + msg118761
2010-10-15 11:49:46amaury.forgeotdarcsetmessages: + msg118759
2010-10-14 21:34:13pitrousetfiles: + deallocwarn3.patch

messages: + msg118718
2010-10-14 21:33:53pitrousetfiles: - deallocwarn3.patch
2010-10-14 19:43:05pitrousetfiles: + deallocwarn3.patch

messages: + msg118713
2010-10-14 10:29:50pitrousetfiles: + deallocwarn2.patch

messages: + msg118644
2010-10-14 09:20:41pitrousetmessages: + msg118643
2010-10-14 09:17:12giampaolo.rodolasetmessages: + msg118642
2010-10-14 09:15:42giampaolo.rodolasetnosy: + giampaolo.rodola
2010-10-14 09:03:23pitrousetmessages: + msg118641
2010-10-14 01:00:03benjamin.petersonsetmessages: + msg118609
2010-10-14 00:41:46alexsetnosy: + alex
messages: + msg118608
2010-10-14 00:09:20benjamin.petersonsetmessages: + msg118606
2010-10-13 23:54:09brian.curtinsetnosy: + brian.curtin
2010-10-13 23:53:26pitrousetmessages: + msg118605
2010-10-13 22:20:24exarkunsetnosy: + exarkun
messages: + msg118594
2010-10-13 21:06:54pitrousetfiles: + deallocwarn.patch
2010-10-13 21:06:46pitrousetfiles: - deallocwarn.patch
2010-10-13 20:22:26pitrousetnosy: + amaury.forgeotdarc, benjamin.peterson
messages: + msg118578
2010-10-13 20:16:28pitroucreate