Title: wave.Wave_read.close() doesn't release file
Type: resource usage Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: georg.brandl, ned.deily, pitrou, pjcreath, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2011-01-07 16:23 by pjcreath, last changed 2012-11-25 21:16 by serhiy.storchaka. This issue is now closed.

File name Uploaded Description
issue10855.diff pjcreath, 2011-01-12 15:31
Messages (11)
msg125654 - (view) Author: Peter Creath (pjcreath) Date: 2011-01-07 16:23
Calling wave.close() fails to release all references to the file passed in via wave.open(file_obj, "rb").  As a result, processing many wave files produces an IOError for too many open files.

This bug is often masked because this dangling reference is collected if the wave object is collected.  However, if the wave object is retained, calling wave_obj.close() won't release the reference, and so the file will never be closed.

There are two solutions:

1) The workaround: the client program can explicitly close the file object it passed to the wave object ("file_obj.close()").

2) The bug fix: the wave module can properly release the extra reference to the file, by setting "self._data_chunk = None" in the close() method.  Explanation:

Trunk code (and 2.7.1, and older):

    def close(self):
        if self._i_opened_the_file:
            self._i_opened_the_file = None
        self._file = None

but note initfp(self, file):
        self._file = Chunk(file, bigendian = 0)
                chunk = Chunk(self._file, bigendian = 0)
                self._data_chunk = chunk

therefore close needs to add:

        self._data_chunk = None
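
To make the reference chain concrete, here is a minimal, self-contained sketch. It does not use the real wave module: Chunk, FakeFile and Reader are hypothetical stand-ins that only mimic the attribute layout quoted above, and the release_data_chunk flag is an assumption added to compare the two variants of close().

```python
import gc
import weakref


class Chunk:
    """Toy stand-in for chunk.Chunk: it only keeps a reference to what it wraps."""

    def __init__(self, file):
        self.file = file


class FakeFile:
    """Hypothetical file-like object whose lifetime we can observe via weakref."""


class Reader:
    """Mimics the attribute layout of Wave_read quoted in the report."""

    def __init__(self, file, release_data_chunk):
        self._release_data_chunk = release_data_chunk
        self._file = Chunk(file)              # outer chunk -> file
        self._data_chunk = Chunk(self._file)  # inner chunk -> outer chunk -> file

    def close(self):
        self._file = None
        if self._release_data_chunk:
            # The one-line change proposed in the report.
            self._data_chunk = None


def file_survives_close(release_data_chunk):
    """Return True if the file object is still reachable after Reader.close()."""
    f = FakeFile()
    ref = weakref.ref(f)
    reader = Reader(f, release_data_chunk)
    reader.close()
    del f          # drop the caller's own reference
    gc.collect()   # do not rely on reference counting alone
    # reader is still alive here, which models the "retained wave object" case.
    return ref() is not None
```

With the current close() layout the file stays reachable through _data_chunk; with the extra assignment it does not.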
msg125750 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-01-08 05:51
Thanks for the report and analysis.  Would you care to submit a patch to fix it?
msg125751 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2011-01-08 05:54
(Presumably this is also a problem for Python 3.)
msg125772 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-08 09:45
This is not a bug in the implementation: the file object is only closed when you passed a file name to open().

Like other APIs that allow file names or objects to be passed in, it is the caller's responsibility to close the file object if an object was passed.

However, this was not documented.  I've fixed that with r87859.
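
A short sketch of that contract, assuming a small mono WAV file written to a temporary directory. The helper names make_wav, read_frame_count and demo are made up for illustration; only wave.open() and the reader/writer methods come from the standard library.

```python
import os
import tempfile
import wave


def make_wav(path):
    """Write a tiny mono, 16-bit, 8 kHz file with 8 frames of silence."""
    w = wave.open(path, "wb")
    try:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * 8)
    finally:
        w.close()


def read_frame_count(path):
    """Pass a file object to wave.open() and close both objects ourselves."""
    f = open(path, "rb")
    try:
        w = wave.open(f, "rb")
        try:
            nframes = w.getnframes()
        finally:
            w.close()  # does not close f, because we opened f
        still_open_after_wave_close = not f.closed
    finally:
        f.close()      # the caller's responsibility
    return nframes, still_open_after_wave_close, f.closed


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "silence.wav")
        make_wav(path)
        return read_frame_count(path)
```

Here the wave object's close() leaves the caller's file open, and the explicit f.close() is what releases the file descriptor.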
msg126106 - (view) Author: Peter Creath (pjcreath) Date: 2011-01-12 15:31
Thank you for clarifying the documentation.  However, I don't think that fully resolves the issue.

I'm not complaining about a failure to close the file.  As you observe, it doesn't need to (and shouldn't) close a file object, but it should release the reference.

The code already tries to release the reference ("self._file = None").  It just fails to release it correctly, missing the other reference to the file object (self._data_chunk).  That's the bug.

Your clarification of the documentation is appreciated nonetheless.

I've attached a patch as Ned requested.  The same patch can currently be applied to release27-maint, release31-maint, and py3k.  (The line numbers and surrounding context are identical.)
msg126130 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-12 19:24
I don't really see the bug here.  Either you opened the file object, then you have to close it.  Or the wave module opened it, then it will close it, no matter if it still has a reference or not.

Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that.
msg126131 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 19:30
Agreed with Georg. No OS resource is leaking if the file is explicitly closed (since it releases the underlying file descriptor). That the Python "file object" is still attached somewhere is of secondary importance.
msg126133 - (view) Author: Peter Creath (pjcreath) Date: 2011-01-12 20:16
A point of clarification on the original report: Georg is completely right when he points out that this is only an issue when passing in a file object.  If passed a filename, the wave module both opens and closes the file explicitly, and the dangling reference isn't important, as Antoine observes.

However, a retained reference in the file-object case is still a leak.

Georg writes: "Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that."

It allows garbage collection to close the file object if there are no more references to it.  It seems reasonable for a client of the wave module to assume that close() will release all references to the object, and indeed the code seems to support that assumption: it sets _file to None.

If releasing references were truly of no importance, then I would argue that the line setting _file to None should be removed.  It serves no purpose after close() has explicitly closed the file (if it opened it) other than to release a reference to the file object.

Therefore, I suggest that _data_chunk should also be set to None in order to release the reference completely, thereby allowing the file object to be garbage collected.
msg126137 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-01-12 21:05
> It allows garbage collection to close the file object if there are no 
> more references to it.

This is a very bad policy to begin with. Garbage collection can be delayed for a number of reasons:
- someone might be running your program on a Python implementation which doesn't use reference counting (such as Jython or PyPy)
- an exception, together with its traceback object, might capture the value of some local variables and keep them alive (that is, reachable from the GC's point of view)
- a reference cycle might delay proper resource cleanup until the cyclic garbage collector kicks in

So the good thing to do is to close your file explicitly. Luckily, Python 2.6 and upwards makes it easier by using the "with" statement.

IMO this issue should be closed.
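
As a rough illustration of the explicit-close approach for the "many files, retained wave objects" scenario from the original report, the following sketch uses the "with" statement for the file and contextlib.closing() for the wave object. The helpers write_silence and frame_counts are hypothetical names, and the readers are kept in a list only to show that retaining them does not keep any file open.

```python
import contextlib
import os
import tempfile
import wave


def write_silence(path, nframes=4):
    """Write a small mono, 16-bit, 8 kHz file of silence."""
    w = wave.open(path, "wb")
    try:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * nframes)
    finally:
        w.close()


def frame_counts(paths):
    """Read the frame count of each file, closing every file explicitly."""
    counts, retained, files = [], [], []
    for path in paths:
        with open(path, "rb") as f:
            with contextlib.closing(wave.open(f, "rb")) as w:
                counts.append(w.getnframes())
                retained.append(w)  # keep the wave object around on purpose
        files.append(f)             # f is already closed at this point
    return counts, retained, files
```

Because each file is closed when its "with" block exits, the number of open descriptors does not grow with the number of files, regardless of when the retained objects are collected.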
msg176382 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-25 18:31
I am not sure: should this issue be closed as "rejected" because the suggested patch for Lib/ was rejected, or as "fixed" because Georg committed the documentation patch (changeset 8239ec6f39e6) for this issue?
msg176389 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-11-25 20:47
I say "fixed": there was a bug (undocumented, but correct behavior) and that was fixed.
Date User Action Args
2012-11-25 21:16:22 serhiy.storchaka set stage: resolved
2012-11-25 20:47:02 georg.brandl set status: open -> closed
resolution: rejected -> fixed
messages: + msg176389
2012-11-25 18:31:43 serhiy.storchaka set status: pending -> open
nosy: + serhiy.storchaka
messages: + msg176382
2011-01-12 21:05:01 pitrou set status: open -> pending
versions: - Python 2.6, Python 2.5
nosy: georg.brandl, pitrou, ned.deily, pjcreath
messages: + msg126137
resolution: rejected
stage: test needed -> (no value)
2011-01-12 20:16:44 pjcreath set nosy: georg.brandl, pitrou, ned.deily, pjcreath
messages: + msg126133
2011-01-12 19:30:34 pitrou set nosy: + pitrou
messages: + msg126131
2011-01-12 19:24:46 georg.brandl set nosy: georg.brandl, ned.deily, pjcreath
messages: + msg126130
2011-01-12 15:31:53 pjcreath set status: closed -> open
files: + issue10855.diff
versions: + Python 2.6, Python 2.5, Python 3.1
keywords: + patch
nosy: georg.brandl, ned.deily, pjcreath
messages: + msg126106
resolution: fixed -> (no value)
2011-01-08 09:45:53 georg.brandl set status: open -> closed
nosy: + georg.brandl
messages: + msg125772
resolution: fixed
2011-01-08 05:54:29 ned.deily set nosy: ned.deily, pjcreath
messages: + msg125751
versions: + Python 3.2
2011-01-08 05:51:43 ned.deily set nosy: + ned.deily
messages: + msg125750
stage: test needed
2011-01-07 16:23:42 pjcreath create