Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wave.Wave_read.close() doesn't release file #55064

Closed
pjcreath mannequin opened this issue Jan 7, 2011 · 11 comments
Closed

wave.Wave_read.close() doesn't release file #55064

pjcreath mannequin opened this issue Jan 7, 2011 · 11 comments
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@pjcreath
Copy link
Mannequin

pjcreath mannequin commented Jan 7, 2011

BPO 10855
Nosy @birkenfeld, @pitrou, @ned-deily, @serhiy-storchaka
Files
  • issue10855.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2012-11-25.20:47:02.376>
    created_at = <Date 2011-01-07.16:23:42.112>
    labels = ['library', 'performance']
    title = "wave.Wave_read.close() doesn't release file"
    updated_at = <Date 2012-11-25.21:16:22.582>
    user = 'https://bugs.python.org/pjcreath'

    bugs.python.org fields:

    activity = <Date 2012-11-25.21:16:22.582>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-11-25.20:47:02.376>
    closer = 'georg.brandl'
    components = ['Library (Lib)']
    creation = <Date 2011-01-07.16:23:42.112>
    creator = 'pjcreath'
    dependencies = []
    files = ['20374']
    hgrepos = []
    issue_num = 10855
    keywords = ['patch']
    message_count = 11.0
    messages = ['125654', '125750', '125751', '125772', '126106', '126130', '126131', '126133', '126137', '176382', '176389']
    nosy_count = 5.0
    nosy_names = ['georg.brandl', 'pitrou', 'ned.deily', 'pjcreath', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue10855'
    versions = ['Python 3.1', 'Python 2.7', 'Python 3.2']

    @pjcreath
    Copy link
    Mannequin Author

    pjcreath mannequin commented Jan 7, 2011

    Calling wave.close() fails to release all references to the file passed in via wave.open(filename_or_obj, "rb"). As a result, processing many wave files produces an IOError of too many files open.

    This bug is often masked because this dangling reference is collected if the wave object is collected. However, if the wave object is retained, calling wave_obj.close() won't release the reference, and so the file will never be closed.

    There are two solutions:

    1. The workaround: the client program can explicitly close the file object it passed to the wave object ("file_obj.close()").

    2. The bug fix: the wave module can properly release the extra reference to the file, by setting "self._data_chunk = None" in the close() method. Explanation:

    Trunk code (and 2.7.1, and older):

        def close(self):
            if self._i_opened_the_file:
                self._i_opened_the_file.close()
                self._i_opened_the_file = None
            self._file = None

    but note initfp(self, file):
    ...
    self._file = Chunk(file, bigendian = 0)
    ...
    chunk = Chunk(self._file, bigendian = 0)
    ...
    self._data_chunk = chunk
    ...

    therefore close needs to add:

            self._data_chunk = None

    @pjcreath pjcreath mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Jan 7, 2011
    @ned-deily
    Copy link
    Member

    Thanks for the report and analysis. Would you care to submit a patch to fix it?

    @ned-deily
    Copy link
    Member

    (Presumably this is also a problem for Python 3, as well).

    @birkenfeld
    Copy link
    Member

    This is not a bug in the implementation: the file object is only closed when you passed a file name to open().

    Like other APIs that allow file names or objects to be passed in, it is the caller's responsibility to close the file object if an object was passed.

    However, this was not documented. I've fixed that with r87859.

    @pjcreath
    Copy link
    Mannequin Author

    pjcreath mannequin commented Jan 12, 2011

    Thank you for clarifying the documentation. However, I don't think that fully resolves the issue.

    I'm not complaining about a failure to close the file. As you observe, it doesn't need to (and shouldn't) close a file object, but it should release the reference.

    The code already tries to release the reference ("self._file = None"). It just fails to release it correctly, missing the other reference to the file object (self._data_chunk). That's the bug.

    Your clarification of the documentation is appreciated nonetheless.

    I've attached a patch as Ned requested. The same patch can currently be applied to release27-maint, release31-maint, and py3k. (The line numbers and surrounding context are identical.)

    @pjcreath pjcreath mannequin reopened this Jan 12, 2011
    @birkenfeld
    Copy link
    Member

    I don't really see the bug here. Either you openened the file object, then you have to close it. Or wave.py opened it, then it will close it, no matter if it still has a reference or not.

    Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 12, 2011

    Agreed with Georg. No OS resource is leaking if the file is explicitly closed (since it releases the underlying file descriptor). That the Python "file object" is still attached somewhere is of secondary importance.

    @pjcreath
    Copy link
    Mannequin Author

    pjcreath mannequin commented Jan 12, 2011

    A point of clarification on the original report: Georg is completely right when he points out that this is only an issue when passing in a file object. If passed a filename, wave.py both opens and closes the file explicitly, and the dangling reference isn't important, as Antoine observes.

    However, a retained reference in the file-object case is still a leak.

    Georg writes: "Of course we could set _data_chunk to None, but I'm unsure what behavior change you would expect from that."

    It allows garbage collection to close the file object if there are no more references to it. It seems reasonable for a client of wave.py to assume that close() will release all references to the object, and indeed the code seems to support that assumption -- it sets _file to None.

    If releasing references were truly of no importance, then I would argue that the line setting _file to None should be removed. It serves no purpose after wave.py has explicitly closed the file (if it opened it) other than to release a reference to the file object.

    Therefore, I suggest that _data_chunk should also be set to None in order to release the reference completely, thereby allowing the file object to be garbage collected.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 12, 2011

    It allows garbage collection to close the file object if there are no
    more references to it.

    This is a very bad policy to begin with. Garbage collection can be delayed for a number of reasons:

    • someone might be running your program on a Python implementation which doesn't use reference counting (such as Jython or PyPy)
    • an exception, together with its traceback object, might capture the value of some local variables and keep them alive (that is, reachable from the GC's point of view)
    • a reference cycle might delay proper resource cleanup until the cyclic garbage collector kicks in

    So the good thing to do is to close your file explicitly. Luckily, Python 2.6 and upwards makes it easier by using the "with" statement.

    IMO this issue should be closed.

    @serhiy-storchaka
    Copy link
    Member

    I am not sure, should this issue be closed as "rejected" because the suggested patch for Lib/wave.py was rejected, or as "fixed" because Georg was committed the documentation patch (changeset 8239ec6f39e6) for this issue?

    @birkenfeld
    Copy link
    Member

    I say "fixed": there was a bug (undocumented, but correct behavior) and that was fixed.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants