This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: tokenize.open() leaks an open binary file on TextIOWrapper error
Type: resource usage Stage: needs patch
Components: Library (Lib) Versions: Python 3.4, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: introom, itsmemattchung, python-dev, vstinner
Priority: normal Keywords: easy, patch

Created on 2015-04-01 11:52 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
tokenize_open_bug.py vstinner, 2015-04-01 11:52
tokenize.patch introom, 2015-04-01 20:54 review
issue23840.patch itsmemattchung, 2015-04-01 21:20 review
tokenize.patch itsmemattchung, 2015-04-01 23:39 review
tokenizeV2.patch introom, 2015-04-01 23:54 review
tokenize.patch itsmemattchung, 2015-04-02 00:09 moved buffer outside of try block per storchaka review
Messages (18)
msg239786 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-01 11:52
If TextIOWrapper constructor fails in tokenize.open(), the open binary file is not closed and a ResourceWarning is emited.

Try attached patch:

$ python3 -Wd ~/tokenize_open_bug.py
tokenize.open failed: unknown encoding for 'test.py': xxx
/home/haypo/tokenize_open_bug.py:13: ResourceWarning: unclosed file <_io.BufferedReader name='test.py'>
  print("tokenize.open failed: %s" % err)

The fix is quite simple: add "try: ... except: buffer.close(); raise". If someone wants to fix this issue, an unit test must be added, test based on my script and ensures that the buffer is closed (ex: mock tokenize._builtin_open and checks that close() was called).
msg239787 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-01 11:53
I sent an email to the python core mentorship mailing list to find a volunteer to fix this easy issue.
msg239847 - (view) Author: shiyao.ma (introom) * Date: 2015-04-01 20:54
ITSM it's not the TextIOWrapper but the detect_encoding fails and throws an error.

So I wrapped a try/catch block around that.
msg239848 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-01 21:20
Comparing to introom. Any reason to not wrap the entire in a try? Attached patch.
msg239850 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-01 21:32
Curious, 
@haypo, are you looking for a new unittest or a function within an existing unit test? Perhaps go under TestTokenize? 

1227 def test_main():
1228     from test import test_tokenize
1229     support.run_doctest(test_tokenize, True)
1230     support.run_unittest(TestTokenizerAdheresToPep0263)
1231     support.run_unittest(Test_Tokenize)
1232     support.run_unittest(TestDetectEncoding)
1233     support.run_unittest(TestTokenize)
1234     support.run_unittest(UntokenizeTest)
msg239852 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-01 21:40
I believe it goes here under the TestTokenize class. Going to give it a shot now.
msg239854 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-01 22:18
> ITSM it's not the TextIOWrapper but the detect_encoding fails and throws an error.

Oh, right. But TextIOWrapper can fail for differen reasons. For example, CTRL+c may send KeyboardInterrupt.

Try for example:

    with unittest.mock.patch.object(tokenize, '_builtin_open') as mock_open:
        mock_file = mock_open.return_value
        mock_file.tell.side_effect = OSError
        mock_file.readline.return_value = b''

        tokenize.open(fn)

This example raises an OSError in TextIOWrapper on file.tell(), and file.close() is not called.
msg239857 - (view) Author: shiyao.ma (introom) * Date: 2015-04-01 22:25
On Wed, Apr 1, 2015 at 6:18 PM, STINNER Victor <report@bugs.python.org> wrote:

> Oh, right. But TextIOWrapper can fail for differen reasons. For example, CTRL+c may send KeyboardInterrupt.

Yes. Any comment on the test?

I wil update a second patch.

Regards.

-- 

吾輩は猫である。ホームーページはhttp://introo.me
msg239858 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-01 22:26
> Yes. Any comment on the test?

Yes, as I wrote, I reviewed patches. You may get a notification by email. Anyway, it's here:
http://bugs.python.org/review/23840/diff/14417/Lib/test/test_tokenize.py
(I also commented tokenize.py)
msg239861 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-01 23:39
Updated patch based off of haypo's comment. Changed "except Exception as err:" -> "except"
msg239862 - (view) Author: shiyao.ma (introom) * Date: 2015-04-01 23:54
Based upon the previous review.

Catch the TextWrapper, move the test into a function.
msg239864 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-02 00:09
Based off of storchaka's comment, moving the
msg240192 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-06 23:36
tokenize.patch is wrong: you should not call buffer.close() if TextIOWrapper() was created successfully.
msg241988 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-04-24 22:42
tokenizeV2.patch and tokenize.patch have issues, I reviewed them. Can someone please write a new patch taking my comments in account?
msg241990 - (view) Author: Matt Chung (itsmemattchung) Date: 2015-04-24 23:41
Hey Haypo,
I'm working on submitting the new patch now.  Still getting used to the workflow and tools here. Thanks for being patient.

You should see the new file in the next 30 minutes.
msg243285 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-05-15 22:26
> You should see the new file in the next 30 minutes.

I don't see the new file.
msg244057 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-05-25 22:49
New changeset 623e07ea43df by Victor Stinner in branch '3.4':
Issue #23840: tokenize.open() now closes the temporary binary file on error to
https://hg.python.org/cpython/rev/623e07ea43df

New changeset a640d268ba97 by Victor Stinner in branch '3.5':
(Merge 3.5) Issue #23840: tokenize.open() now closes the temporary binary file
https://hg.python.org/cpython/rev/a640d268ba97

New changeset c5cfd6353b4b by Victor Stinner in branch 'default':
(Merge 3.6) Issue #23840: tokenize.open() now closes the temporary binary file
https://hg.python.org/cpython/rev/c5cfd6353b4b
msg244058 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-05-25 22:50
Sorry for not being more available for feedback on patches. I chose to write the final patch because patches were not updated to take in account my latest comments.

I hope that this issue helps you at least the process for reviewing patches ;-)
History
Date User Action Args
2022-04-11 14:58:15adminsetgithub: 68028
2015-05-25 22:50:49vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg244058
2015-05-25 22:49:34python-devsetnosy: + python-dev
messages: + msg244057
2015-05-15 22:26:26vstinnersetmessages: + msg243285
2015-04-24 23:41:54itsmemattchungsetmessages: + msg241990
2015-04-24 22:42:14vstinnersetmessages: + msg241988
2015-04-06 23:36:15vstinnersetmessages: + msg240192
2015-04-02 00:09:46itsmemattchungsetfiles: + tokenize.patch

messages: + msg239864
2015-04-01 23:54:08introomsetfiles: + tokenizeV2.patch

messages: + msg239862
2015-04-01 23:39:37itsmemattchungsetfiles: + tokenize.patch

messages: + msg239861
2015-04-01 22:26:40vstinnersetmessages: + msg239858
2015-04-01 22:25:23introomsetmessages: + msg239857
2015-04-01 22:18:33vstinnersetmessages: + msg239854
2015-04-01 21:40:48itsmemattchungsetmessages: + msg239852
2015-04-01 21:32:30itsmemattchungsetmessages: + msg239850
2015-04-01 21:20:05itsmemattchungsetfiles: + issue23840.patch
nosy: + itsmemattchung
messages: + msg239848

2015-04-01 20:54:10introomsetfiles: + tokenize.patch

nosy: + introom
messages: + msg239847

keywords: + patch
2015-04-01 13:02:46serhiy.storchakasettype: resource usage
components: + Library (Lib)
stage: needs patch
2015-04-01 11:53:41vstinnersetmessages: + msg239787
2015-04-01 11:52:33vstinnercreate