classification
Title: chained exception/incorrect exception from tarfile.open on a non-existent file
Type: behavior Stage: needs patch
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: barry Nosy List: Trundle, barry, ev, georg.brandl, python-dev, r.david.murray, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2011-03-14 23:40 by ev, last changed 2013-11-10 21:43 by barry. This issue is now closed.

Files
File name Uploaded Description Edit
tarfile-fix-multiple-exception-on-enoent.patch ev, 2011-03-18 16:03 review
Messages (12)
msg130932 - (view) Author: Evan Dandrea (ev) * Date: 2011-03-14 23:40
Using Python tip from Sunday, I noticed that tarfile does not elegantly handle ENOENT by raising a single exception:

>>> tarfile.TarFile.gzopen('fdsfd', 'r')
Traceback (most recent call last):
  File "/home/evan/hg/cpython/Lib/tarfile.py", line 1808, in gzopen
    fileobj = gzip.GzipFile(name, mode + "b", compresslevel, fileobj)
  File "/home/evan/hg/cpython/Lib/gzip.py", line 157, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
IOError: [Errno 2] No such file or directory: 'fdsfd'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/evan/hg/cpython/Lib/tarfile.py", line 1812, in gzopen
    fileobj.close()
AttributeError: 'NoneType' object has no attribute 'close'

I tried to fix this in a cross-platform way, by attempting to open the file before processing it, and letting the with statement handle calling close:

diff -r 955547e57cff Lib/tarfile.py
--- a/Lib/tarfile.py	Sun Mar 13 19:32:21 2011 +0100
+++ b/Lib/tarfile.py	Mon Mar 14 19:33:21 2011 -0400
@@ -1793,6 +1793,10 @@
         if len(mode) > 1 or mode not in "rw":
             raise ValueError("mode must be 'r' or 'w'")
 
+        if mode == "r":
+            # force an exception if the file does not exist.
+            with open(name):
+                pass
         try:
             import gzip
             gzip.GzipFile

However, this produces infinite recursion:
...
  File "/home/evan/hg/cpython/Lib/tarfile.py", line 1738, in open
    return func(name, "r", fileobj, **kwargs)
  File "/home/evan/hg/cpython/Lib/tarfile.py", line 1798, in gzopen
    with open(name):
  File "/home/evan/hg/cpython/Lib/tarfile.py", line 1738, in open
    return func(name, "r", fileobj, **kwargs)
RuntimeError: maximum recursion depth exceeded

Curiously, if I force the ValueError in that same function to be triggered (by passing a three character string for the mode), that exception is raised fine.  I am therefore wondering if this is a bug in the exit processing.

Unfortunately, my attempts at producing a general test case have been unsuccessful thusfar.
msg130945 - (view) Author: Andreas Stührk (Trundle) Date: 2011-03-15 01:57
The infinite recursion happens because `open` in tarfile is really `TarFile.open` (see last line in the module). The builtin `open` is imported as `_open`.

I also think that explicitly checking for "fileobj" being None is a cleaner solution. With your current method (trying to open the file beforehand), you will introduce a race condition: If the file exists when you open it for the first time but has been deleted when you try to open it the second time, the exact same error will happen ("fileobj" being None, that is).
msg131012 - (view) Author: Evan Dandrea (ev) * Date: 2011-03-15 18:08
*facepalm* Indeed, thanks for pointing that out and nice catch on the race condition.  Attached is a patch to fix the issue as you've suggested, and adds a test case for it.
msg131136 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-16 16:40
This fix reveals a second bug.  Without this fix, a non-existent file raises an IOError with an appropriate error message, but with the chained exception.  After this fix, it raises an error that says 'not a gzip file', which while technically true is not very helpful :)

The correct IOError message only happened by accident in the original code, but we need to fix this second bug in order to fix the first one correctly.  I suggest that the test case should read:

  with self.assertRaisesRegex("xxx", IOError) as ex:
      tarfile.open("xxx", self.mode)
  self.assertEqual(ex.exception.errno, errno.ENOENT)
msg131138 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-16 16:42
Note that the code in question was changed to the form with the bugs in 3.2, so this fix does not need to be backported to 3.1.  The test could be, though.
msg131333 - (view) Author: Evan Dandrea (ev) * Date: 2011-03-18 16:03
David,

Thanks for the pointers. I've updated the patch hopefully adequately addressing your concerns.
msg134267 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2011-04-22 15:46
Commented on the patch.  I'll be happy to land this for Evan.
msg142018 - (view) Author: Roundup Robot (python-dev) Date: 2011-08-13 09:48
New changeset 843cd43206b4 by Georg Brandl in branch '3.2':
Fix #11513: wrong exception handling for the case that GzipFile itself raises an IOError.
http://hg.python.org/cpython/rev/843cd43206b4
msg142019 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-08-13 09:51
Fixed in 3.2/default.

2.7 has even more primitive error handling; should the gzopen() be adapted to the 3.x case?
msg202566 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-11-10 20:06
Should this issue still remain open?  The original report described a chained exception, which obviously doesn't happen in 2.7 (nor with Georg's changeset, in 3.2, 3.3, or 3.4).

RDM's message implies there still may still be bugs lurking here in 2.7, but OTOH, the original issue isn't a problem (i.e. no chained exceptions).  I'd be tempted to say that if there are still problems here, it would be better to open a 2.7 specific bug.
msg202570 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-11-10 20:36
I suppose that 2.7 may leak GzipFile in case of some errors, but this is another issue.
msg202576 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-11-10 21:43
Alright, I'm going to close this issue.  Please open a new bug for Python 2.7.
History
Date User Action Args
2013-11-10 21:43:29barrysetstatus: open -> closed
resolution: fixed
messages: + msg202576
2013-11-10 20:36:02serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg202570
2013-11-10 20:06:29barrysetmessages: + msg202566
2011-12-02 16:55:36ezio.melottisetstage: needs patch
type: behavior
versions: + Python 2.7, - Python 3.2, Python 3.3
2011-08-13 09:51:22georg.brandlsetnosy: + georg.brandl
messages: + msg142019
2011-08-13 09:48:45python-devsetnosy: + python-dev
messages: + msg142018
2011-04-22 15:46:29barrysetassignee: barry

messages: + msg134267
nosy: + barry
2011-03-18 16:04:12evsetfiles: - tarfile-fix-multiple-exception-on-enoent.patch
nosy: r.david.murray, Trundle, ev
2011-03-18 16:03:42evsetfiles: + tarfile-fix-multiple-exception-on-enoent.patch
nosy: r.david.murray, Trundle, ev
messages: + msg131333

components: + Interpreter Core, - Library (Lib)
type: behavior -> (no value)
2011-03-16 16:42:56r.david.murraysetversions: + Python 3.2, Python 3.3
nosy: r.david.murray, Trundle, ev
title: Infinite recursion around raising an exception in tarfile -> chained exception/incorrect exception from tarfile.open on a non-existent file
messages: + msg131138

components: + Library (Lib), - Interpreter Core
type: behavior
2011-03-16 16:40:56r.david.murraysetnosy: + r.david.murray
messages: + msg131136
2011-03-15 18:10:04evsetfiles: - tarfile-fix-multiple-exception-on-enoent.patch
2011-03-15 18:09:54evsetfiles: + tarfile-fix-multiple-exception-on-enoent.patch
2011-03-15 18:08:23evsetfiles: + tarfile-fix-multiple-exception-on-enoent.patch

messages: + msg131012
keywords: + patch
2011-03-15 01:57:42Trundlesetnosy: + Trundle
messages: + msg130945
2011-03-14 23:40:32evcreate