classification
Title: argparse.FileType opens a file and never closes it
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: David.Layton, Paolo.Elvati, Stefan.Pfeiffer, bethard, eric.araujo, manveru, paul.j3
Priority: normal Keywords: patch

Created on 2012-01-19 12:25 by David.Layton, last changed 2014-04-01 07:36 by paul.j3.

Files
File name Uploaded Description Edit
patch_3.diff paul.j3, 2013-09-23 20:26 review
Messages (8)
msg151615 - (view) Author: David Layton (David.Layton) Date: 2012-01-19 12:25
argparse.FileType.__call__ opens the specified file and returns it. This is well documented as an anit-idiom in http://docs.python.org/howto/doanddont.html#exceptions. 

"...a serious problem — due to implementation details in CPython, the file would not be closed when an exception is raised until the exception handler finishes; and, worse, in other implementations (e.g., Jython) it might not be closed at all regardless of whether or not an exception is raised."

Disregarding the above, handling a file which may or may not have been opened depending the users input requires a bit of boilerplate code compared to the usual with-open idiom.  

Additionally, there is no way to prevent FileType from clobbering an existing file when used with write mode. 

Given these issues and others, it seems to me that the usefulness of FileType is outweighed by propensity to encourage bad coding. Perhaps, it would be best if FileType (or some replacement) simply checked that the file exists (when such a check is appropriate), it can be opened in the specified mode, and, curry the call to open (i.e. return lambda: open(string, self._mode, self._bufsize))
msg152518 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-03 14:32
> Additionally, there is no way to prevent FileType from clobbering an existing file
> when used with write mode.
I think this deserves its own feature request: now that Python 3.3 has an “exclusive create” mode, argparse.FileType could gain support for it.  Would you open that request?

> Given these issues and others,
We have one issue and one missing feature; what are the other issues?

> it seems to me that the usefulness of FileType is outweighed by propensity to encourage bad
> coding.
I think the problem is not as bad as you paint it.  A great number of unclosed file handles may cause problem to some OSes, but that’s it.  On the plus side, the fact that argparse accepts for its type argument any callable that can check and convert a string input is simple, clean and works.  FileType is needed.

In packaging/distutils2 for example we have similar functions that return an open file object and never close it: the responsibility is at a higher level.  Other packaging code calling these functions does so in a with statement.  It is not evil by nature.

The problem here is that FileType may return stdin or stdout, so we can’t just always close the file object (or maybe we can, say using an atexit handler?).

> Perhaps, it would be best if FileType (or some replacement) simply checked that the file exists
But then you’d run into race conditions.  The only sure was to say if a file can be opened is to open it.
msg152519 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-03 14:37
s/sure was/sure way/
msg152530 - (view) Author: David Layton (David.Layton) Date: 2012-02-03 15:59
Eric,

checked that the file exists
But then you’d run into race conditions.  The only sure was to say if a
file can be opened is to open it.

    I think you misunderstand me. I am NOT suggesting that you open and
close the file. I am saying that you should not open it in the first place.
If I cannot open the file at the point in my program where I actually want
to open it, then fine, I can decide, in my own code, what to do.

propensity to encourage bad
> coding.
I think the problem is not as bad as you paint it.  A great number of
unclosed file handles may cause problem to some OSes, but that’s it.  On
the plus side, the fact that argparse accepts for its type argument any
callable that can check and convert a string input is simple, clean and
works.  FileType is needed.

  Causing a problem on some OSes and not others is worse than causing a
problem on all OSes as it increases the likelihood of buggy code passing
tests and moving to production.
 I think argparse is wonderful; I just think that by having FileType not
open the file, the number of it's use cases is increased. As it stands now,
I would prefer the just pass the argument as a string argument and handling
the opening myself unless:

   1. I wanted my program to open the file at the very beginning of the
   program (where one traditionally handles arg parsing)
   2. I wanted to exit on the first, and only, attempt to open the file
   3. I really did not care if the file closed properly--which, granted, is
   often the case with tiny scripts

The moment any of these is not true due to a change in requirements, I will
have to refactor my code to use a filename arg. Where as if I start out
with a bog-standard filename and open it myself, I can easily add the
behaviour I want. I just don't see FileType as a big convenience. However,
I do see that changing this would break backwards compatibility and would
not want to see that happen. Perhaps a new FileNameType that does some
basic, perhaps, optional checks would have wider use-cases.

I hope this helps.

                David Layton

On Fri, Feb 3, 2012 at 2:37 PM, Éric Araujo <report@bugs.python.org> wrote:

>
> Éric Araujo <merwok@netwok.org> added the comment:
>
> s/sure was/sure way/
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue13824>
> _______________________________________
>
msg166068 - (view) Author: Steven Bethard (bethard) * (Python committer) Date: 2012-07-21 21:01
So I generally agree that FileType is not what you want for anything but quick scripts that can afford to either leave a file open or to close stdin and/or stdout. However, quick scripts are an important use case for argparse, so I don't think we should get rid of FileType.

What should definitely happen:

* Someone should add some documentation explaining the risks of using FileType (e.g. forgetting to close the file, or closing stdin/stdout when you didn't mean to).

What could potentially happen if someone really wanted it:

* Someone could create a "safe" replacement, e.g. a FileOpenerType that returns an object with open() and close() methods that do the right things (or whatever API makes sense).
msg198336 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2013-09-23 20:26
In this patch I implement a FileContext class.  It differs from FileType
in 2 key areas:

- it returns a 'partial(open, filename, ...)'
- it wraps '-' (stdin/out) in a dummy context protecting the file from closure.

The resulting argument is meant to be used as:

    with args.input() as f:
        f.read()
        etc

The file is not opened until value is called, and it will be closed after the block.  stdin/out can also be used in this context, but without closing.

The signature for this class is the same as for FileType, with one added parameter, 'style'. (alternative name suggestions welcomed).

class argparse.FileContext(mode='r', bufsize=-1, encoding=None, errors=None, style='delayed')

The default behavior, "style='delayed'", is as described above.

"style='evaluate'" immediately calls the partial, returning an opened file.  This is essentially the same as FileType, except for the stdin/out context wrapping.

"style='osaccess'", adds os.acccess testing to determine whether the 'delayed' file can be read or written.  It attempts to catch the same sort of OS errors that  FileType would, but without actually opening or creating the file.

Most of the added tests in test_argparse.py copy the FileType tests.  I had to make some modifications to the testing framework to handle the
added levels of indirection.

I have not written the documentation changes yet.

A sample use case is:

    import argparse, sys
    p = argparse.ArgumentParser()
    p.add_argument('-d','--delayed', type=argparse.FileContext('r'))
    p.add_argument('-e','--evaluated', type=argparse.FileContext('r', style='evaluate'))
    p.add_argument('-t','--test', dest='delayed', type=argparse.FileContext('r', style='osaccess'))
    p.add_argument('-o','--output', type=argparse.FileContext('w', style='osaccess'), default='-')
    p.add_argument('--unused', type=argparse.FileContext('w', style='osaccess'),help='unused write file')
    args = p.parse_args()

    with args.output() as o:
        if args.delayed:
            with args.delayed() as f:
                print(f.read(), file=o)
        if args.evaluated:
            with args.evaluated as f:
                print(f.read(), file=o)
    # f and o will be closed if regular files
    # but not if stdin/out
    # the file referenced by args.unused will not be created
msg198376 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2013-09-25 07:15
An alternative way to delay the file opening, is to return an instance that has a `filename` attribute, and has an `open` method.  This can be compactly added to the `FileContext` that I defined in the previous patch.  The `FileContext` provides the `_ostest` function (testing using os.access), and wrapper for stdin/out.


    class FileOpener(argparse.FileContext):
        # delayed FileType; alt to use of partial()
        # sample use:
        # with args.input.open() as f: f.read()
        def __call__(self, string):
            string = self._ostest(string)
            self.filename = string
            return self
        def open(self):
            return self.__delay_call__(self.filename)()
        file =  property(open, None, None, 'open file property')
msg215298 - (view) Author: paul j3 (paul.j3) * (Python triager) Date: 2014-04-01 07:36
From http://bugs.python.org/issue14156 
"argparse.FileType for '-' doesn't work for a mode of 'rb'"

I learned that 'stdin/out' can be embedded in an 'open' by using the 'fileno()' and 'closefd=False' (so it isn't closed at the end of open).

With this, the dummy file context that I implemented in the previous patch, could be replaced with:

    partial(open, sys.stdin.fileno(), mode=self._mode,..., closefd=False)

However, as Steven Bethard wrote in the earlier issue, setting up tests when stdin/out will be redirected is not a trivial problem.  So I don't yet have a testable patch.

---------------
And just for the record, the 'osaccess' testing that I wrote earlier, probably should also test that proposed output file string is not already a directory.
History
Date User Action Args
2014-04-01 07:36:51paul.j3setmessages: + msg215298
2013-09-25 07:15:01paul.j3setmessages: + msg198376
2013-09-23 20:26:04paul.j3setfiles: + patch_3.diff
keywords: + patch
messages: + msg198336
2013-09-16 06:57:02paul.j3setnosy: + paul.j3
2012-07-21 21:01:09bethardsetmessages: + msg166068
2012-02-03 15:59:12David.Laytonsetmessages: + msg152530
2012-02-03 14:37:42eric.araujosetmessages: + msg152519
2012-02-03 14:32:58eric.araujosetnosy: + eric.araujo

messages: + msg152518
title: argparse.FileType opens a file without excepting resposibility for closing it -> argparse.FileType opens a file and never closes it
2012-01-19 12:25:39David.Laytoncreate