classification
Title: logging: ValueError: I/O operation on closed file
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: hpk, pekka.klarck, srid, vinay.sajip
Priority: normal Keywords:

Created on 2009-06-24 01:42 by srid, last changed 2012-03-08 22:20 by vinay.sajip. This issue is now closed.

Files
File name Uploaded Description Edit
issue6333repro.py srid, 2009-07-08 22:34 Repro script
Messages (20)
msg89653 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-06-24 01:42
The logging module has a bug that tries to call `flush' on a closed
file handle (sys.std[out|err] to be specific). This bug was introduced
by ConsoleHandler as defined in
http://code.activestate.com/recipes/576819/

The fix is simple: change the definition of StreamHandler.flush in
logging/__init__.py to:

  def flush(self):
      if self.stream and hasattr(self.stream, 'flush') and not self.stream.closed:
          self.stream.flush()
msg89654 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-06-24 01:46
BTW, this only happens when running the tests via py.test -
http://pytest.org ... perhaps a threading/multiprocessing issue.
msg90148 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-05 09:32
Can you submit a short script which demonstrates the bug? Your fix is
unfortunately not suitable to use as-is, since the StreamHandler
operates on file-like objects, and these are not guaranteed to have the
"closed" attribute.
msg90287 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 22:34
I have attached a script that repros this bug. Run it using py.test 
(instructions inside the file).

The workaround is to uncomment the last few lines (the 'flush' method).
msg90288 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 22:39
Note: It appears that py.test assigns a temporary file to
sys.std[out|err].
msg90292 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-08 23:06
Perhaps it's a py.test issue. Have you tried with unittest and nose?
Remember, py.test has had little love for a while and so perhaps has a
few rougher edges (though I hear in the blogosphere that the project may
be rejuvenated shortly).
msg90294 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 23:19
Perhaps there is an issue with py.test.

My understanding is: if the logging module is designed to let users
implement their own custom handlers (like I did with ConsoleHandler),
is it not expected that they may do whatever they want with them,
including closing the file (in this case sys.stdout is a temporary
file that is obviously closed at some stage)? Shouldn't the logging
module be reliable in such cases?

I will point holger (author of py.test) to this issue in case he has
any comments.

Anyway, this is a very low-priority bug for me, especially as there is
a workaround.
msg90322 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 10:10
> Shouldn't the logging module be reliable in such cases?

It tries to be, but it's neither omnipotent nor omniscient. It doesn't 
expect streams which it's using to be closed out from under it.

See my comments about your ConsoleHandler recipe(s) on issue6345. Sorry 
if it seems I'm trying to rain on your parade - I'm just calling it as 
I see it.
msg90324 - (view) Author: holger krekel (hpk) Date: 2009-07-09 10:47
I think the issue is unrelated to py.test - it just presents a use case:
it wants to run thousands of tests and capture stdout/stderr for each
test function, cannot guess about a test's logging usage, yet wants to
minimize memory/resource usage and close any temporary files it opens.

Anyway, a standalone minimal example involving the issue is this:

import logging
import StringIO
stream = StringIO.StringIO()
logging.basicConfig(stream=stream)
stream.close() # to free memory/release resources

At exit logging's shutdown() will try to flush/close resulting in an
exception.  Is it a problem to have the logging module be a bit more
defensive and only try a flush/close if the stream isn't already closed?
msg90328 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 13:48
> Is it a problem to have the logging module be a bit more defensive and
> only try a flush/close if the stream isn't already closed?

Not particularly, but it would be better if the following approach were
taken:

import logging
import StringIO
stream = StringIO.StringIO()
h = logging.StreamHandler(stream)
r = logging.getLogger()
r.addHandler(h)
#Do your testing work
r.removeHandler(h)
h.close() # closes underlying stream, also removes logging refs to h

It should be understood that if a stream is passed to a StreamHandler,
the ownership of that stream is passed as well, so closing the handler
rather than the underlying stream is the right thing to do. ISTM this is
good practice in general and not exclusive to logging.
msg90344 - (view) Author: holger krekel (hpk) Date: 2009-07-09 19:33
To recap the use case: stdout is redirected during a test function run
which might trigger arbitrary usage of logging functionality. Not
closing the temporary file would mean that there could be as many open
files as there are test functions - or one needs to rely on garbage
collection for closing the resource - this is generally considered bad
practice. So I consider it best practice to do resource cleanup
immediately and close the temp file.

Note that the test tool *does not touch the logging module at all*, and
it has no way to mandate the usage of the logging module like you suggest.
msg90345 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-09 19:56
On Thu, 09 Jul 2009 06:48:28 -0700, Vinay Sajip <report@bugs.python.org>  
wrote:

> It should be understood that if a stream is passed to a StreamHandler,
> the ownership of that stream is passed as well, so closing the handler
> rather than the underlying stream is the right thing to do. ISTM this is
> good practice in general and not exclusive to logging.

However, sys.stdout|err can be assigned to something else (e.g. py.test
assigning it to an open file object), in which case it is legitimate
for whoever created such objects (py.test) to handle (close) them.
msg90363 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 23:43
> However, sys.stdout|err can be assigned to some thing else
> (eg: py.test assigning it to an open file object) .. in which case it 
> is legitimate to handle (close) such objects (handles) by who
> created it (py.test).

@Sridhar: I disagree. Creation is not ownership. Clearly ownership is
*initially* with the creator, but when a stream is used to initialise a
StreamHandler, ownership of the stream passes to the StreamHandler.

@Holger: I am not asking that py.test be specially modified to cater for
use of a logging module it knows nothing about. But your example showed
logging calls; I indicated what changes were needed to that sample to
interoperate with logging. So if that was a part of the test cases, then
the design of test cases which use logging (and not py.test itself, if
it doesn't invoke logging directly) needs to consider the interaction
between each test case, logging and py.test. And it is not solely the
responsibility of py.test to minimise resource usage during test runs;
part of this is also the responsibility of the test case design, which
surely is why test frameworks have the concepts of setUp and tearDown.

Your example calls logging.basicConfig passing a stream, and the stream
is then immediately closed. If the test case had involved threads which
did logging, for example, how could you guarantee that a call on the
stream (other than flush) would never happen after the call to close the
stream? Would I then have to ensure not only that flush was defensively
programmed, but write as well? It seems like a can of worms to me,
because of the propensity of this kind of approach to mask logic errors.
I'm not saying you never have to resort to defensive programming - just
that errors shouldn't pass silently where this is avoidable. The design
of StreamHandler is predicated on it owning the stream (for example, it
takes care to serialise access to the stream with locks, to allow
thread-safe operation). This would be pointless if other code could
arbitrarily operate on the stream without worrying about thread-safety.

I also pointed out numerous flaws in Sridhar's reasoning both in this
issue and in issue6435 (not py.test related) and in my view, he needs to
look more closely at the logging package and the included functionality,
and if necessary the core's regression test module, test_logging.py, to
see what sorts of issues crop up. For example, I pointed out in msg90148
that file-like objects are not guaranteed to have a "closed" attribute,
to which he has not responded. I would appeal for more attention to detail.

And just in case it may seem otherwise, I'll state explicitly that I am
not trying to be difficult. Similar issues have arisen during the
initial development of test_logging.py. 

I will update the documentation for StreamHandler, FileHandler etc. to
highlight the issue of ownership of the stream, as perhaps it's not
clear enough. Sorry if this post seems a little long-winded.
msg90367 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-10 00:14
On Thu, 09 Jul 2009 16:43:49 -0700, Vinay Sajip <report@bugs.python.org>
wrote:

>> However, sys.stdout|err can be assigned to some thing else
>> (eg: py.test assigning it to an open file object) .. in which case it
>> is legitimate to handle (close) such objects (handles) by who
>> created it (py.test).
> @Sridhar: I disagree. Creation is not ownership. Clearly ownership is
> *initially* with the creator, but when a stream is used to initialise a
> StreamHandler, ownership of the stream passes to the StreamHandler.

Are you suggesting that the ownership of `sys.stderr` belongs to the
logging module once logging.basicConfig (that initializes a StreamHandler
with stderr) is called? That no other module/library is to close
sys.stderr even though they created it (sys.__stderr__ being the backup)?

StreamHandler can take ownership of an arbitrary stream (say, created by
the caller) passed to it, but assuming ownership of a standard stream,
which is free to be overridden by a library (such as py.test), is rather
bizarre.

> I pointed out in msg90148
> that file-like objects are not guaranteed to have a "closed" attribute,
> to which he has not responded.

If this bug is acknowledged as a problem, then coming up with a fix is a  
different issue. I don't know how to detect whether a stream is closed or  
not .. especially when you say that not all file-like objects have a  
'closed' attribute (BTW, which doesn't?), but this is the only solution I  
know currently:

     ...
     # some streams, such as sys.stdout|err, cannot be owned by StreamHandler
     # they might have been closed by whoever customized it.
     closed = hasattr(self.stream, 'closed') and self.stream.closed
     if self.stream and hasattr(self.stream, 'flush') and not closed:
         self.stream.flush()
     ...

-srid
msg90379 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-10 08:03
> Are you suggesting that the ownership of `sys.stderr` belongs to the
> logging module once logging.basicConfig (that initializes a 
> StreamHandler with stderr) is called?
> That no other module/library is to close sys.stderr even though they
> created it (sys.__stderr__ being the backup)?

> StreamHandler can take ownership of an arbitrary stream (say, created
> by the caller) passed to it, but assuming ownership of a
> standard stream, which is free to be overridden by a library (such as
> py.test), is rather bizarre.

Actually, it's not bizarre at all. Are you saying it's OK for multiple
threads in a program to write to sys.stderr, whenever they want, without
restriction, even if the end result is end-user visible output which is
gibberish (for example data from two logical streams interspersed
randomly)? Just because it's a "standard" stream? 

Also, perhaps you haven't noticed that StreamHandler has no way of
knowing whether the stream it was passed was a standard stream or not.
Logging may make certain hasattr-type checks e.g. for presence of
"flush" or "close", but it can't expect to not have ownership of the
stream, even for a standard stream.

I advise you to do the following: refactor your test cases to include
the explicit handler-adding code in your setup (instead of basicConfig),
and the handler-removing and closing code in your teardown.

def setUp(self):
    self.hdlr = logging.StreamHandler(stream)
    someLogger.addHandler(self.hdlr)

def tearDown(self):
    someLogger.removeHandler(self.hdlr)
    self.hdlr.close()

where I have given someLogger module scope in the above snippet; you may
of course make it a testcase attribute too. I've used a unittest style
of setUp/tearDown, and I assume py.test is similar.

No changes to py.test are required. What happens if you do this?
msg90389 - (view) Author: holger krekel (hpk) Date: 2009-07-10 11:59
Actually py.test captures stdout separately for setup and for the test
code. Moreover, functional or integration test code (py.test is not
only for unit tests) can easily trigger some implicit logging-module
usage which cannot easily be factored into a testcase setup method, I
am afraid.
msg155163 - (view) Author: Pekka Klärck (pekka.klarck) Date: 2012-03-08 14:54
The same problem that affected py.test also affected Robot Framework:
http://code.google.com/p/robotframework/issues/detail?id=1079

I was surprised to notice this issue was closed as invalid, even though the problem no longer occurred with Python 2.7. After a little more digging I noticed that the proposed fix has actually been made in r84282 as part of fixing issue 9051. The fix doesn't check for a `closed` attribute but instead silences the possible ValueError.

We also noticed a variation of the problem: If the registered handler ever tries to write anything to its stream you also get an exception. We decided to silence all these errors with this code:

import logging
logging.raiseExceptions = False

Finally, I consider it a separate bug that logging.StreamHandler uses sys.stderr by default. It should use sys.__stderr__ instead.
msg155166 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-03-08 16:49
Re. the fix to #9501, the swallowing of exceptions was done for specific reasons noted in the comments in the exception clause which does the swallowing. This only happens during application shutdown - not all ValueErrors on flush() or close() are swallowed. In general, swallowing these will mask logic errors which close a stream prematurely.

If you pass a stream to a logging.StreamHandler, make sure you remove the handler from any loggers you attached it to, then close it. Otherwise (if you just close the underlying stream) the StreamHandler might still try to use it, leading to the error. Unattaching the handler and closing it tells logging you're done with writing to that stream, so subsequent logging calls won't try to write to that stream.

I'm not aware that every file-like object (including duck-typed ones) will have a 'closed' attribute, so checking for that may not help.

If you want to set sys.stderr to a StringIO() and then create a StreamHandler from it, that's fine. The sequence should be:

set sys.stderr to StringIO instance
create a StreamHandler with sys.stderr
add that handler to a logger

when finished, 

remove that handler from a logger
close the handler (StreamHandler does not close the underlying stream)
restore sys.stderr
close the StringIO instance
access the contents of the StringIO instance using getvalue()
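The sequence above can be sketched as follows. This is a hedged illustration: io.StringIO replaces the 2.x StringIO module, the logger name is invented, and the contents are read just before closing, since io.StringIO refuses getvalue() on a closed buffer.

```python
import io
import logging
import sys

captured = io.StringIO()
old_stderr, sys.stderr = sys.stderr, captured   # set sys.stderr to a StringIO

handler = logging.StreamHandler(sys.stderr)     # handler on the replacement
logger = logging.getLogger("capture.demo")      # hypothetical logger name
logger.addHandler(handler)
logger.error("logged while stderr was captured")

logger.removeHandler(handler)                   # tell logging we're done...
handler.close()                                 # ...before touching the stream
sys.stderr = old_stderr                         # restore the real stderr

output = captured.getvalue()                    # read before closing the buffer
captured.close()
print(output)
```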

If things are not done in that order, it's possible for arbitrary code (e.g. in a 3rd-party library) to log an event, because of which logging then tries to write to the stream. If that stream is closed, you get the ValueError - that's a warning that something isn't being done correctly, and the answer is not to just swallow that exception.

Re. your proposed fix - it may be fine in your specific scenario, but I wouldn't recommend it in general because it may mask logic errors in a program.

Regarding your preference for sys.__stderr__ rather than sys.stderr, that's all it is - a preference. You can easily pass the exact stream that you want to StreamHandler, including sys.__stderr__, so the fact that it picks sys.stderr by default shouldn't bother you - just be explicit.

Of course, that default - whatever the rights and wrongs of it - cannot be changed without breaking existing code, so it has to stay that way.
msg155181 - (view) Author: Pekka Klärck (pekka.klarck) Date: 2012-03-08 21:07
@vinay.sajip the problem is that many tools that don't do anything with the logging module intercept sys.stdout and sys.stderr. Such tools typically aren't even aware of the libraries they use (or test) using logging, much less of them registering a StreamHandler with sys.stderr.

It's great that nowadays you don't always get ValueError at exit anymore. Now you only get them if the StreamHandler itself tries to use the stream after sys.stderr is restored and the intercepting stream closed. Typically this only happens in error situations so the problem isn't so severe.

IMHO nobody should ever register a StreamHandler with sys.stderr because you cannot be sure whether someone has intercepted it or not. There's a strong convention that sys.__stderr__ should not be intercepted, so using it is safe.

I strongly believe StreamHandler should be changed to use sys.__stderr__ by default. That shouldn't cause problems because a) if sys.stderr isn't intercepted sys.stderr is sys.__stderr__, and b) if sys.stderr is intercepted you most likely already have problems. I know stuff like this cannot be changed other than in major releases and there should perhaps even be a deprecation period. I also acknowledge that it might be too much work to be worth the effort. Mentioning this issue in the docs might be a good idea nevertheless.
msg155184 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-03-08 22:20
> IMHO nobody should ever register a StreamHandler with sys.stderr because
> you cannot be sure whether someone has intercepted it or not. There's a
> strong convention that sys.__stderr__ should not be intercepted, so using
> it is safe.

Actually, I believe it is reasonable to intercept sys.stderr, even with logging in use, because you might be testing that logging actually works in a common scenario (use of StreamHandler in the default configuration). To do that, you would need to capture sys.stderr and check that its contents contain logged messages. For example, the Python test suite contains specific tests which intercept (and restore) sys.stderr, for this very purpose.

If one is careful to follow the sequence of events I mentioned in my last response, it should be safe to intercept sys.stderr. You only have to remember that if you have passed it to logging to use for I/O, logging may try to use it at times you don't expect (e.g. because of logging by third-party code). Just remove it from logging before you close it, and all will be well. This, of course, applies to any stream you pass to logging - not just sys.stderr or a replacement for it like a StringIO.

Using sys.__stderr__ is of course doable right now by your code; as you are the first one to mention this as desirable, I think it quite likely that people are working with the existing default without any problems. So it's unlikely that this default will change (as it doesn't need to - you can be explicit about the stream you want to use). If there is a lot of adverse feedback about the default (unlikely, since it's been there ever since logging was added to Python - about 10 years ago) then I will of course reconsider.
History
Date User Action Args
2012-03-08 22:20:42vinay.sajipsetmessages: + msg155184
2012-03-08 21:07:59pekka.klarcksetmessages: + msg155181
2012-03-08 16:49:26vinay.sajipsetmessages: + msg155166
2012-03-08 14:54:38pekka.klarcksetnosy: + pekka.klarck
messages: + msg155163
2009-07-10 11:59:22hpksetmessages: + msg90389
2009-07-10 08:03:05vinay.sajipsetmessages: + msg90379
2009-07-10 08:01:29vinay.sajipsetmessages: - msg90378
2009-07-10 08:00:02vinay.sajipsetmessages: + msg90378
2009-07-10 00:14:35sridsetmessages: + msg90367
2009-07-09 23:43:45vinay.sajipsetmessages: + msg90363
2009-07-09 23:08:29vinay.sajipsetmessages: - msg90346
2009-07-09 19:58:46sridsetmessages: + msg90346
2009-07-09 19:56:32sridsetmessages: + msg90345
2009-07-09 19:33:45hpksetmessages: + msg90344
2009-07-09 13:48:27vinay.sajipsetmessages: + msg90328
2009-07-09 10:47:45hpksetnosy: + hpk
messages: + msg90324
2009-07-09 10:10:58vinay.sajipsetstatus: open -> closed
resolution: not a bug
messages: + msg90322
2009-07-08 23:19:07sridsetmessages: + msg90294
2009-07-08 23:06:42vinay.sajipsetmessages: + msg90292
2009-07-08 22:39:02sridsetmessages: + msg90288
2009-07-08 22:34:03sridsetfiles: + issue6333repro.py

messages: + msg90287
2009-07-07 09:28:53ezio.melottisetpriority: normal
status: pending -> open
stage: test needed
versions: + Python 2.7, - Python 2.6
2009-07-05 09:32:47vinay.sajipsetstatus: open -> pending

nosy: + vinay.sajip
messages: + msg90148

assignee: vinay.sajip
2009-06-24 01:46:11sridsetmessages: + msg89654
2009-06-24 01:42:46sridcreate