classification
Title: "unclosed file" warning when defining unused logging FileHandler in dictConfig
Type: Stage:
Components: Library (Lib) Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: vinay.sajip, wdoekes
Priority: normal Keywords:

Created on 2014-12-08 13:46 by wdoekes, last changed 2015-01-06 12:29 by vinay.sajip.

Messages (9)
msg232302 - (view) Author: Walter Doekes (wdoekes) Date: 2014-12-08 13:46
If you define a FileHandler in your logging dictConfig but fail to use it in the same configuration, the file handle is leaked and a ResourceWarning is thrown.

Versions:

    $ python3 -V
    Python 3.4.0
    $ apt-cache show python3.4 | grep ^Version
    Version: 3.4.0-2ubuntu1

Expected:

    $ PYTHONWARNINGS="default,ignore::DeprecationWarning" \
        python3 problem.py 
    imported once

Observed:

    $ PYTHONWARNINGS="default,ignore::DeprecationWarning" \
        python3 problem.py 
    .../lib/python3.4/importlib/_bootstrap.py:656:
        ResourceWarning: unclosed file
          <_io.FileIO name='/tmp/debug.log' mode='ab'>
      code = marshal.loads(data)
    imported once

To reproduce, save below as problem.py:

    if __name__ == '__main__':
        LOGGING = {
            'version': 1,
            'handlers': {
                'logfile': {
                    'level': 'DEBUG',
                    'class': 'logging.FileHandler',
                    'filename': '/tmp/debug.log',
                },
            },
        }
        
        from logging.config import dictConfig
        dictConfig(LOGGING)
        
        # using importlib on a new file triggers the warnings
        import importlib, shutil, os
        shutil.copy(__file__, __file__ + '.new')
        os.unlink(__file__)
        os.rename(__file__ + '.new', __file__)
        importlib.import_module('problem')
    else:
        print('imported once')


Fixed by:

    --- /usr/lib/python3.4/logging/config.py.orig	2014-12-08 14:06:24.911460799 +0100
    +++ /usr/lib/python3.4/logging/config.py	2014-12-08 14:16:17.519460799 +0100
    @@ -637,6 +637,16 @@ class DictConfigurator(BaseConfigurator)
                         except Exception as e:
                             raise ValueError('Unable to configure root '
                                              'logger: %s' % e)
    +
    +                # Close unused loggers
    +                used_handlers = set()
    +                for logger in logging.root.manager.loggerDict.values():
    +                    if hasattr(logger, 'handlers') and logger.handlers:
    +                        used_handlers.add(*logger.handlers)
    +                for name, handler in handlers.items():
    +                    if handler not in used_handlers:
    +                        if hasattr(handler, 'close'):
    +                            handler.close()
             finally:
                 logging._releaseLock()
     

The fix above might not be the best fix, but it proves where the problem is. Perhaps the handlers shouldn't be instantiated until the first user of such a handler is instantiated.

This issue was found by using Django 1.7 with Python 3.4 when enabling the PYTHONWARNINGS.

Cheers,
Walter Doekes
OSSO B.V.
msg232321 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2014-12-08 19:36
Well, it's possible that you could configure handlers in the configuration for later use (i.e. at some point after the dictConfig() call returns).

If you want to avoid opening the file until it's actually needed, you can specify delay=True, and then you shouldn't see the resource leak because the stream will be opened when it's actually needed.

Unless there's some reason you can't do that, or there's something I've overlooked, I think I should close this as invalid.
msg232326 - (view) Author: Walter Doekes (wdoekes) Date: 2014-12-08 20:23
Thanks for the quick response!

> Well, it's possible that you could configure handlers in the configuration for later use (i.e. at some point after the dictConfig() call returns).

After the dictConfig call returns, the StreamHandler/FileHandler is not referenced by anyone anymore. That's what causes the ResourceWarning. Unless I'm severely mistaken, there is no way to reach that old FileHandler instance.

> If you want to avoid opening the file until it's actually needed, you can specify delay=True.

I am aware of that, but that's a workaround, not a fix. (And it has drawbacks of its own, for example in forking and/or setuid situations.)

> Unless there's some reason you can't do that, or there's something I've overlooked, I think I should close this as invalid.

You could do that, but -- barring me overlooking something here -- I think that would only be correct if the dictionary that I passed to dictConfig is judged as being illegal, because it contains unused handlers.

The ResourceWarning thrown is hard to understand because of where it is raised (at random points in a different modules), so like it is now, it may dissuade users from enabling more (visual) warnings. I'd rather see a warning raised earlier from dictConfig() that I configured an unused handler, so I have a better indication of what to fix.
msg232337 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2014-12-08 23:20
The delay= is not really a workaround for this particular circumstance, it's mainly there to avoid leaving empty files around and allocating resources that might never be used (which is analogous to this issue, but delay was not implemented to work around this specific scenario).

The handlers are AFAIK referenced - if you peek at logging._handlerList or logging._handlers you should see them in there.

A config dictionary is not judged as being illegal just because it contains unused handlers - you just happen to allocate resource that you never use. You could just leave out the handlers, since you're never using them; it's not logging's job to be overly restrictive about this sort of thing. Python doesn't warn you for allocating a dictionary that you never populate, or if you populate a dictionary that you then never interrogate. The type of unnecessary allocation of resources you're talking about happens a lot in programs - in fact, it also applies to loggers, since you might never use them in a specific program run, because of which code gets run based on command-line arguments, or with handlers that are "used" (in the sens you mean here, because they are linked to loggers), but never *actually* used because the levels in the configuration are set to CRITICAL, and there are no .critical() calls encountered during a program run. I'm not sure a line can usefully be drawn regarding "useless" allocations.
msg232350 - (view) Author: Walter Doekes (wdoekes) Date: 2014-12-09 08:10
> The handlers are AFAIK referenced - if you peek at logging._handlerList or logging._handlers you should see them in there.

Aha. But that's the point. They aren't. If they were, I wouldn't have any problems with this. But I have this problem because the warnings are generated at startup of my program, not at cleanup.

Look at this:

    $ PYTHONWARNINGS=default python3 problem.py 2>&1 | sed -e 's/^/    /'
    2014-12-09 08:58:45.982258: starting
    2014-12-09 08:58:46.011788: after dictconfig
    /usr/lib/python3.4/importlib/_bootstrap.py:321: ResourceWarning: unclosed file <_io.FileIO name='/tmp/debug.log' mode='ab'>
      return f(*args, **kwds)
    imported once
    2014-12-09 08:58:46.019327: after error
    2014-12-09 08:58:51.024376: after sleep
    _handlerList: []
    _handlers: []

If what you say is true, the following wouldn't be:

- _handlerList is non-empty (wrong, see above)
- the warning would be generated at application shutdown (no, it's generated at importlib time, see the timestamps)

(See source below.)


> You could just leave out the handlers, since you're never using them; it's not logging's job to be overly restrictive about this sort of thing.

I could agree with you there. But in that case I still think we should call it illegal (and perhaps raise a more sensible warning).

Obviously you spot the problem immediately in my toy example. But in a large LOGGING dictionary, items may be shuffled around and changed. When that happens, and these warnings turn up out of the blue, that's distracting to put it mildly.

I expect that you'll reconsider your strict opinion after seeing that logging._handlers is indeed empty and the handlers are not referenced (and therefore should have been destroyed properly or not created at all).

Cheers,
Walter



Example source:

    $ cat problem.py | sed -e 's/^/    /'
    from datetime import datetime
    from time import sleep
    import sys
    
    if __name__ == '__main__':
        LOGGING = {
            'version': 1,
            'handlers': {
                'logfile': {
                    'level': 'DEBUG',
                    'class': 'logging.FileHandler',
                    'filename': '/tmp/debug.log',
                },
            },
        }
    
        print('%s: starting' % (datetime.now(),))
        
        from logging.config import dictConfig
        dictConfig(LOGGING)
        
        print('%s: after dictconfig' % (datetime.now(),))
        sys.stdout.flush()
    
        # using importlib on a new file triggers the warnings
        import importlib, shutil, os
        shutil.copy(__file__, __file__ + '.new')
        os.unlink(__file__)
        os.rename(__file__ + '.new', __file__)
        importlib.import_module('problem')
        sys.stderr.flush()
    
        print('%s: after error' % (datetime.now(),))
        sys.stdout.flush()
    
        sleep(5)
        print('%s: after sleep' % (datetime.now(),))
    
        # Vinay Sajip wrote:
        # > The handlers are AFAIK referenced - if you peek at
        # logging._handlerList or logging._handlers you should see them in
        # there.
        from logging import _handlerList, _handlers
        print('_handlerList:', _handlerList)
        print('_handlers:', [i for i in _handlers.values()])
    else:
        print('imported once')
msg232356 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2014-12-09 08:58
> Aha. But that's the point. They aren't.

Ok, I'll investigate further. Thanks for your persistence :-)
msg233316 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2015-01-02 16:47
Sorry I've not had much time to look at this yet. I haven't forgotten.
msg233353 - (view) Author: Walter Doekes (wdoekes) Date: 2015-01-03 11:19
No worries. I know how it is ;)
Thanks for the update.
msg233531 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2015-01-06 12:29
Data point: if you print out _handlerList immediately after the dictConfig() call, it *is* there, as I would have expected. The following script saved as logtest6.py:

from datetime import datetime
from time import sleep
import sys

if __name__ == '__main__':
    LOGGING = {
        'version': 1,
        'handlers': {
            'logfile': {
                'level': 'DEBUG',
                'class': 'logging.FileHandler',
                'filename': '/tmp/debug.log',
            },
        },
    }

    print('%s: starting' % (datetime.now(),))
    from logging.config import dictConfig
    dictConfig(LOGGING)
    from logging import _handlerList
   
    print('%s: after dictconfig' % (datetime.now(),))
    sys.stdout.flush()
    print('_handlerList 1:', _handlerList)
    # using importlib on a new file triggers the warnings
    import importlib, shutil, os
    print('_handlerList 2:', _handlerList)
    shutil.copy(__file__, __file__ + '.new')
    os.unlink(__file__)
    os.rename(__file__ + '.new', __file__)
    importlib.import_module('logtest6')
    sys.stderr.flush()

    print('%s: after error' % (datetime.now(),))
    sys.stdout.flush()

    sleep(5)
    print('%s: after sleep' % (datetime.now(),))

    # Vinay Sajip wrote:
    # > The handlers are AFAIK referenced - if you peek at
    # logging._handlerList or logging._handlers you should see them in
    # there.
    from logging import _handlerList, _handlers
    print('_handlerList 3:', _handlerList)
else:
    print('imported once')

When run, yields

$ python3.4 logtest6.py 
2015-01-06 12:26:46.910634: starting
2015-01-06 12:26:47.290223: after dictconfig
_handlerList 1: [<weakref at 0xb7223e34; to 'FileHandler' at 0xb71680dc>]
/home/vinay/projects/python/3.4/Lib/collections/__init__.py:373: ResourceWarning: unclosed file <_io.FileIO name='/tmp/debug.log' mode='ab'>
  exec(class_definition, namespace)
_handlerList 2: []
imported once
2015-01-06 12:26:47.388877: after error
2015-01-06 12:26:52.394514: after sleep
_handlerList 3: []
History
Date User Action Args
2015-01-06 12:29:41vinay.sajipsetmessages: + msg233531
2015-01-03 11:19:08wdoekessetmessages: + msg233353
2015-01-02 16:47:01vinay.sajipsetmessages: + msg233316
2014-12-09 08:58:43vinay.sajipsetmessages: + msg232356
2014-12-09 08:10:03wdoekessetmessages: + msg232350
2014-12-08 23:20:50vinay.sajipsetmessages: + msg232337
2014-12-08 20:23:22wdoekessetmessages: + msg232326
2014-12-08 19:36:09vinay.sajipsetmessages: + msg232321
2014-12-08 15:36:05r.david.murraysetnosy: + vinay.sajip
2014-12-08 13:46:54wdoekescreate