classification
Title: Add caching for the isEnabledFor() computation
Type: Stage:
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: William.Hart, vinay.sajip
Priority: normal Keywords:

Created on 2011-03-02 03:03 by William.Hart, last changed 2011-04-11 18:21 by William.Hart. This issue is now closed.

Files
File name Uploaded Description Edit
logging__init__diffs.txt William.Hart, 2011-03-02 03:03 Diffs in logging/__init__.py between Python 2.7.1 and pyutilib.logging
unnamed William.Hart, 2011-03-04 17:38
unnamed William.Hart, 2011-03-14 06:30
unnamed William.Hart, 2011-04-11 18:21
Messages (8)
msg129851 - (view) Author: William Hart (William.Hart) Date: 2011-03-02 03:03
I recently started using logging extensively in the Coopr software, and I ran into some performance issues when logging was used within frequently used kernels in the code.  After profiling, it became clear that the performance of the logging package could be improved by simply caching the value of the Logger.isEnabledFor() method.

I've created a draft version of this cachine mechanism based on a snapshot of logging that I took from Python 2.7.1.  This is currently hosted in pytuilib.logging, though I'd love to see this migrate into the Python library (see https://software.sandia.gov/trac/pyutilib/browser/pyutilib.logging/trunk/pyutilib/logging).

Basically, I did the following:

1. Added a counter to the Manager class (status) that is incremented whenever the manager object has its level set

2. Add a counter to the Logger class (level_status) that represents the value of the manger status when the Logger's cache was last updated

3. Rework the isEnabledFor() method to update the cache if the logger status is older than the manager status.  I moved the present isEnabledFor logic into the _isEnabledFor() method for simplicity.

The attached file shows the diffs.  Note that there were a few other diffs due to an effort to make pyutilib.logging work on Python 2.5-2.7.

--Bill
msg129860 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-03-02 09:25
Bill,

Thanks for the suggestion and the patch. It's a good idea, though I wonder whether you found that the bulk of the time spent in isEnabledFor() was actually spent in getEffectiveLevel()? That's the one which loops through a logger's ancestors looking for a level which is actually set, so intuitively it would take a while - especially for deeper levels in the logging hierarchy. If so (which I suspect to be the case, but it would be good to have you confirm it), a better solution may be to cache the effective level.

Roughly how deep are/were your logger hierarchies in the situation where you experienced performance problems?

I'm happy to look at caching effective level for Python 3.3: The 2.X branches are now closed for additions other than bugs and security issues.
msg130062 - (view) Author: William Hart (William.Hart) Date: 2011-03-04 17:38
Vinay:

Yes, the bulk of the time was spent in getEffectiveLevel().  Since that
method is only called by isEnabledFor(), it made sense to cache
isEnabledFor().  But, that probably reflects the characteristics of my use
of logging.  I never call getEffectiveLevel() otherwise, so caching
isEnabledFor() minimized the total cost of logging.  I think you could
sensibly argue for caching getEffectiveLevel().

I had somewhat shallow logger hierarchies (3-4 deep).  The real problem was
that this was being called in a kernel of my code ... so even that
additional cost was noteworthy.

I think it's smart to focus on Python 3.3. As it happens, my colleagues did
some explicit caching of logging information that made pyutilib.logging
redundant.  When you don't think you need this as a reference
implementation, I'll delete it.

--Bill

On Wed, Mar 2, 2011 at 2:25 AM, Vinay Sajip <report@bugs.python.org> wrote:

>
> Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment:
>
> Bill,
>
> Thanks for the suggestion and the patch. It's a good idea, though I wonder
> whether you found that the bulk of the time spent in isEnabledFor() was
> actually spent in getEffectiveLevel()? That's the one which loops through a
> logger's ancestors looking for a level which is actually set, so intuitively
> it would take a while - especially for deeper levels in the logging
> hierarchy. If so (which I suspect to be the case, but it would be good to
> have you confirm it), a better solution may be to cache the effective level.
>
> Roughly how deep are/were your logger hierarchies in the situation where
> you experienced performance problems?
>
> I'm happy to look at caching effective level for Python 3.3: The 2.X
> branches are now closed for additions other than bugs and security issues.
>
> ----------
> assignee:  -> vinay.sajip
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue11369>
> _______________________________________
>
msg130077 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-03-04 21:38
> I had somewhat shallow logger hierarchies (3-4  deep).  The real problem was> 
>that this was being called in a kernel of  my code ... so even that

> additional cost was noteworthy.
> 
> I think it's  smart to focus on Python 3.3. As it happens, my colleagues did
> some explicit  caching of logging information that made pyutilib.logging
> redundant.   When you don't think you need this as a reference
> implementation, I'll delete  it.

You can go ahead and do that, since the diff attached to this ticket will give 
me any information I need. I already have a patch of a solution which caches the 
effective level, but I need some time to benchmark what the improvement is, and 
whether it is effective enough to warrant committing. Anyway it's for Python 
3.3, so we have time to get it right :-)

Thanks for your feedback.
msg130497 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-03-10 10:16
Bill,

I was looking at this patch again, and I'm not sure about thread safety. The correctness of the caching depends on manager.status, which is state which is potentially shared across threads. There are no interlocks around it, so with the patch as it stands, ISTM it's possible in a multi-threaded application to get stale information. Has your patch been used in multi-threaded applications?
msg130790 - (view) Author: William Hart (William.Hart) Date: 2011-03-14 06:30
Vinay:

No, I haven't tried this in multi-threaded applications.  You're correct
that this would require locks around the global data.

--Bill

On Thu, Mar 10, 2011 at 3:16 AM, Vinay Sajip <report@bugs.python.org> wrote:

>
> Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment:
>
> Bill,
>
> I was looking at this patch again, and I'm not sure about thread safety.
> The correctness of the caching depends on manager.status, which is state
> which is potentially shared across threads. There are no interlocks around
> it, so with the patch as it stands, ISTM it's possible in a multi-threaded
> application to get stale information. Has your patch been used in
> multi-threaded applications?
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue11369>
> _______________________________________
>
msg133506 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2011-04-11 07:54
I'll regretfully have to mark this as wontfix, since adding threading interlocks for correct operation in multi-threaded environments will negate the performance benefit.
msg133539 - (view) Author: William Hart (William.Hart) Date: 2011-04-11 18:21
Understood!  FYI, we worked around this caching issue explicitly in our
code.  This wound up being simpler than supporting a hacked version of the
logger.

Thanks for looking into this!

On Mon, Apr 11, 2011 at 1:54 AM, Vinay Sajip <report@bugs.python.org> wrote:

>
> Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment:
>
> I'll regretfully have to mark this as wontfix, since adding threading
> interlocks for correct operation in multi-threaded environments will negate
> the performance benefit.
>
> ----------
> resolution:  -> wont fix
> status: open -> closed
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue11369>
> _______________________________________
>
History
Date User Action Args
2011-04-11 18:21:01William.Hartsetfiles: + unnamed

messages: + msg133539
2011-04-11 07:54:37vinay.sajipsetstatus: open -> closed
resolution: wont fix
messages: + msg133506
2011-03-14 06:30:20William.Hartsetfiles: + unnamed

messages: + msg130790
2011-03-10 10:16:32vinay.sajipsetmessages: + msg130497
2011-03-04 21:38:26vinay.sajipsetmessages: + msg130077
2011-03-04 17:38:57William.Hartsetfiles: + unnamed

messages: + msg130062
2011-03-02 09:25:15vinay.sajipsetassignee: vinay.sajip
messages: + msg129860
2011-03-02 03:21:40ned.deilysetnosy: + vinay.sajip
2011-03-02 03:03:35William.Hartcreate