classification
Title: logging: cache the traceback text on formatter, instead of record
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: srid, vinay.sajip
Priority: normal Keywords:

Created on 2009-07-08 02:18 by srid, last changed 2009-07-09 23:54 by vinay.sajip. This issue is now closed.

Messages (10)
msg90247 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 02:18
In Formatter.format():

           # Cache the traceback text to avoid converting it multiple 
times
            # (it's constant anyway)
            if not record.exc_text:
                record.exc_text = self.formatException(record.exc_info)

The result must ideally be cache on the formatter object (self) itself 
rather than the record object. Because, record object is reused in 
multiple handlers .. thus multiple formatters. If the first formatter, 
caches the traceback text, then subsequent formatters will use the same 
even though of them is apparently a custom Formatter with overridden 
`formatException`.
msg90248 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 02:25
For example, see: http://paste.pocoo.org/show/127167/
msg90297 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-08 23:31
A real-life recipe where this bug showed up: 

  http://code.activestate.com/recipes/576836/
msg90304 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 06:59
I fail to follow your logic, in the following respects:

1. The exception text cannot be cached in the formatter because a
formatter will format lots and lots of records. Formatters live for a
long time, records do not.
2. The exception text needs to be stored in the record, because in some
instances (e.g. pickling and sending over a socket) this information
will not be available at the other end in any other way.
3. The way it works now, if you have multiple formatters attached to
multiple handlers (e.g. with ISO time for log files, with no ISO time
for console output), then the traceback is only converted to text once.
4. There's nothing stopping you from overriding Formatter.format, is
there? the base version uses the cache, you can override format in your
custom formatter and ignore the cache altogether if you like.

Are you sure you've thought this through? ;-)
msg90305 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-09 07:19
On Wed, 08 Jul 2009 23:59:17 -0700, Vinay Sajip <report@bugs.python.org>  
wrote:

> 1. The exception text cannot be cached in the formatter because a
> formatter will format lots and lots of records. Formatters live for a
> long time, records do not.

Ah, yes.

> Are you sure you've thought this through?

Not thoroughly .. as I missed the point 1. on formatter processing  
multiple records.
msg90306 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 07:33
> Not thoroughly .. as I missed the point 1. on formatter processing  
> multiple records.

But what do you say to points 2, 3 and 4? Of course they may not apply
to your use case, but are they not valid points?
msg90311 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-09 08:06
> 2. The exception text needs to be stored in the record, because in some
> instances (e.g. pickling and sending over a socket) this information
> will not be available at the other end in any other way.

Caching in the record object is thus the way to go. But the cache needs to  
be invalidated when `exc_info` is changed .. as in set to None when it was  
a traceback object. I'd change the following:

         if record.exc_text:
             if s[-1:] != "\n":

to:

         if record.exc_info and record.exc_text:
             if s[-1:] != "\n":

(or, move the body of this IF to the preceding IF)

> 3. The way it works now, if you have multiple formatters attached to
> multiple handlers (e.g. with ISO time for log files, with no ISO time
> for console output), then the traceback is only converted to text once.

Yes, that is the benefit of caching I see.

> 4. There's nothing stopping you from overriding Formatter.format, is
> there? the base version uses the cache, you can override format in your
> custom formatter and ignore the cache altogether if you like.

I can, but I'd rather not duplicate that code. From the recipe I linked  
above:

     def format(self, record):
         # attach 'error:' prefix to error/critical messages
         s = logging.Formatter.format(self, record)
         if record.levelno >= logging.ERROR:
             return 'error: {0}'.format(s)
         else:
             return s

Here, I simply call the base class's `format` method.

And do you know of a better way to suppress traceback output (in the  
custom handler during `log.exception`) than the hack used in the recipe?

         elif record.levelno >= logging.ERROR:
             if record.exc_info and self.verbosity_level < 1:
                 # supress full traceback with verbosity_level <= 0
                 with new_record_exc_info(record, None):
                     self.__emit(record, sys.stderr)
             else:
                 self.__emit(record, sys.stderr)

.. http://code.activestate.com/recipes/576836/
msg90321 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 10:02
> But the cache needs to be invalidated when `exc_info` is changed -
> as in set to None when it was a traceback object.

When does the exc_info change for a record? AFAIK it's set when you 
create a LogRecord, and that's it.

To be honest, I'm not sure why you need that specific recipe. Notice 
the comment made by a user against the recipe. But if you are set on 
doing it like this, then suppose I configure as follows:

Create a StreamHandler(sys.stderr) and a StreamHandler(sys.stdout).
Create a Formatter and attach it to the output handler.
Create a Formatter with the format string used above but with 
an "error:" prefix, and attach it to the error handler.
Add a Filter to each handler. This can be a single Filter class which 
either passes everything above a threshold or everything below a 
threshold. This will only have a few lines of code. Attach two such 
instances, suitably configured, to the appropriate handlers.
Attach both handlers to the root logger (or some other logger).

How will this differ in its effect from your recipe?
msg90347 - (view) Author: Sridhar Ratnakumar (srid) Date: 2009-07-09 20:02
> Vinay Sajip <vinay_sajip@yahoo.co.uk> added the comment:
>
>> But the cache needs to be invalidated when `exc_info` is changed -
>> as in set to None when it was a traceback object.
>
> When does the exc_info change for a record? AFAIK it's set when you
> create a LogRecord, and that's it.

Ok.

> To be honest, I'm not sure why you need that specific recipe. Notice
> the comment made by a user against the recipe.

I've responded as well .. that will explain why I need this specific  
recipe.

> But if you are set on
> doing it like this, then suppose I configure as follows:
>
> Create a StreamHandler(sys.stderr) and a StreamHandler(sys.stdout).
> Create a Formatter and attach it to the output handler.
> Create a Formatter with the format string used above but with
> an "error:" prefix, and attach it to the error handler.
> Add a Filter to each handler. This can be a single Filter class which
> either passes everything above a threshold or everything below a
> threshold. This will only have a few lines of code. Attach two such
> instances, suitably configured, to the appropriate handlers.
> Attach both handlers to the root logger (or some other logger).
>
> How will this differ in its effect from your recipe?

Actually, I did not know about 'Filter' until now. I will give it a try  
sometime.
msg90365 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2009-07-09 23:54
[snip example of how to achieve same effect as recipe in a simpler way]
>> How will this differ in its effect from your recipe?

> Actually, I did not know about 'Filter' until now. I will give it a
> try sometime.

Wouldn't it improve your recipe further to use more effectively
functionality which is already present in the logging package, instead
of re-inventing the wheel? Filter has been part of the logging package
design from the beginning, and is reasonably well documented.

If my suggestion has the same effect as your recipe, it has the further
benefit of users being able to specify thresholds more flexibly. For
example, in your response to a comment on your recipe, you acknowledge
that perhaps the destination you initially chose for DEBUG message might
need to be reconsidered. This means that it's probably better to provide
mechanism, and leave policy more to the user. With moderately careful
design of the Filter class, more flexibility could be provided with no
real penalty in code size or complexity.
History
Date User Action Args
2009-07-09 23:54:21vinay.sajipsetmessages: + msg90365
2009-07-09 20:02:39sridsetmessages: + msg90347
2009-07-09 10:02:27vinay.sajipsetmessages: + msg90321
2009-07-09 08:06:35sridsetmessages: + msg90311
2009-07-09 07:33:38vinay.sajipsetmessages: + msg90306
2009-07-09 07:19:52sridsetmessages: + msg90305
2009-07-09 06:59:14vinay.sajipsetstatus: open -> closed
assignee: vinay.sajip
resolution: not a bug
messages: + msg90304
2009-07-08 23:31:54sridsetmessages: + msg90297
2009-07-08 02:25:09sridsetmessages: + msg90248
2009-07-08 02:18:19sridsettype: behavior
2009-07-08 02:18:02sridcreate