diff -r 3b368c4ad7c5 Doc/library/profile.rst
--- a/Doc/library/profile.rst  Sat May 12 19:02:47 2012 +0200
+++ b/Doc/library/profile.rst  Sat May 12 18:20:48 2012 -0400
@@ -9,7 +9,8 @@
 .. module:: profile
    :synopsis: Python source profiler.
 
-**Source code:** :source:`Lib/profile.py` and :source:`Lib/pstats.py`
+**Source code:** :source:`Lib/profile.py`, :source:`Lib/cProfile.py`,
+:source:`Modules/_lsprof.c` and :source:`Lib/pstats.py`
 
 --------------
 
@@ -22,14 +23,13 @@
    single: deterministic profiling
    single: profiling, deterministic
 
-A :dfn:`profiler` is a program that describes the run time performance of a
-program, providing a variety of statistics. This documentation describes the
-profiler functionality provided in the modules :mod:`cProfile`, :mod:`profile`
-and :mod:`pstats`. This profiler provides :dfn:`deterministic profiling` of
-Python programs. It also provides a series of report generation tools to allow
-users to rapidly examine the results of a profile operation.
+:mod:`cProfile` and :mod:`profile` provide :dfn:`deterministic profiling` of
+Python programs. A :dfn:`profile` is a set of statistics that describes how
+often and for how long various parts of the program executed. These statistics
+can be formatted into reports via the :mod:`pstats` module.
 
-The Python standard library provides two different profilers:
+The Python standard library provides two different implementations of the same
+profiling interface:
 
 1. :mod:`cProfile` is recommended for most users; it's a C extension with
    reasonable overhead that makes it suitable for profiling long-running
@@ -37,14 +37,9 @@
    Czotter.
 
 2. :mod:`profile`, a pure Python module whose interface is imitated by
-   :mod:`cProfile`. Adds significant overhead to profiled programs. If you're
-   trying to extend the profiler in some way, the task might be easier with this
-   module.
-
-The :mod:`profile` and :mod:`cProfile` modules export the same interface, so
-they are mostly interchangeable; :mod:`cProfile` has a much lower overhead but
-is newer and might not be available on all systems. :mod:`cProfile` is really a
-compatibility layer on top of the internal :mod:`_lsprof` module.
+   :mod:`cProfile`, but which adds significant overhead to profiled programs.
+   If you're trying to extend the profiler in some way, the task might be easier
+   with this module.
 
 .. note::
 
@@ -65,57 +60,89 @@
 provides a very brief overview, and allows a user to rapidly perform
 profiling on an existing application.
 
-To profile an application with a main entry point of :func:`foo`, you would add
-the following to your module::
+To profile a function :func:`foo` that takes a single integer argument, you
+would add the following to your module::
 
    import cProfile
-   cProfile.run('foo()')
+   x = 2
+   cProfile.run('foo(x)')
 
 (Use :mod:`profile` instead of :mod:`cProfile` if the latter is not available on
 your system.)
 
-The above action would cause :func:`foo` to be run, and a series of informative
-lines (the profile) to be printed. The above approach is most useful when
-working with the interpreter. If you would like to save the results of a
-profile into a file for later examination, you can supply a file name as the
-second argument to the :func:`run` function::
+The above action would run :func:`foo` and print profile results like the
+following::
+
+      2706 function calls (2004 primitive calls) in 4.504 CPU seconds
+
+   Ordered by: standard name
+
+   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
+        2    0.006    0.003    0.953    0.477 pobject.py:75(save_objects)
+     43/3    0.533    0.012    0.749    0.250 pobject.py:99(evaluate)
+      ...
+
+The first line indicates that 2706 calls were monitored. Of those calls, 2004
+were :dfn:`primitive`. We define :dfn:`primitive` to mean that the call was not
+induced via recursion. The next line: ``Ordered by: standard name``, indicates
+that the text string in the far right column was used to sort the output. The
+column headings include:
+
+ncalls
+   for the number of calls,
+
+tottime
+   for the total time spent in the given function (and excluding time made in
+   calls to sub-functions),
+
+percall
+   is the quotient of ``tottime`` divided by ``ncalls``
+
+cumtime
+   is the total time spent in this and all subfunctions (from invocation
+   till exit). This figure is accurate *even* for recursive functions.
+
+percall
+   is the quotient of ``cumtime`` divided by primitive calls
+
+filename:lineno(function)
+   provides the respective data of each function
+
+When there are two numbers in the first column (for example, ``43/3``), then the
+latter is the number of primitive calls, and the former is the actual number of
+calls. Note that when the function does not recurse, these two values are the
+same, and only the single figure is printed.
+
+Instead of printing the output at the end of the profile run, you can save the
+results to a file by specifying a filename to the :func:`run` function::
 
    import cProfile
-   cProfile.run('foo()', 'fooprof')
+   cProfile.run('foo(x)', 'fooprof')
+
+The :class:`pstats.Stats` class reads profile results from a file and formats
+them in various ways.
 
 The file :file:`cProfile.py` can also be invoked as a script to profile another
 script. For example::
 
-   python -m cProfile myscript.py
+   python -m cProfile [-o output_file] [-s sort_order] myscript.py
 
-:file:`cProfile.py` accepts two optional arguments on the command line::
+``-o`` writes the profile results to a file instead of to stdout.
 
-   cProfile.py [-o output_file] [-s sort_order]
+``-s`` specifies one of the :meth:`pstats.Stats.sort_stats` sort values to sort
+the output by. This only applies when ``-o`` is not supplied.
 
-``-s`` only applies to standard output (``-o`` is not supplied).
-Look in the :class:`Stats` documentation for valid sort values.
-
-When you wish to review the profile, you should use the methods in the
-:mod:`pstats` module. Typically you would load the statistics data as follows::
+The :mod:`pstats` module's :class:`pstats.Stats` class has a variety of methods
+for manipulating and printing the data saved into a profile results file::
 
    import pstats
    p = pstats.Stats('fooprof')
-
-The class :class:`Stats` (the above code just created an instance of this class)
-has a variety of methods for manipulating and printing the data that was just
-read into ``p``. When you ran :func:`cProfile.run` above, what was printed was
-the result of three method calls::
-
   p.strip_dirs().sort_stats(-1).print_stats()
 
-The first method removed the extraneous path from all the module names. The
-second method sorted all the entries according to the standard module/line/name
-string that is printed. The third method printed out all the statistics. You
-might try the following sort calls:
-
-.. (this is to comply with the semantics of the old profiler).
-
-::
+The :meth:`strip_dirs` method removed the extraneous path from all the module
+names. The :meth:`sort_stats` method sorted all the entries according to the
+standard module/line/name string that is printed. The :meth:`print_stats` method
+printed out all the statistics. You might try the following sort calls::
 
    p.sort_stats('name')
    p.print_stats()
 
@@ -170,131 +197,112 @@
 reading and examining profile dumps. It has a simple line-oriented interface
 (implemented using :mod:`cmd`) and interactive help.
 
-
-.. _deterministic-profiling:
-
-What Is Deterministic Profiling?
-================================
-
-:dfn:`Deterministic profiling` is meant to reflect the fact that all *function
-call*, *function return*, and *exception* events are monitored, and precise
-timings are made for the intervals between these events (during which time the
-user's code is executing). In contrast, :dfn:`statistical profiling` (which is
-not done by this module) randomly samples the effective instruction pointer, and
-deduces where time is being spent. The latter technique traditionally involves
-less overhead (as the code does not need to be instrumented), but provides only
-relative indications of where time is being spent.
-
-In Python, since there is an interpreter active during execution, the presence
-of instrumented code is not required to do deterministic profiling. Python
-automatically provides a :dfn:`hook` (optional callback) for each event. In
-addition, the interpreted nature of Python tends to add so much overhead to
-execution, that deterministic profiling tends to only add small processing
-overhead in typical applications. The result is that deterministic profiling is
-not that expensive, yet provides extensive run time statistics about the
-execution of a Python program.
-
-Call count statistics can be used to identify bugs in code (surprising counts),
-and to identify possible inline-expansion points (high call counts). Internal
-time statistics can be used to identify "hot loops" that should be carefully
-optimized. Cumulative time statistics should be used to identify high level
-errors in the selection of algorithms. Note that the unusual handling of
-cumulative times in this profiler allows statistics for recursive
-implementations of algorithms to be directly compared to iterative
-implementations.
-
-
-Reference Manual -- :mod:`profile` and :mod:`cProfile`
-======================================================
+The :mod:`profile` and :mod:`cProfile` Modules
+==============================================
 
 .. module:: cProfile
-   :synopsis: Python profiler
+   :synopsis: Python source profiler.
 
-
-The primary entry point for the profiler is the global function
-:func:`profile.run` (resp. :func:`cProfile.run`). It is typically used to create
-any profile information. The reports are formatted and printed using methods of
-the class :class:`pstats.Stats`. The following is a description of all of these
-standard entry points and functions. For a more in-depth view of some of the
-code, consider reading the later section on Profiler Extensions, which includes
-discussion of how to derive "better" profilers from the classes presented, or
-reading the source code for these modules.
-
+Both the :mod:`profile` and :mod:`cProfile` modules provide the following
+functions:
 
 .. function:: run(command, filename=None, sort=-1)
 
    This function takes a single argument that can be passed to the :func:`exec`
-   function, and an optional file name. In all cases this routine attempts to
-   :func:`exec` its first argument, and gather profiling statistics from the
-   execution. If no file name is present, then this function automatically
-   prints a simple profiling report, sorted by the standard name string
-   (file/line/function-name) that is presented in each line. The following is a
-   typical output from such a call::
+   function, and an optional file name. In all cases this routine executes::
 
-         2706 function calls (2004 primitive calls) in 4.504 CPU seconds
+      exec(command, __main__.__dict__, __main__.__dict__)
 
-      Ordered by: standard name
-
-      ncalls  tottime  percall  cumtime  percall filename:lineno(function)
-           2    0.006    0.003    0.953    0.477 pobject.py:75(save_objects)
-        43/3    0.533    0.012    0.749    0.250 pobject.py:99(evaluate)
-      ...
-
-   The first line indicates that 2706 calls were monitored. Of those calls, 2004
-   were :dfn:`primitive`. We define :dfn:`primitive` to mean that the call was not
-   induced via recursion. The next line: ``Ordered by: standard name``, indicates
-   that the text string in the far right column was used to sort the output. The
-   column headings include:
-
-   ncalls
-      for the number of calls,
-
-   tottime
-      for the total time spent in the given function (and excluding time made in
-      calls to sub-functions),
-
-   percall
-      is the quotient of ``tottime`` divided by ``ncalls``
-
-   cumtime
-      is the total time spent in this and all subfunctions (from invocation till
-      exit). This figure is accurate *even* for recursive functions.
-
-   percall
-      is the quotient of ``cumtime`` divided by primitive calls
-
-   filename:lineno(function)
-      provides the respective data of each function
-
-   When there are two numbers in the first column (for example, ``43/3``), then the
-   latter is the number of primitive calls, and the former is the actual number of
-   calls. Note that when the function does not recurse, these two values are the
-   same, and only the single figure is printed.
-
-   If *sort* is given, it can be one of ``'stdname'`` (sort by filename:lineno),
-   ``'calls'`` (sort by number of calls), ``'time'`` (sort by total time) or
-   ``'cumulative'`` (sort by cumulative time). The default is ``'stdname'``.
-
+   and gathers profiling statistics from the execution. If no file name is
+   present, then this function automatically creates a :class:`pstats.Stats`
+   instance and prints a simple profiling report. If the sort value is
+   specified, it is passed to this :class:`pstats.Stats` instance to control
+   how the results are sorted.
 
 .. function:: runctx(command, globals, locals, filename=None)
 
    This function is similar to :func:`run`, with added arguments to supply the
-   globals and locals dictionaries for the *command* string.
+   globals and locals dictionaries for the *command* string. This routine
+   executes::
 
+      exec(command, globals, locals)
+
+   and gathers profiling statistics as in the :func:`run` function above.
+
+.. class:: Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True)
+
+   This class is normally only used if more precise control over profiling is
+   needed than what the :func:`cProfile.run` function provides.
+
+   A custom timer can be supplied for measuring how long code takes to run via
+   the *timer* argument. This must be a function that returns a single number
+   representing the current time. If the number is an integer, the *timeunit*
+   argument specifies a multiplier giving the duration of each unit of time.
+   For example, if the timer returns times measured in thousands of seconds,
+   the time unit would be ``.001``. For best results, it may be necessary to
+   hard-code the new timer in the C source of :source:`Modules/_lsprof.c`.
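+
+   For instance, a rough sketch of supplying a custom timer -- here
+   :func:`time.process_time`, so that CPU time rather than wall-clock time is
+   measured; the timer choice and the little ``slow_add`` workload are only
+   illustrative, and :meth:`runcall` and :meth:`print_stats` are described
+   below::
+
+      import cProfile, time
+
+      def slow_add(n):
+          return sum(range(n))
+
+      # Profile slow_add() using CPU time instead of the default timer.
+      pr = cProfile.Profile(time.process_time)
+      pr.runcall(slow_add, 10**6)
+      pr.print_stats()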
+
+   Additionally, directly using the :class:`Profile` class allows formatting
+   profile results without writing the profile data to a file::
+
+      import cProfile, pstats, io
+      pr = cProfile.Profile()
+      pr.enable()
+      # ... do something ...
+      pr.disable()
+      s = io.StringIO()
+      ps = pstats.Stats(pr, stream=s)
+      ps.print_stats()
+      print(s.getvalue())
+
+.. method:: Profile.enable()
+
+   Start collecting profiling data.
+
+.. method:: Profile.disable()
+
+   Stop collecting profiling data.
+
+.. method:: Profile.create_stats()
+
+   Stop collecting profiling data and record the results internally as the
+   current profile.
+
+.. method:: Profile.print_stats(sort=-1)
+
+   Create a :class:`pstats.Stats` object based on the current profile and print
+   the results to stdout.
+
+.. method:: Profile.dump_stats(filename)
+
+   Write the results of the current profile to *filename*.
+
+.. method:: Profile.run(cmd)
+
+   Profiles *cmd* via :func:`exec`.
+
+.. method:: Profile.runctx(cmd, globals, locals)
+
+   Profiles *cmd* via :func:`exec` with the specified global and local
+   environment.
+
+.. method:: Profile.runcall(func, *args, **kwargs)
+
+   Profiles the specified function call.
+
+.. _profile-stats:
+
+The :class:`Stats` Class
+========================
 
 Analysis of the profiler data is done using the :class:`pstats.Stats` class.
 
-
 .. module:: pstats
    :synopsis: Statistics object for use with the profiler.
 
-.. class:: Stats(*filenames, stream=sys.stdout)
+.. class:: Stats(*filenames or profile, stream=sys.stdout)
 
    This class constructor creates an instance of a "statistics object" from a
-   *filename* (or set of filenames). :class:`Stats` objects are manipulated by
-   methods, in order to print useful reports. You may specify an alternate output
-   stream by giving the keyword argument, ``stream``.
+   *filename* (or set of filenames) or from a :class:`Profile` instance. Output
+   will be printed to the stream specified by *stream*.
 
   The file selected by the above constructor must have been created by the
   corresponding version of :mod:`profile` or :mod:`cProfile`. To be specific,
@@ -303,15 +311,13 @@
   If several files are provided, all the statistics for identical functions will
   be coalesced, so that an overall view of several processes can be considered in
   a single report. If additional files need to be combined with data in an
-   existing :class:`Stats` object, the :meth:`add` method can be used.
+   existing :class:`pstats.Stats` object, the :meth:`add` method can be used.
 
 .. (such as the old system profiler).
 
+   Instead of reading the profile data from a file, a :class:`cProfile.Profile`
+   or :class:`profile.Profile` object can be used as the profile data source.
 
-.. _profile-stats:
-
-The :class:`Stats` Class
-------------------------
 
 :class:`Stats` objects have the following methods:
 
@@ -468,6 +474,39 @@
 the :meth:`print_callers` method.
 
+
+.. _deterministic-profiling:
+
+What Is Deterministic Profiling?
+================================
+
+:dfn:`Deterministic profiling` is meant to reflect the fact that all *function
+call*, *function return*, and *exception* events are monitored, and precise
+timings are made for the intervals between these events (during which time the
+user's code is executing). In contrast, :dfn:`statistical profiling` (which is
+not done by this module) randomly samples the effective instruction pointer, and
+deduces where time is being spent. The latter technique traditionally involves
+less overhead (as the code does not need to be instrumented), but provides only
+relative indications of where time is being spent.
+
+In Python, since there is an interpreter active during execution, the presence
+of instrumented code is not required to do deterministic profiling. Python
+automatically provides a :dfn:`hook` (optional callback) for each event. In
+addition, the interpreted nature of Python tends to add so much overhead to
+execution, that deterministic profiling tends to only add small processing
+overhead in typical applications. The result is that deterministic profiling is
+not that expensive, yet provides extensive run time statistics about the
+execution of a Python program.
+
+Call count statistics can be used to identify bugs in code (surprising counts),
+and to identify possible inline-expansion points (high call counts). Internal
+time statistics can be used to identify "hot loops" that should be carefully
+optimized. Cumulative time statistics should be used to identify high level
+errors in the selection of algorithms. Note that the unusual handling of
+cumulative times in this profiler allows statistics for recursive
+implementations of algorithms to be directly compared to iterative
+implementations.
+
+
 .. _profile-limits:
 
 Limitations
 ===========
@@ -502,10 +541,10 @@
 and the results are actually better than without calibration.
 
-.. _profile-calibration:
+.. _profile-notes:
 
-Calibration
-===========
+Notes for :mod:`profile`
+========================
 
 The profiler of the :mod:`profile` module subtracts a constant from each event
 handling time to compensate for the overhead of calling the time function, and
@@ -545,50 +584,23 @@
 If you have a choice, you are better off choosing a smaller constant, and then
 your results will "less often" show up as negative in profile statistics.
 
-
-.. _profiler-extensions:
-
-Extensions --- Deriving Better Profilers
-========================================
-
-The :class:`Profile` class of both modules, :mod:`profile` and :mod:`cProfile`,
-were written so that derived classes could be developed to extend the profiler.
-The details are not described here, as doing this successfully requires an
-expert understanding of how the :class:`Profile` class works internally. Study
-the source code of the module carefully if you want to pursue this.
-
-If all you want to do is change how current time is determined (for example, to
-force use of wall-clock time or elapsed process time), pass the timing function
-you want to the :class:`Profile` class constructor::
+If you want to change how current time is determined (for example, to force use
+of wall-clock time or elapsed process time), pass the timing function you want
+to the :class:`Profile` class constructor::
 
    pr = profile.Profile(your_time_func)
 
 The resulting profiler will then call :func:`your_time_func`.
 
-:class:`profile.Profile`
-   :func:`your_time_func` should return a single number, or a list of numbers whose
-   sum is the current time (like what :func:`os.times` returns). If the function
If the function - returns a single time number, or the list of returned numbers has length 2, then - you will get an especially fast version of the dispatch routine. +:func:`your_time_func` should return a single number, or a list of numbers whose +sum is the current time (like what :func:`os.times` returns). If the function +returns a single time number, or the list of returned numbers has length 2, then +you will get an especially fast version of the dispatch routine. - Be warned that you should calibrate the profiler class for the timer function - that you choose. For most machines, a timer that returns a lone integer value - will provide the best results in terms of low overhead during profiling. - (:func:`os.times` is *pretty* bad, as it returns a tuple of floating point - values). If you want to substitute a better timer in the cleanest fashion, - derive a class and hardwire a replacement dispatch method that best handles your - timer call, along with the appropriate calibration constant. - -:class:`cProfile.Profile` - :func:`your_time_func` should return a single number. If it returns - integers, you can also invoke the class constructor with a second argument - specifying the real duration of one unit of time. For example, if - :func:`your_integer_time_func` returns times measured in thousands of seconds, - you would construct the :class:`Profile` instance as follows:: - - pr = profile.Profile(your_integer_time_func, 0.001) - - As the :mod:`cProfile.Profile` class cannot be calibrated, custom timer - functions should be used with care and should be as fast as possible. For the - best results with a custom timer, it might be necessary to hard-code it in the C - source of the internal :mod:`_lsprof` module. +Be warned that you should calibrate the profiler class for the timer function +that you choose. For most machines, a timer that returns a lone integer value +will provide the best results in terms of low overhead during profiling. +(:func:`os.times` is *pretty* bad, as it returns a tuple of floating point +values). If you want to substitute a better timer in the cleanest fashion, +derive a class and hardwire a replacement dispatch method that best handles your +timer call, along with the appropriate calibration constant. \ No newline at end of file