classification
Title: Add a new tracemalloc module to trace memory allocations
Type: enhancement Stage:
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: 16742 Superseder:
Assigned To: Nosy List: Jim.Jewett, amaury.forgeotdarc, belopolsky, eric.snow, haypo, jcea, neologix, python-dev, rhettinger, tim.peters, vajrasky
Priority: normal Keywords: patch

Created on 2013-08-28 23:53 by haypo, last changed 2013-12-10 17:49 by jcea. This issue is now closed.

Files
File name Uploaded Description Edit
8b34364a66a9.patch haypo, 2013-11-09 18:44 review
4430e893d89f.patch haypo, 2013-11-20 16:51 review
pack.patch haypo, 2013-11-20 21:52
3de17d13002d.patch haypo, 2013-11-21 11:43 review
start_nframe.patch haypo, 2013-11-23 01:37
minor_makeup_test_tracemalloc.patch vajrasky, 2013-11-25 06:45 review
Repositories containing patches
http://hg.python.org/features/tracemalloc
Messages (57)
msg196436 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-28 23:53
Thanks to the PEP 445, it becomes possible to trace easily memory allocations. I propose to add a new tracemalloc module computing the memory usage per file and per line number.

I implemented the module as a third party module for Python 2.5-3.4, but it requires to patch and recompile Python:
https://pypi.python.org/pypi/pytracemalloc


My proposed implementation is different:

* use a simple C implementation of an hash table called "cfuhash" (coming from the libcfu project) instead of depending on the glib library ; I simplified and adapted the implementation for my usage

* no enable() / disable() function: tracemalloc can only be enabled before startup by setting PYTHONTRACEMALLOC=1 environment variable

* traces (size of the memory block, Python filename, Python line number) are stored directly in the memory block, not in a separated hash table

I chose PYTHONTRACEMALLOC env var instead of enable()/disable() functions to be able to really trace *all* memory allocated by Python, especially memory allocated at startup, during Python initialization.


TODO:

* The (high-level) API should be reviewed and discussed

* The documention should be improved

* PyMem_Raw API is not hooked yet because the code is not thread-safe (it relies on the Python GIL)

* Filenames should not be encoded: the hash table should directly use Unicode strings (PyObject*), encoding a filename allocates memory and may call the garbage collector

* Memory is not released at exit


For the documentation, see the following page:
https://pypi.python.org/pypi/pytracemalloc


See also the https://bitbucket.org/haypo/pyfailmalloc project and the issue #18408.
msg196437 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-29 00:07
Example of output:


$ PYTHONTRACEMALLOC=1 ./python -m test
...
== CPython 3.4.0a1+ (default:2ce9e5f6b47c+, Aug 29 2013, 02:03:02) [GCC 4.7.2 20121109 (Red Hat 4.7.2-8)]
==   Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow little-endian
==   /home/haypo/prog/python/tracemalloc/build/test_python_11087
...
[  1/380] test_grammar
[  2/380] test_opcodes
[  3/380] test_dict
[  4/380] test_builtin
[  5/380] test_exceptions
[  6/380] test_types
[  7/380] test_unittest

2013-08-29 02:06:22: Top 25 allocations per file and line
#1: <frozen importlib._bootstrap>:704: size=5 MiB, count=56227, average=105 B
#2: .../tracemalloc/Lib/linecache.py:127: size=1004 KiB, count=8706, average=118 B
#3: .../Lib/unittest/mock.py:1764: size=895 KiB, count=7841, average=116 B
#4: .../Lib/unittest/mock.py:1805: size=817 KiB, count=15101, average=55 B
#5: .../Lib/test/test_dict.py:35: size=768 KiB, count=8, average=96 KiB
#6: <frozen importlib._bootstrap>:274: size=703 KiB, count=4604, average=156 B
#7: ???:?: size=511 KiB, count=4445, average=117 B
#8: .../Lib/unittest/mock.py:350: size=370 KiB, count=1227, average=308 B
#9: .../Lib/unittest/case.py:306: size=343 KiB, count=1390, average=253 B
#10: .../Lib/unittest/case.py:496: size=330 KiB, count=650, average=521 B
#11: .../Lib/unittest/case.py:327: size=291 KiB, count=717, average=416 B
#12: .../Lib/collections/__init__.py:368: size=239 KiB, count=2170, average=113 B
#13: .../Lib/test/test_grammar.py:132: size=195 KiB, count=1250, average=159 B
#14: .../Lib/unittest/mock.py:379: size=118 KiB, count=152, average=800 B
#15: .../tracemalloc/Lib/contextlib.py:37: size=102 KiB, count=672, average=156 B
#16: <frozen importlib._bootstrap>:1430: size=91 KiB, count=1193, average=78 B
#17: .../tracemalloc/Lib/inspect.py:1399: size=79 KiB, count=104, average=784 B
#18: .../tracemalloc/Lib/abc.py:133: size=77 KiB, count=275, average=289 B
#19: .../Lib/unittest/case.py:43: size=73 KiB, count=593, average=127 B
#20: .../Lib/unittest/mock.py:491: size=67 KiB, count=153, average=450 B
#21: <frozen importlib._bootstrap>:1438: size=64 KiB, count=20, average=3321 B
#22: .../Lib/unittest/case.py:535: size=56 KiB, count=76, average=766 B
#23: .../tracemalloc/Lib/sre_compile.py:508: size=54 KiB, count=115, average=485 B
#24: .../Lib/unittest/case.py:300: size=48 KiB, count=616, average=80 B
#25: .../Lib/test/regrtest.py:1207: size=48 KiB, count=2, average=24 KiB
7333 more: size=4991 KiB, count=28051, average=182 B
Total Python memory: size=17 MiB, count=136358, average=136 B
Total process memory: size=42 MiB (ignore tracemalloc: 22 KiB)
msg196438 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-29 00:18
tracemalloc.get_process_memory() should be moved to the shutil module.
msg196439 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-29 00:57
I'm not sure that the DisplayGarbage class is still useful. The PEP 442 has been implemented in Python 3.4, so objects with __del__() method are no more "uncollectable". I don't know which kind of objects can enter the gc.garbage list (by default, not using gc.set_debug(gc.DEBUG_SAVEALL)).
msg196474 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-08-29 18:10
Is it really impossible to use a standard Python dict instead of cfuhash?
msg196484 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-29 19:51
> Is it really impossible to use a standard Python dict instead of cfuhash?

I don't know if it is possible or not, I didn't try. Using PyDict has many requirements:

* the GIL must be held: I plan to hook also PyMem_Raw (which is called without the GIL being held)
* PyDict uses PyMem_Malloc() which would be a reentrant call
* the API uses PyObject* whereas tracemalloc uses C types and C structures
* PyDict can raises Python exceptions, which is not something expected from a memory allocator (especially PyMem_Malloc)
* etc.

The implementation is not done yet. I would like to be able to compute the size of memory allocated by the _tracemalloc module: size of hash tables.
msg196525 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-08-30 12:07
Ok, let's start with a first patch. It works in the common cases, because they are some corner cases like subinterpreter which might still crash.

The hook on PyMem_RawMalloc() takes the GIL. It is disabled because it has still bugs (it introduces a deadlock in some corner cases!)
msg196720 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-01 12:49
New patch (version 2):

- an hash table entry can now contain the data directly instead of a pointer to the data
- _tracemalloc._get_stats() copies the hash table to not have to lock the hash table too long, which will be especially useful for TRACE_RAW_MALLOC
- Add get_object_trace() function which returns a namedtuple
- get_process_memory() returns a namedtuple
- DisplayGarbage class has been removed: the PEP 442 has been implemented, it is no more interesting to trace uncollectable objects because they became very rare. If you still want to do that, get_object_trace() is available


TODO:

- TRACE_RAW_MALLOC define is still disabled because it introduces a deadlock when PyMem_RawMalloc() is called indirectly by Py_NewInterpreter()
- minor pending FIXME in the code
- review the high level API: add an helper to build a diff between two snapshots and drop colors (not portable)?


test_capi deadlock with TRACE_RAW_MALLOC:

    test_subinterps (test.test_capi.SubinterpreterTest) ... ^C
    Program received signal SIGINT, Interrupt.
    0xb7fdd424 in __kernel_vsyscall ()
    (gdb) where
    ...
    #2  0x081abc1c in PyCOND_TIMEDWAIT (cond=0x83470e0, mut=0x8347110, us=5000) at Python/condvar.h:103
    #3  0x081ac55c in take_gil (tstate=0x8349c78) at Python/ceval_gil.h:224
    #4  0x081ad0a9 in PyEval_RestoreThread (tstate=0x8349c78) at Python/ceval.c:443
    #5  0x081f08e3 in PyGILState_Ensure () at Python/pystate.c:786
    #6  0x0808df1b in trace_get_filename (trace=0x84c26b8, gil_held=0) at ./Modules/_tracemalloc.c:555
    #7  0x0808e314 in trace_log_alloc (ptr=0x84c9f18, size=52, gil_held=0) at ./Modules/_tracemalloc.c:698
    #8  0x0808e449 in trace_malloc (ctx=0x8342890, size=52, gil_held=0) at ./Modules/_tracemalloc.c:744
    #9  0x0808e5b8 in trace_raw_malloc (ctx=0x8342890, size=52) at ./Modules/_tracemalloc.c:811
    #10 0x0805d32a in PyMem_RawMalloc (size=52) at Objects/obmalloc.c:256
    #11 0x081eeb79 in PyInterpreterState_New () at Python/pystate.c:63
    #12 0x08060c8b in Py_NewInterpreter () at Python/pythonrun.c:700
    #13 0xb74d4b90 in run_in_subinterp ()
    #14 0x080e9e7a in PyCFunction_Call ()
    ...
msg196879 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-03 23:15
Oh, another TODO: drop the (optional) dependency to psutil. Implement get_process_memory() in C (in the _tracemalloc module).
msg197951 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-16 22:57
Updated patch.
msg197952 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-16 23:44
Include/tracemalloc.h: PyTraceMalloc_DisableTemporary() and PyTraceMalloc_RestoreTemporary() should be removed, they were tests to try to fix issues with subinterpreter when tracing PyMem_RawMalloc().

Lib/test/regrtest.py: These changes should not be commited, they are just here to create easily snapshot files (./python -X tracemalloc -m test).
msg197954 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-16 23:46
TODO list:

* remove dependency to psutil
* enable PYMEM_RAW: fix subinterpreter issue, check which variables are protected by which lock
* test the command line interface?
* don't trace objects created by get_stats(), reentrant flag should be a thread-local variable
* add an hash table of tracebacks: 86% of tracebacks are duplicated

See also the TODO.txt file in the tracemalloc repository.
msg198765 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-01 11:49
The issue #16742 must be fixed to be able to trace memory blocks allocated by PyMem_RawMalloc().
msg198767 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-01 11:57
For the development, it would also be nice to fix #18948.
msg199039 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-10-06 07:42
I recommend that tracemalloc focus exclusively on real allocations and ignore freelisting.   The former are interesting because they directly affect performance.  Freelists on the other hand are close to being free (as in beer).

Measuring the freelist usage (or disabling it and then measuring) makes the proposal more complex than necessary and doesn't provide useful information (freelisting is no more interesting to track than saving a value in a static variable).

Another suggestion is to expand the API to provide a way to identify a potential performance killer:  the number of reallocs() than result in a new pointer (because the data moved).

If you use hash table code from another project, be sure to check its licensing.
msg199040 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-06 08:01
2013/10/6 Raymond Hettinger <report@bugs.python.org>:
> I recommend that tracemalloc focus exclusively on real allocations and ignore freelisting.   The former are interesting because they directly affect performance.  Freelists on the other hand are close to being free (as in beer).

The version of tracemalloc available on PyPI has patches to disable
completly freelists or trace usage of freelists. It is important in
Python 2 because some lists had no limit, especially int types. The
version I wrote for Python 3.4 does not anything on freelists.

Did you read the source code of the repository on python.org?

> Measuring the freelist usage (or disabling it and then measuring) makes the proposal more complex than necessary and doesn't provide useful information (freelisting is no more interesting to track than saving a value in a static variable).

I would be nice to add metrics on freelists: number and size. Using
tracemalloc, I'm trying to trace any byte. Such metrics can be
optional (ex: a new method added to the Snapshot class).

sys._debugmalloc() does already print all free lists. It should not be
hard to write these information into structures, as I did for pymalloc
statistics.

Adding a method to read memory usage once is different than
maintaining a structure (hash tables) to know that status anytime.

> Another suggestion is to expand the API to provide a way to identify a potential performance killer:  the number of reallocs() than result in a new pointer (because the data moved).

I can add a global counter for each allocator (malloc, realloc, free)
and each domain (raw, mem, obj). But I don't want to add a counter per
trace, it would have a price on performances and memory footpring.

> If you use hash table code from another project, be sure to check its licensing.

Done: see
https://mail.python.org/pipermail/python-legal-sig/2013-September/000044.html
msg199041 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-06 08:19
>> If you use hash table code from another project, be sure to check its licensing.

> Done: see
> https://mail.python.org/pipermail/python-legal-sig/2013-September/000044.html

I contacted the original author and python-legal mailing list. The BSD
license asks to mention the license: done in Doc/license.html.

Note: cfuhash hash table was heavily modified.
msg199085 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-06 15:59
ec121a72e848.patch: updated patch, based on revision ec121a72e848 of the tracemalloc repository.
msg199086 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-06 16:00
TODO list:

* unit test for task reschedule
* hash_destroy(): use a double-linked list to avoid the O(n) complexity?
* Snapshot.add_process_memory_metrics():

  * rename metrics on Windows?
  * more metrics on Linux?
  * implement get_process_memory() on BSD

* add unit test for the command line interface
* add unit test for DisplayTopTask/TakeSnapshot.start()
* remove TRACE_RAW_MALLOC, always define it?
* test_io.check_interrupted_write() enters a deadlock when tracemalloc task
  is called before _pyio is blocked in the C write() function: the SIGINT does
  not interrupt write() but another instruction
* hide _tracemalloc._atexit()?
msg199123 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-07 00:51
ec121a72e848.patch:

- ignore changes on Lib/test/support/__init__.py: it's my own fix for issue #18948 which will be fixed differently
- ignore changes on Lib/test/regrtest.py: they should not be commited, it's just a convinient way to test tracemalloc (python -X tracemalloc -m test -r)
- changes on Modules/readline.c: this is the fix for the issue #16742
- Objects/codeobject.c: calling PyUnicode_READY(filename) in PyCode_New() is useful on Windows in debug mode, the filename may not be ready, whereas tracemalloc requires a ready Unicode string. This change can probably be fixed in default independently
- Objects/obmalloc.c: changes on _PyMem_Debug  and changes replacing PyMem_Malloc() with PyMem_RawMalloc() are a try to reuse pymalloc allocator for PyMem_Malloc(). This should be discussed independently

Sorry for all these unrelated change, I will try to cleanup the repository in the next patch.
msg199386 - (view) Author: Roundup Robot (python-dev) Date: 2013-10-10 14:02
New changeset bea4447c22bf by Victor Stinner in branch 'default':
Issue #18874: PyCode_New() now ensures that the filename is a ready Unicode
http://hg.python.org/cpython/rev/bea4447c22bf

New changeset ba27cba3ae20 by Victor Stinner in branch 'default':
Issue #18874: _PyObject_Malloc/Realloc/Free() now falls back on
http://hg.python.org/cpython/rev/ba27cba3ae20
msg200341 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-10-19 00:46
Update the patch to the last implementation.
msg202024 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-03 13:27
Updated patch to the latest implementation: 65e72bf01246.patch

The developement is now done in the start branch of  http://hg.python.org/features/tracemalloc
msg202438 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-08 17:28
Patch updated to the latest version of the PEP 454.
msg202492 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-09 18:44
> 2013-11-09 06:53:03	neologix	set	files: + 3bf73dcd0b42.diff

I guess that you clicked on [Create Patch]. This button doesn't work, it generates a patch with unrelated changes. Maybe I did something wrong on features/tracemalloc repository.

Here is a patch of the revision 8b34364a66a9 (which does not contain add_filter()): 8b34364a66a9.patch
msg203343 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-19 08:08
Would it be possible to generate a clean patch?
The latest one contains many unrelated commits.
msg203485 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-20 14:51
I made a review at http://bugs.python.org/review/18874/#ps9860 (not sure you got notified).
msg203508 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 16:51
Patch updated for Charles François's comments.
msg203534 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 21:47
Charles-François doesn't like packed structure (frame_t) because he fears crash on architectures not supporting non-aligned memory access or bad performances. Antoine and me want them to reduce the memory footprint of the tracemalloc module (tracemalloc.get_tracemalloc_memory()).

I think that the memory footprint has an higher price than the performances for tracemalloc: I can wait longer for a result, whereas I may not be able to use tracemalloc if it uses too much memory.

I propose to pack frame_t structure, and only disable it explicitly on architectures where tracemalloc does crash. What do you think?
msg203536 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-20 21:52
pack.patch: Patch to pack frame_t structure.

I tested tracemalloc with packed structure on Linux/AMD64, FreeBSD/AMD64, OpenBSD/AMD64, OpenIndiana/AMD64, Windows/x86 (Windows 7 64-bit with Python compiled in 32-bit). I don't have access to SPARC.
msg203587 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-21 08:46
If you really want to use packing, keep it.

But please remove this:
"""
+    /* ensure that the frame_t structure is packed */
+    assert(sizeof(frame_t) == (sizeof(PyObject*) + sizeof(int)));
"""
msg203589 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 08:56
> If you really want to use packing, keep it.
>
> But please remove this:
> """
> +    /* ensure that the frame_t structure is packed */
> +    assert(sizeof(frame_t) == (sizeof(PyObject*) + sizeof(int)));
> """

I added this assertion to ensure that I used correct GCC __attribute__((packed)) and Visual Studio #pragma pack(4). It can now be removed, I checked at least one per compiler that the structure is packed :-)

I will add a comment explaining that packing the structure is an optimization to reduce the memory footprint, it can be disabled if tracemalloc does crash because of it.
msg203615 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 11:43
3de17d13002d.patch:
- take in account Antoine Pitrou's and vajrasky's comments
- pack again frame_t structure (without the assertion and with a comment)
msg203616 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 11:50
Oh yes, Antoine also asked if Python/ and Include/ directories are the right place for hashtable.c and hashtable.h files. _Py_slist and _Py_hashtable are only used in Modules/_tracemalloc.c yet.

It was discussed to reuse _Py_slist somewhere (I don't remember where) and _Py_hashtable for the methods cache.

Reuse _Py_hashtable for the methods cache is not trivial, the API doesn't fit exactly. I didn't spend much time to try to adapt _Py_hashtable API or the methods cache.

_Py_hashtable API is written for performances, not for usability. Useless function calls are denied: they are punished by an assertion error (ex: _Py_hashtable_delete() if the key does not exist). An hash entry contains directly data, so you have to add a pointer with a size to store or retrieve an entry. I like my own API, but others may prefer a different API :-) Anyway, the _Py_hashtable API is fully private and excluded from the stable ABI.

What do you think? Should we start by moving them to Modules/ and only move them again if they are used in more than one place?

(Another little issue is that "hashtable" looks like Python dict, whereas Python dict is implemeted in Objects/dictobject.c).
msg203620 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 12:07
> It was discussed to reuse _Py_slist somewhere (I don't remember where) (...)

I did a quick search in Python code based using "link.*list" regex.

Structures linked in a single-linked list  (candidate for _Py_slist):

* Modules/_curses_panel.c: list_of_panels
* Modules/_elementtree.c: ParentLocator
* Modules/floatobject.c: free_list
* Objects/obmalloc.c: arena_object.freepools
* Python/compile.c: basicblock
* Python/pyarena.c: block
* Include/pystate.h: PyInterpreterState.tstate_head
* Python/thead.c: key

Structures linked in a double-linked list:

* Include/objimpl.h: PyGC_Head
* Include/object.c: PyObject (in debug mode)
* Modules/_collectionsmodule.c: block
* Objects/obmalloc.c: arena_object
* Include/weakrefobject.h: _PyWeakReference
msg203621 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 12:09
Oh, I just realized _Py_slist_init(), _Py_slist_prepend(), and _Py_slist_remove() functions are currently private (static in hashtable.c). These functions should be declared in hashtable.h if _Py_slist API is reused.
msg203624 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 12:24
"Reuse _Py_hashtable for the methods cache is not trivial, the API doesn't fit exactly. I didn't spend much time to try to adapt _Py_hashtable API or the methods cache."

I don't understand which part is the key and which part is the enetry data in method_cache_entry structure:

struct method_cache_entry {
    unsigned int version;
    PyObject *name;             /* reference to exactly a str or None */
    PyObject *value;            /* borrowed */
};

_PyType_Lookup() compares name and version. Does it mean that version is part of the key?

If it's possible to only store name as the key and store version and value in the data, it would be simple to reuse _Py_hashtable. I guess that it would be possible to get a cache entry using the name and later compare the version, it the version doesn't match: don't use the cache. It would require to copy the data (version+value: short structure, a few bytes) from the hash table entry to a local variable (allocated in the stack).
msg203659 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-11-21 18:00
FYI, the C OrderedDict implementation in #16991 implements its own doubly-linked list, built around the needs of OrderedDict.  I looked into BSD's queue.h [1], but it ended up simpler to roll my own.

[1] http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/sys/queue.h?rev=1.56
msg203669 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 20:22
> FYI, the C OrderedDict implementation in #16991 implements its own doubly-linked list, built around the needs of OrderedDict.  I looked into BSD's queue.h [1], but it ended up simpler to roll my own.

Oh nice, it's probably possible to factorize code and have more unit
tests on the implementation. It may be better to discuss this in a
separated issue.
msg203670 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-21 20:50
I'm not convinced myself that hashtable.c/h can be reused immediatly, so I prefer to move these two files to Modules. The files may be moved later if the containers are reused.
msg203676 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-21 21:57
> STINNER Victor added the comment:
>
> I'm not convinced myself that hashtable.c/h can be reused immediatly, so I prefer to move these two files to Modules. The files may be moved later if the containers are reused.

Please do so. I'd also like to open an issue for this (to try to make
hashtable reusable), to not forget it.
msg203815 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2013-11-22 17:25
Victor, is the attached patch up-to-date?
msg203862 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-22 20:30
>
> Yes, it is up to date. The only difference in the mercurial repository are
> the new location of the two hashtable files.
msg203947 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-23 01:37
Oh, Jim Jewett found a bug: if set_traceback_limit() is called while tracemalloc is tracing, two tracebacks are seen different because their length is different, whereas the full traceback would be the same.

To avoid this issue, I propose to add an optional nframe parameter to the set_traceback_limit() function.

With this API, it may be possible to remove the arbitrary limitation of 100 frames by allocating a buffer in start().
msg203988 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-23 11:35
New changeset 6e2089dbc5ad by Victor Stinner in branch 'default':
Issue #18874: Implement the PEP 454 (tracemalloc)
http://hg.python.org/cpython/rev/6e2089dbc5ad
msg203990 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-23 11:50
New changeset 66db0c66a6ee by Victor Stinner in branch 'default':
Issue #18874: Remove tracemalloc.set_traceback_limit()
http://hg.python.org/cpython/rev/66db0c66a6ee
msg204297 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-11-25 06:45
Here is the patch to tidy up the Lib/test_tracemalloc.py.

It fixed the typo, and removed unused import and unused variables.
msg204309 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-25 08:31
> Here is the patch to tidy up the Lib/test_tracemalloc.py.
> It fixed the typo, and removed unused import and unused variables.

Thanks, I applied your patch.


changeset:   87547:841dec769a04
tag:         tip
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Mon Nov 25 09:29:45 2013 +0100
files:       Lib/test/test_tracemalloc.py
description:
Cleanup test_tracemalloc.py. Patch written by Vajrasky Kok.
msg204310 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-25 08:32
>> Here is the patch to tidy up the Lib/test_tracemalloc.py.
>> It fixed the typo, and removed unused import and unused variables.
>
>Thanks, I applied your patch.

Oh, except:

-        data = [allocate_bytes(123) for count in range(1000)]
+        [allocate_bytes(123) for count in range(1000)]

If you don't store the result, the memory is immediatly released.
msg204431 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-25 23:09
@neologix: I run test_tracemalloc on "IRIX64 silicon 6.5 07202013 IP35" (our IRIX buildbot) which uses a 32-bit MIPS CPU, and the test pass. The code was compiled with gcc 3.4.6. So packed structures works also on 32-bit MIPS. Well, it is not surprising, int and void* have the same size: 32 bit.
msg204432 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-25 23:41
New changeset 2da28004dfac by Victor Stinner in branch 'default':
Issue #18874: tracemalloc: explain the purpose of get_traces.tracebacks in a comment
http://hg.python.org/cpython/rev/2da28004dfac

New changeset b6414aa8cf77 by Victor Stinner in branch 'default':
Issue #18874: apply Jim Jewett's patch on tracemalloc doc
http://hg.python.org/cpython/rev/b6414aa8cf77
msg204435 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-25 23:46
New changeset a2811425dbde by Victor Stinner in branch 'default':
Issue #18874: allow to call tracemalloc.Snapshot.statistics(cumulative=True)
http://hg.python.org/cpython/rev/a2811425dbde
msg204444 - (view) Author: Roundup Robot (python-dev) Date: 2013-11-26 00:21
New changeset cf2c4cd08dd9 by Victor Stinner in branch 'default':
Issue #18874: tracemalloc: Comment the trace_t structure
http://hg.python.org/cpython/rev/cf2c4cd08dd9

New changeset f06a50c2bf85 by Victor Stinner in branch 'default':
Issue #18874: make it more explicit than set_reentrant() only accept 0 or 1
http://hg.python.org/cpython/rev/f06a50c2bf85

New changeset 02668e24d72d by Victor Stinner in branch 'default':
Issue #18874: Fix typo
http://hg.python.org/cpython/rev/02668e24d72d
msg204472 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-26 12:59
The code has been merged. I didn't see any test_tracemalloc on buildbots. I tried to address all remarks on the Rietveld reviews. So I'm now closing the issue.

Please open new issue if you have more remarks. For example, I opened:

- #19798: tracemalloc: rename "max_size" to "peak_size" in get_traced_memory() result
- #19787: tracemalloc: set_reentrant() should not have to call PyThread_delete_key()
- #19786: tracemalloc: remove arbitrary limit of 100 frames

Thanks for your reviews.
msg204618 - (view) Author: Jim Jewett (Jim.Jewett) Date: 2013-11-27 22:10
These comments refer to http://hg.python.org/cpython/file/5c9af8194d3b/Doc/library/tracemalloc.rst which is newer than the current patches.

I believe I have consolidated the still-open issues that I see (including those that aren't mine) for this file:

http://hg.python.org/cpython/file/5c9af8194d3b/Doc/library/tracemalloc.rst

(line 24, new comment; also repeated at line 315)

frame (1 frame). To store 25 frames at startup: set the
:envvar:`PYTHONTRACEMALLOC` environment variable to ``25``, or use the
:option:`-X` ``tracemalloc=25`` command line option.
->

frame (1 frame). To store 25 frames at startup: set the
:envvar:`PYTHONTRACEMALLOC` environment variable to ``25``, or use the
:option:`-X` ``tracemalloc=25`` command line option.  If tracing is

started later, the maximum number of frames is a parameter to the 

 :func:`start` function.



(line 111)

If the system has little free memory, snapshots can be written on disk using
the :meth:`Snapshot.dump` method to analyze the snapshot offline. Then use the
:meth:`Snapshot.load` method reload the snapshot.

 ->

Snapshots can be stored to disk using :meth:`Snapshot.dump` and reloaded using :meth:`Snapshot.load`.



(line 180, new comment)
... The traceback is
where the :mod:`importlib` loaded data most recently: on the ``import pdb``
line of the :mod:`doctest` module. The traceback may change if a new module is
loaded.

->

The traceback represents the call stack when those allocations were made.  

As this particular summary is itemized by traceback, it also represents 

the call stack for all 903 of those allocations.


(line 252: new comment)
.. function:: clear_traces()

   Clear traces of memory blocks allocated by Python.

Add "Future allocations will still be traced."


Is this just a shortcut for stop();start()  ?  If so, please say so.  If not, please explain why.



(Line 311: new comment)
to measure how much memory is used
->
for an approximate measure of how much memory is used


I understand your reluctance to get into too many details, but ... well, it isn't a complete measure.  It misses the python objects of tracemalloc itself, it misses the overhead of the module, it misses any filenames that were kept alive only by tracing, etc.


(Line 372, new comment)
      If *inclusive* is ``True`` (include), only trace memory blocks allocated
      in a file with a name matching :attr:`filename_pattern` at line number
      :attr:`lineno`.

      If *inclusive* is ``False`` (exclude), ignore memory blocks allocated in
      a file with a name matching :attr:`filename_pattern` at line number
      :attr:`lineno`.
->
      If *inclusive* is ``True`` (include), then trace memory blocks allocated
      in a file with a name matching :attr:`filename_pattern` at line number
      :attr:`lineno`.  Also excludes any blocks not selected by any filter.

      If *inclusive* is ``False`` (exclude), ignore memory blocks allocated in
      a file with a name matching :attr:`filename_pattern` at line number
      :attr:`lineno`, even if they would otherwise be included.


(Line 394: new comment)
      This attribute is ignored if the traceback limit is less than ``2``.
->
      This attribute is meaningless if the traceback limit is ``1``.



Just after Line 428, not quite a new issue:
      Compute the differences with an old snapshot. Get statistics as a sorted
      list of :class:`StatisticDiff` instances grouped by *group_by*.

->

      Compute the differences with an old snapshot. Get statistics as a sorted
      list of :class:`StatisticDiff` instances grouped by *group_by*.

      Note that applying different Filters can cause spurious differences; users planning to archive a Snapshot may therefore wish to first add various annotations.


(Line 454, new comment)
      All inclusive filters are applied at once, a trace is ignored if no
      inclusive filters match it. A trace is ignored if at least one exclusive
      filter matchs it.
->
      If there are any inclusive filters, then they are all applied, and traces which do not match at least one of them are excluded.  A trace is also ignored if at least one exclusive
      filter matches it.


Line 544, new comment:

   .. attribute:: count_diff

      Difference of number of memory blocks between the old and the new
      snapshots (``int``): ``0`` if the memory blocks have been allocated in
      the new snapshot.
->
   .. attribute:: count_diff

      Difference of number of memory blocks between the old and the new
      snapshots (``int``).  count_diff==count if the memory blocks do not appear in the old snapshot.

Line 558, similar to above:
      the new snapshots (``int``): ``0`` if the memory blocks have been
      allocated in the new snapshot.
->
      the new snapshots (``int``).  size_diff==size if the memory blocks do not appear in the old snapshot.
msg204619 - (view) Author: Jim Jewett (Jim.Jewett) Date: 2013-11-27 22:17
Drat:  forgot one at line 277


.. function:: get_traced_memory()

   Get the current size and maximum size of memory blocks traced by the
   :mod:`tracemalloc` module as a tuple: ``(size: int, max_size: int)``.

I have a tendency to read "maximum size" as the most it is willing/able to trace, rather than the most it has traced at a single time so far.  I therefore prefer "peak_size".

Also, this doesn't suggests that the object is strictly a tuple, and that results.size or results.peak_size will not work; is that intentional?
msg204622 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-27 22:19
@Jim: This issue has been closed, please don't comment closed issues. I created a new issue for your new comments: #19818.
History
Date User Action Args
2013-12-10 17:49:41jceasetnosy: + jcea
2013-11-27 22:19:44hayposetmessages: + msg204622
2013-11-27 22:17:34Jim.Jewettsetmessages: + msg204619
2013-11-27 22:10:08Jim.Jewettsetnosy: + Jim.Jewett
messages: + msg204618
2013-11-26 12:59:26hayposetstatus: open -> closed
resolution: fixed
messages: + msg204472
2013-11-26 00:21:51python-devsetmessages: + msg204444
2013-11-25 23:46:01python-devsetmessages: + msg204435
2013-11-25 23:41:55python-devsetmessages: + msg204432
2013-11-25 23:09:34hayposetmessages: + msg204431
2013-11-25 08:32:33hayposetmessages: + msg204310
2013-11-25 08:31:09hayposetmessages: + msg204309
2013-11-25 06:45:08vajraskysetfiles: + minor_makeup_test_tracemalloc.patch
nosy: + vajrasky
messages: + msg204297

2013-11-23 11:50:22python-devsetmessages: + msg203990
2013-11-23 11:35:43python-devsetmessages: + msg203988
2013-11-23 01:37:31hayposetfiles: + start_nframe.patch

messages: + msg203947
2013-11-22 20:30:46hayposetmessages: + msg203862
2013-11-22 17:25:55neologixsetmessages: + msg203815
2013-11-21 21:57:51neologixsetmessages: + msg203676
2013-11-21 20:50:55hayposetmessages: + msg203670
2013-11-21 20:22:17hayposetmessages: + msg203669
2013-11-21 18:00:00eric.snowsetnosy: + eric.snow
messages: + msg203659
2013-11-21 12:24:32hayposetmessages: + msg203624
2013-11-21 12:09:27hayposetmessages: + msg203621
2013-11-21 12:07:12hayposetmessages: + msg203620
2013-11-21 11:50:06hayposetmessages: + msg203616
2013-11-21 11:43:28hayposetfiles: + 3de17d13002d.patch

messages: + msg203615
2013-11-21 08:56:21hayposetmessages: + msg203589
2013-11-21 08:46:11neologixsetmessages: + msg203587
2013-11-20 21:52:18hayposetfiles: + pack.patch

messages: + msg203536
2013-11-20 21:47:06hayposetmessages: + msg203534
2013-11-20 16:51:23hayposetfiles: + 4430e893d89f.patch

messages: + msg203508
2013-11-20 16:50:57hayposetfiles: - 69fd2d766005.patch
2013-11-20 16:50:56hayposetfiles: - 3bf73dcd0b42.diff
2013-11-20 14:51:57neologixsetmessages: + msg203485
2013-11-19 08:08:40neologixsetnosy: + neologix
messages: + msg203343
2013-11-09 18:44:31hayposetfiles: + 8b34364a66a9.patch

messages: + msg202492
2013-11-09 06:53:03neologixsetfiles: + 3bf73dcd0b42.diff
2013-11-08 17:29:23hayposetfiles: - 022955935ba3.patch
2013-11-08 17:29:09hayposetfiles: + 69fd2d766005.patch

messages: + msg202438
2013-11-04 11:17:21hayposetfiles: - 65e72bf01246.patch
2013-11-04 11:17:19hayposetfiles: - 57ae01bf96cb.patch
2013-11-04 11:15:31hayposetfiles: + 022955935ba3.patch
2013-11-03 13:28:00hayposetfiles: + 65e72bf01246.patch

messages: + msg202024
2013-10-19 00:47:13hayposetfiles: - ec121a72e848.patch
2013-10-19 00:46:52hayposetfiles: + 57ae01bf96cb.patch

messages: + msg200341
2013-10-10 14:02:08python-devsetnosy: + python-dev
messages: + msg199386
2013-10-08 20:42:29pitrousetnosy: + tim.peters
2013-10-07 00:51:05hayposetmessages: + msg199123
2013-10-06 16:00:47hayposetmessages: + msg199086
2013-10-06 15:59:35hayposetfiles: - 21f7c3df0f15.patch
2013-10-06 15:59:24hayposetfiles: + ec121a72e848.patch

messages: + msg199085
2013-10-06 08:19:21hayposetmessages: + msg199041
2013-10-06 08:01:31hayposetmessages: + msg199040
2013-10-06 07:42:29rhettingersetnosy: + rhettinger
messages: + msg199039
2013-10-01 11:57:47hayposetmessages: + msg198767
2013-10-01 11:49:51hayposetdependencies: + PyOS_Readline drops GIL and calls PyOS_StdioReadline, which isn't thread safe
messages: + msg198765
2013-09-16 23:46:23hayposetmessages: + msg197954
2013-09-16 23:44:41hayposetmessages: + msg197952
2013-09-16 23:02:46hayposetfiles: - tracemalloc.patch
2013-09-16 23:02:43hayposetfiles: - tracemalloc-2.patch
2013-09-16 23:02:41hayposetfiles: - 5d8817cc9e69.patch
2013-09-16 23:02:21hayposetfiles: + 21f7c3df0f15.patch
2013-09-16 22:57:28hayposetfiles: + 5d8817cc9e69.patch

messages: + msg197951
2013-09-05 01:17:57belopolskysetnosy: + belopolsky
2013-09-03 23:15:32hayposetmessages: + msg196879
2013-09-01 12:49:58hayposetfiles: + tracemalloc-2.patch

messages: + msg196720
2013-08-30 12:07:38hayposetfiles: + tracemalloc.patch
keywords: + patch
messages: + msg196525
2013-08-29 19:51:18hayposetmessages: + msg196484
2013-08-29 18:10:56amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg196474
2013-08-29 00:57:58hayposetmessages: + msg196439
2013-08-29 00:18:37hayposetmessages: + msg196438
2013-08-29 00:07:26hayposetmessages: + msg196437
2013-08-28 23:53:11haypocreate