classification
Title: _tracemalloc: add support for multiple address spaces (domains)
Type: enhancement Stage:
Components: Extension Modules Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: haypo, njs, pitrou, python-dev
Priority: normal Keywords: patch

Created on 2016-03-18 21:39 by haypo, last changed 2016-03-25 12:15 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
tracemalloc.patch haypo, 2016-03-18 21:39 review
tracemalloc-2.patch haypo, 2016-03-18 22:50 review
tracemalloc-3.patch haypo, 2016-03-18 23:36 review
tracemalloc-4.patch haypo, 2016-03-21 14:01 review
tracemalloc-5.patch haypo, 2016-03-21 15:55 review
hashtable_key_size.patch haypo, 2016-03-21 18:00 review
Messages (43)
msg261997 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-18 21:39
The tracemalloc module uses a hashtable: pointer (void*) => trace. Some embedded devices use multiple address spaces. Each GPU also has its own address space. CUDA, OpenCL and OpenGL should also be seen as different address spaces.

In the issue #26530, it was proposed (required?) to support multiple addresses spaces to be able to use tracemalloc in numpy.

Attached patch enhances tracemalloc to use (pointer: void*, domain: unsigned int) as key in the hashtable. A (pointer, domain) key is stored in a hashtable entry.

In the patch, the domain is hardcoded to 0, but the issue #26530 will add new C functions to track/untrack memory blocks, and this new functions will have a domain parameter.

The patch changes how a key is passed to the hashtable API: pass a *pointer* to a key rather than directly the key value. Because of that, the patch is quite long.

The patch also removes the unused function _Py_hashtable_hash_int().

_Py_HASHTABLE_ENTRY_DATA() macro now requires the hashtable to get the key size, since the offset of data now depends on the key size.
msg261998 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-03-18 21:48
Le 18/03/2016 22:39, STINNER Victor a écrit :
> 
> In the issue #26530, it was proposed (required?) to support multiple
addresses spaces to be able to use tracemalloc in numpy.

Only proposed :-) Numpy itself only works on the CPU, however Nathaniel
may have other use cases of his own. Also, Numba allows use of the Numpy
API (part thereof, anyway) on GPUs.
msg262003 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-18 22:50
Patch version 2:

- modify _tracemalloc._get_traces() to include domain in traces
- add tracemalloc.DomainFilter(domain)
- add domain optional parameter to tracemalloc.Filter constructor

Currently, Snapshot._group_by() ignores domain information. I don't know how the domain should be exposed, *if* it should be exposed.

By the way, maybe we don't need to export the domain at the Python level at all? (Don't expose it in _tracemalloc._get_traces()).
msg262004 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-18 22:54
> Also, Numba allows use of the Numpy API (part thereof, anyway) on GPUs.

Do you mean that Numba requires supporting multiple address spaces to be able to use tracemalloc? Or that supporting multiple address spaces would also to *also* track GPU memory allocations?

A friend told me that CUDA, OpenGL and OpenCL can (should?) be seen as 3 different address spaces. These libraries don't expose directly addresses in the GPU address space, but may use handles.
msg262005 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-18 23:36
Patch 3:

- avoid the memory overhead of storing domain if domains are not used
- add optional use_domain parameter to tracemalloc.start()
- by default, no change: use a C void* (pointer) as key for the trace hashtable
- if start() is called with use_domain=True, use (pointer: void*, domain: unsigned int) key for the trace hashtable
msg262006 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-18 23:38
Hum, on OpenStack it's common for me to work on a patch serie. Since our development workflow doesn't support that, I used a single patch. It's hard to review it, all changes are merged into one big patch :-/
msg262125 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 13:40
Crap. I pushed tracemalloc-3.patch by mistake. I just reverted my mistake.

At least, it helped to find first bugs using buildbots:

http://buildbot.python.org/all/builders/PPC64%20Fedora%203.x/builds/676/steps/test/logs/stdio

[230/400/3] test_tracemalloc
python: ./Modules/_tracemalloc.c:195: set_reentrant: Assertion `PyThread_get_key_value(tracemalloc_reentrant_key) == ((PyObject *) &_Py_TrueStruct)' failed.
Fatal Python error: Aborted

Current thread 0x00003fffb0a95a10 (most recent call first):
  File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64/build/Lib/test/test_tracemalloc.py", line 85 in setUp
  File "/home/shager/cpython-buildarea/3.x.edelsohn-fedora-ppc64/build/Lib/unittest/case.py", line 596 in run
msg262127 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 14:01
Updated patch taking Antoine's comments in account.

Instead of filling pointer_t padding with zeros, I changed the size of the key (key_size) in the hashtable to exclude padding.
msg262129 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-03-21 14:45
> Instead of filling pointer_t padding with zeros, I changed the size of the key (key_size) in the hashtable to exclude padding.

What about alignment issues on strict platforms? (SPARC?)
msg262131 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 14:54
> What about alignment issues on strict platforms? (SPARC?)

Which kind of alignment issue do you expect?

I added "__attribute__((packed))" to frame_t. I expected SIGBUS because of that (it's even documented in a comment ;-)), but I practice nobdy complained, not even our buildbots. Hum, it looks like we only have one SPARC buildbot and it's offline...

_Py_hashtable_get() uses memcpy() to copy the value, so this function should not have any alignment issue.

The new code uses pointer dereference, so alignment issues are more likely. How can I test it on Intel CPU?

Should I also use memcpy() to retrieve the key value?
msg262132 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2016-03-21 15:09
What about compilers that don't support "__attribute__((packed))"? Instead you could use a compare func that compares struct fields...
msg262134 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 15:55
Patch version 5:

* add hashtable_compare_pointer_t() which compares pointer_t fields (ptr and domain) rather using _Py_hashtable_compare_direct() which causes issues with padding
* hashtable_hash_pointer_t() now uses memcpy() to get the pointer_t value rather than a pointer dereference

Antoine:
> What about compilers that don't support "__attribute__((packed))"?

packed is a small and *optional* optimization on the memory footprint of _tracemalloc structure (reduce tracemalloc.get_tracemalloc_memory()).

Using packed can introduce memory alignment issue, but since tracemalloc uses memcpy(), it *should* be fine in practice. Again, since I don't have access to SPARC CPU, I don't know how to test.

> Instead you could use a compare func that compares struct fields...

Done.
msg262145 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 18:00
Hum, the patch is really too hard to review. So I extract to the "key_size" part: hashtable_key_size.patch.

I added new macros to get the key, wrapper to memcpy() with an assert to check the key size in debug mode.
msg262148 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-21 21:01
New changeset aca4e9af1ca6 by Victor Stinner in branch 'default':
hashtable.h now supports keys of any size
https://hg.python.org/cpython/rev/aca4e9af1ca6
msg262149 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-21 22:05
New changeset 98a1c8cb882f by Victor Stinner in branch 'default':
Issue #26588: Fix compilation warning on Windows
https://hg.python.org/cpython/rev/98a1c8cb882f
msg262151 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 22:21
Hum, an assertion fails in set_reentrant(), helper function to set a TLS (Thread Local Storage), but I'm unable to reproduce the bug on my Fedora 23 (AMD64).

http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/3316/steps/test/logs/stdio

[187/400] test_tracemalloc
python: ./Modules/_tracemalloc.c:174: set_reentrant: Assertion `PyThread_get_key_value(tracemalloc_reentrant_key) == ((PyObject *) &_Py_TrueStruct)' failed.
Fatal Python error: Aborted

Current thread 0x00007fecefd68700 (most recent call first):
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/test_tracemalloc.py", line 85 in setUp
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 596 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 648 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/runner.py", line 176 in run
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1802 in _run_suite
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1836 in run_unittest
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/test_tracemalloc.py", line 824 in test_main
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/runtest.py", line 162 in runtest_inner
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/runtest.py", line 115 in runtest
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/main.py", line 295 in run_tests_sequential
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/main.py", line 356 in run_tests
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/main.py", line 392 in main
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/main.py", line 433 in main
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/libregrtest/main.py", line 455 in main_in_temp_cwd
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/__main__.py", line 3 in <module>
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 85 in _run_code
  File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 184 in _run_module_as_main
Makefile:1026: recipe for target 'buildbottest' failed
make: *** [buildbottest] Aborted
msg262152 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-21 22:45
About memory alignment, _Py_HASHTABLE_ENTRY_DATA_AS_VOID_P() macro uses pointer dereference. It may be replaced with _Py_HASHTABLE_ENTRY_READ_DATA() to use memcpy() and avoid memory alignment issue.
msg262154 - (view) Author: Nathaniel Smith (njs) * Date: 2016-03-22 00:16
High-level questions:

- How do you anticipate the integers naming domains will be allocated? Is it like port numbers, and you'll maintain a registry somewhere ("domains 0-100 are reserved for the interpreter, pycuda has reserved 100-110, ...")?

- You have it configurable at run-time whether the domain gets included in the trace key, and if this is set to false (the default) then all the different domains just get collapsed together. How is some downstream user like pycuda expected to get sensible behavior out of this? Is the idea that pycuda should be guarding all trace/untrace calls with if (use_domain) { ... }, or...?

(I guess it could be sensible to support disabling domain stuff -- though if I were writing it I probably wouldn't bother just because it increases the number of configurations that need to be tested etc. -- but if you're doing this then IMO it should discard all trace/untrace calls that refer to non-default domains, i.e. domain=False shouldn't mean "discard domain information", it should mean "trace only the heap domain".)

- Is all this messing about with possibly-unaligned pointers etc. really easier than having one-hashtable-per-domain? You could easily get away with a linearly scanned array, even...
msg262172 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 10:18
Hi,

> - How do you anticipate the integers naming domains will be allocated? Is it like port numbers, and you'll maintain a registry somewhere ("domains 0-100 are reserved for the interpreter, pycuda has reserved 100-110, ...")?

I simply have no idea at all :-) I guess that the best is to
collaborate. It's not like we plan to have 1000 different users of the
API. Today, I only know two users: you and Antoine :-D

unsigned int is large: at least 32 bits. I understood that we will
only need something like 10 or maybe 20 domains.

> - You have it configurable at run-time whether the domain gets included in the trace key, and if this is set to false (the default) then all the different domains just get collapsed together. How is some downstream user like pycuda expected to get sensible behavior out of this? Is the idea that pycuda should be guarding all trace/untrace calls with if (use_domain) { ... }, or...?

In my first implementation, support for domains was always enabled.
But I don't like that since it increases the memory footprint even if
you don't use domains, which is the case for *all* users today.

I added a flag, but I'm not happy with the API. For example, it's not
possible to set the flag using PYTHONTRACEMALLOC=nframe env var nor -X
tracemalloc[=nframe] command line option, whereas I'm a big fan of -X
tracemalloc=5 command line option.

Instead of having to enable the flag early and don't allow to change
the flag later, I should try to implement a dynamic flag: start
without domain, and convert the whole table to support domain as soon
as you start to trace at least one trace with domain != 0. I pushed
the first part of my patch (support key bigger than sizeof(void*)) to
be able to focus on the second part.

If fact, converting a table is no more complex than the regular
"rehash" operation when the table becomes too big. This operation is
part of the hashtable design.

> (I guess it could be sensible to support disabling domain stuff -- though if I were writing it I probably wouldn't bother just because it increases the number of configurations that need to be tested etc. -- but if you're doing this then IMO it should discard all trace/untrace calls that refer to non-default domains, i.e. domain=False shouldn't mean "discard domain information", it should mean "trace only the heap domain".)

The design of tracemalloc is more to collect everything, but filter
later. That's why I added an optional domain parameter to
tracemalloc.Filter an a new tracemalloc.DomainFilter filter.

> - Is all this messing about with possibly-unaligned pointers etc. really easier than having one-hashtable-per-domain? You could easily get away with a linearly scanned array, even...

Having one hash table per domain requires to allocate a new hashtable
when we start to use a domain, then decide when it's time to release
it. We also need a "registry" for all trace hashtables: a simple short
array, or maybe even another hashtable?

But I think that we disagree on how traces will used. It looks like
you only care of statistics *per domain*. IMHO it's worth to get the
*total* memory (ex: CPU+GPU memory) per filename or per line number.
But also give the ability to filter domains. For example, exclude CPU
to get statistics of all GPU domains, or get CPU+shmem statistics.

I know that technically, the storage doesn't matter much: we can still
get traces of all domains at once. I like the idea of having all
traces in the same structure, even I have to agree that the new C
macros to get a key are more ugly than the previous simple direct
"void* key" pointer.

Domains are quite strange. They are part of the key in the hashtable,
but _tracemalloc._get_traces() now returns them as part of the "value"
(with my patch) :-)
msg262177 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 11:25
New changeset 7c894911eb59 by Victor Stinner in branch 'default':
Simplify implementation of hashtable.c
https://hg.python.org/cpython/rev/7c894911eb59

New changeset cd9a40a5ea90 by Victor Stinner in branch 'default':
Remove _Py_hashtable_delete()
https://hg.python.org/cpython/rev/cd9a40a5ea90
msg262178 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 11:59
New changeset b86cdebe0e97 by Victor Stinner in branch 'default':
tracemalloc now supports domains
https://hg.python.org/cpython/rev/b86cdebe0e97
msg262181 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 12:56
> New changeset b86cdebe0e97 by Victor Stinner in branch 'default':
> tracemalloc now supports domains
> https://hg.python.org/cpython/rev/b86cdebe0e97

I hate working on huge patches, it's a pain to rebase them. I chose to push a first implementation to unblock the issue #26530 "tracemalloc: add C API to manually track/untrack memory allocations".

I'm interested by your feedback on the C API.

The C implementation can still be changed later.

For example, I chose to push a first implementation which *always* store domain_t in traces keys. Later, I will write a patch to switch from the compact key (Py_uintptr_t) to key using domain (pointer_t) dynamically.
msg262185 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 14:37
Crap. I expected the set_reentrant() bug to be fixed, but I saw it one more time on the latest build (revision 0da4532a893389bd1ee1ddfe8227750d968023ba):

http://buildbot.python.org/all/builders/s390x%20RHEL%203.x/builds/817/steps/test/logs/stdio

0:02:54 [103/400] test_tracemalloc
python: ./Modules/_tracemalloc.c:194: set_reentrant: Assertion `PyThread_get_key_value(tracemalloc_reentrant_key) == ((PyObject *) &_Py_TrueStruct)' failed.
Fatal Python error: Aborted

Current thread 0x000003fffcd7b6f0 (most recent call first):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/test_tracemalloc.py", line 94 in setUp
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/case.py", line 596 in run
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/case.py", line 648 in __call__
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/suite.py", line 122 in run
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/suite.py", line 122 in run
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/suite.py", line 84 in __call__
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/unittest/runner.py", line 176 in run
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/support/__init__.py", line 1802 in _run_suite
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/support/__init__.py", line 1836 in run_unittest
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/test_tracemalloc.py", line 989 in test_main
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/runtest.py", line 162 in runtest_inner
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/runtest.py", line 115 in runtest
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/main.py", line 306 in run_tests_sequential
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/main.py", line 367 in run_tests
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/main.py", line 405 in main
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/main.py", line 446 in main
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/libregrtest/main.py", line 468 in main_in_temp_cwd
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/test/__main__.py", line 3 in <module>
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/runpy.py", line 85 in _run_code
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z/build/Lib/runpy.py", line 184 in _run_module_as_main
make: *** [buildbottest] Aborted
msg262188 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 15:14
New changeset 4ec81b497a84 by Victor Stinner in branch 'default':
Issue #26588: add debug traces
https://hg.python.org/cpython/rev/4ec81b497a84
msg262190 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 15:29
New changeset 340ed3ff2656 by Victor Stinner in branch 'default':
Issue #26588: fix compilation on Windows
https://hg.python.org/cpython/rev/340ed3ff2656
msg262198 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 16:41
New changeset bbdb24611294 by Victor Stinner in branch 'default':
Issue #26588: more assertions
https://hg.python.org/cpython/rev/bbdb24611294
msg262201 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 16:47
New changeset 636fa01842f5 by Victor Stinner in branch 'default':
Add assertions on tracemalloc_reentrant_key
https://hg.python.org/cpython/rev/636fa01842f5
msg262202 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 16:54
Oh, "s390x RHEL 3.x" and "AMD64 Debian root 3.x" buildbots use "-j 1" command line option. In practice, tests are not run in subprocesses, -j1 option is ignored (issue #25285).
msg262205 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 17:47
Failure at revision 636fa01842f597bedff7054616c65a4784c92b4a on AMD64 Windows10 3.x (regrtest runs without the -j option, so all tests are run in the same thread).

http://buildbot.python.org/all/builders/AMD64%20Windows10%203.x/builds/763/steps/test/logs/stdio
---------
0:12:37 [ 87/400/1] test_tracemalloc
[pid 2864, tid 2624] tracemalloc_start()
[pid 2864, tid 2624] tracemalloc_init()
[pid 2864, tid 2624] tracemalloc_init(): exit (already initialized)
Assertion failed: get_reentrant(), file ..\Modules\_tracemalloc.c, line 1057
Fatal Python error: Aborted

Current thread 0x00000a40 (most recent call first):
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\test_tracemalloc.py", line 97 in setUp
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\case.py", line 596 in run
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\case.py", line 648 in __call__
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\suite.py", line 122 in run
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\suite.py", line 84 in __call__
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\suite.py", line 122 in run
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\suite.py", line 84 in __call__
  File "D:\buildarea\3.x.bolen-windows10\build\lib\unittest\runner.py", line 176 in run
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\support\__init__.py", line 1802 in _run_suite
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\support\__init__.py", line 1836 in run_unittest
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\test_tracemalloc.py", line 999 in test_main
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\runtest.py", line 162 in runtest_inner
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\runtest.py", line 115 in runtest
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\main.py", line 306 in run_tests_sequential
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\main.py", line 367 in run_tests
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\main.py", line 405 in main
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\main.py", line 446 in main
  File "D:\buildarea\3.x.bolen-windows10\build\lib\test\libregrtest\main.py", line 468 in main_in_temp_cwd
  File "D:\buildarea\3.x.bolen-windows10\build\PCbuild\..\lib\test\regrtest.py", line 39 in <module>
---------

I understand that tracemalloc_start() was called with the first time and that get_reentrant() returns 0 at tracemalloc_start() entry, whereas tracemalloc_init() is supposed to set the reentrant flag to 1.
msg262206 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 17:52
New changeset af388201c997 by Victor Stinner in branch 'default':
Issue #26588: one more assertion
https://hg.python.org/cpython/rev/af388201c997
msg262207 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 18:16
Failure on "x86 Ubuntu Shared 3.x" at revision af388201c9976aebc4a8a433c95bfc4a1abe014b.

I added an assertion in tracemalloc_init() to ensure that the reeentrant flag is set at the end of the function.

The reentrant flag was no more set at tracemalloc_start() entry for an unknown reason.

http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/12905/steps/test/logs/stdio
---
0:08:50 [183/400] test_tracemalloc
[pid 14926, tid 1437466048] tracemalloc_start()
[pid 14926, tid 1437466048] tracemalloc_init()
[pid 14926, tid 1437466048] tracemalloc_init(): exit (already initialized)
python: ./Modules/_tracemalloc.c:1058: tracemalloc_start: Assertion `get_reentrant()' failed.
Fatal Python error: Aborted

Current thread 0x55adfdc0 (most recent call first):
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_tracemalloc.py", line 97 in setUp
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 596 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 648 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/runner.py", line 176 in run
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1802 in _run_suite
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1836 in run_unittest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_tracemalloc.py", line 999 in test_main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/runtest.py", line 162 in runtest_inner
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/runtest.py", line 115 in runtest
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/main.py", line 306 in run_tests_sequential
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/main.py", line 367 in run_tests
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/main.py", line 405 in main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/main.py", line 446 in main
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/libregrtest/main.py", line 468 in main_in_temp_cwd
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/__main__.py", line 3 in <module>
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 85 in _run_code
  File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 184 in _run_module_as_main
---
msg262210 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 20:06
New changeset f2d64f91d992 by Victor Stinner in branch 'default':
Issue #26588: Don't call tracemalloc_init() at module initilization
https://hg.python.org/cpython/rev/f2d64f91d992

New changeset d0b2f70731fb by Victor Stinner in branch 'default':
Issue #26588: more debug traces
https://hg.python.org/cpython/rev/d0b2f70731fb
msg262211 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 20:26
New changeset 708beeb65026 by Victor Stinner in branch 'default':
Issue #26588: skip test_warnings.test_tracemalloc()
https://hg.python.org/cpython/rev/708beeb65026
msg262212 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 21:32
The bug was seen on 5 buildbots yet:

* http://buildbot.python.org/all/builders/PPC64%20Fedora%203.x/
* http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/
* http://buildbot.python.org/all/builders/s390x%20RHEL%203.x/
* http://buildbot.python.org/all/builders/AMD64%20Windows10%203.x/
* http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/
msg262214 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-22 23:20
New changeset c6f30e2731af by Victor Stinner in branch 'default':
Issue #26588: remove debug traces from _tracemalloc.
https://hg.python.org/cpython/rev/c6f30e2731af

New changeset af1c1149784a by Victor Stinner in branch '3.5':
Fix _tracemalloc start/stop
https://hg.python.org/cpython/rev/af1c1149784a

New changeset 2b4731e22df8 by Victor Stinner in branch '3.5':
Enhance _tracemalloc debug mode
https://hg.python.org/cpython/rev/2b4731e22df8
msg262215 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 23:21
Ooook, I think that I found and fixed the random failures in the change  af1c1149784a.
msg262220 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-22 23:59
TODO for this issue: smart storage in _tracemalloc for trace keys. Use compact void* key until the first domain != 0 is used.
msg262221 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-23 00:00
(Ooops, I closed the bug by mistake.)
msg262248 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-23 08:08
New changeset dd887518b569 by Victor Stinner in branch 'default':
Issue #26588: Optimize tracemalloc_realloc()
https://hg.python.org/cpython/rev/dd887518b569
msg262249 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-23 08:44
New changeset 0e0a464faffb by Victor Stinner in branch 'default':
Issue #26588:
https://hg.python.org/cpython/rev/0e0a464faffb

New changeset 14a552645c30 by Victor Stinner in branch 'default':
Issue #26588:
https://hg.python.org/cpython/rev/14a552645c30
msg262250 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-23 09:09
New changeset 922b632808ac by Victor Stinner in branch 'default':
Cleanup hashtable.h
https://hg.python.org/cpython/rev/922b632808ac
msg262309 - (view) Author: Roundup Robot (python-dev) Date: 2016-03-23 21:05
New changeset 06552275fa30 by Victor Stinner in branch 'default':
_tracemalloc: use compact key for traces
https://hg.python.org/cpython/rev/06552275fa30
msg262433 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-03-25 12:15
Oooookay, it looks like all issues have been fixed. The optimization for compact keys has been implemented. I didn't see any crash on buildbots recently, test_tracemalloc pass on all platforms.

Nathaniel asked how domains will be "allocated" or "registered", but I suggest to discuss that in the issue #26530.

I close this issue that was much more difficult to implement than I expected...
History
Date User Action Args
2016-03-25 12:15:40hayposetstatus: open -> closed
resolution: fixed
messages: + msg262433
2016-03-23 21:05:18python-devsetmessages: + msg262309
2016-03-23 09:09:49python-devsetmessages: + msg262250
2016-03-23 08:44:14python-devsetmessages: + msg262249
2016-03-23 08:08:38python-devsetmessages: + msg262248
2016-03-23 00:00:33hayposetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg262221
2016-03-22 23:59:10hayposetstatus: open -> closed
resolution: fixed
messages: + msg262220
2016-03-22 23:21:03hayposetmessages: + msg262215
2016-03-22 23:20:15python-devsetmessages: + msg262214
2016-03-22 21:32:48hayposetmessages: + msg262212
2016-03-22 20:26:38python-devsetmessages: + msg262211
2016-03-22 20:06:15python-devsetmessages: + msg262210
2016-03-22 18:16:24hayposetmessages: + msg262207
2016-03-22 17:52:17python-devsetmessages: + msg262206
2016-03-22 17:47:51hayposetmessages: + msg262205
2016-03-22 16:54:32hayposetmessages: + msg262202
2016-03-22 16:47:59python-devsetmessages: + msg262201
2016-03-22 16:41:46python-devsetmessages: + msg262198
2016-03-22 15:29:26python-devsetmessages: + msg262190
2016-03-22 15:14:07python-devsetmessages: + msg262188
2016-03-22 14:37:16hayposetmessages: + msg262185
2016-03-22 12:56:49hayposetmessages: + msg262181
2016-03-22 11:59:10python-devsetmessages: + msg262178
2016-03-22 11:25:16python-devsetmessages: + msg262177
2016-03-22 10:18:30hayposetmessages: + msg262172
2016-03-22 00:16:38njssetmessages: + msg262154
2016-03-21 22:45:06hayposetmessages: + msg262152
2016-03-21 22:21:43hayposetmessages: + msg262151
2016-03-21 22:05:29python-devsetmessages: + msg262149
2016-03-21 21:01:52python-devsetnosy: + python-dev
messages: + msg262148
2016-03-21 18:00:29hayposetfiles: + hashtable_key_size.patch

messages: + msg262145
2016-03-21 15:55:07hayposetfiles: + tracemalloc-5.patch

messages: + msg262134
2016-03-21 15:09:44pitrousetmessages: + msg262132
2016-03-21 14:54:52hayposetmessages: + msg262131
2016-03-21 14:45:06pitrousetmessages: + msg262129
2016-03-21 14:01:13hayposetfiles: + tracemalloc-4.patch

messages: + msg262127
2016-03-21 13:40:50hayposetmessages: + msg262125
2016-03-18 23:38:06hayposetmessages: + msg262006
2016-03-18 23:36:48hayposetfiles: + tracemalloc-3.patch

messages: + msg262005
2016-03-18 22:54:27hayposetmessages: + msg262004
2016-03-18 22:50:03hayposetfiles: + tracemalloc-2.patch

messages: + msg262003
2016-03-18 21:48:03pitrousetmessages: + msg261998
2016-03-18 21:39:27haypocreate