Author njs
Recipients njs, pitrou, vstinner
Date 2016-03-18.03:57:00
Message-id <1458273423.3.0.273373089171.issue26530@psf.upfronthosting.co.za>
In-reply-to
Content
I think we're talking past each other :-).

> If I change tracemalloc, it's not to fullfit numpy requirements, it must remain very generic

Nothing about what I'm saying is relevant to numpy -- the patch attached to this bug report is already plenty for what numpy needs. (Well, it still needs a public API like PyMem_Track/Untrack or something, but never mind for now.)

The only reason I'm speaking up now is that if you're adding a manual track/untrack API, then a relatively trivial addition now makes tracemalloc vastly more powerful, so I don't want to miss this opportunity.

> void* allows to implement the rejected option of also storing the C filename an C line number:

And if you want to attach some extra metadata to traces, then that's an interesting idea that I can potentially imagine various use cases for. But it's not the idea I'm talking about :-). (FWIW, I think the biggest challenge for your idea will be how the allocation sites -- which might be in arbitrary user code -- are supposed to figure out what kind of metadata they should be attaching. And if it's information that tracemalloc can compute itself -- like C backtraces -- then there's no reason for it to be in the public API, which is the thing I'm concerned about here.)

What I'm talking about is different: I think it should be possible to re-use the tracemalloc infrastructure to track other resources besides "heap allocations". So for my use case, it's crucial that we index by (domain, pointer), because the address 0xdeadbeef on the heap is different from the address 0xdeadbeef on the GPU. We'll never want to group by pointer alone without the domain, because that would cause us to actually misinterpret the data. (If I do PyMem_Track("gpu", 0xdeadbeef); PyMem_Untrack("heap", 0xdeadbeef), then this should not cause tracemalloc to forget about the gpu allocation! I think this is very different from your C backtrace example.)

And it's always obvious to callers what kind of thing to pass here, because they know perfectly well whether they just allocated memory on the heap or on the GPU, so the public API is an appropriate place for this information.

And it's also clear that for this use case there will only be a few different domains in use at one time, so it would be very inefficient to literally store (domain, pointer) pairs. Replacing the current pointer => trace design with a domain => (pointer => trace) design would indeed require changing tracemalloc's design a bit, but not, I think, in any fundamental way?
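
To make the domain => (pointer => trace) idea concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not tracemalloc's actual implementation: the class DomainTraces and its methods (track, untrack, is_tracked, total) are hypothetical names invented for illustration, and a "trace" is reduced to just a size in bytes.

```python
from collections import defaultdict


class DomainTraces:
    """Toy model of a domain => (pointer => trace) table (hypothetical)."""

    def __init__(self):
        # One inner dict per domain; each inner dict maps pointer -> size.
        # With only a few domains in use, the domain is stored once per
        # domain instead of once per tracked pointer.
        self._traces = defaultdict(dict)

    def track(self, domain, ptr, size):
        # Record (or overwrite) the trace for this pointer in this domain.
        self._traces[domain][ptr] = size

    def untrack(self, domain, ptr):
        # Only the named domain is consulted; the same address in another
        # domain is left alone. Untracking an unknown pointer is a no-op.
        self._traces.get(domain, {}).pop(ptr, None)

    def is_tracked(self, domain, ptr):
        return ptr in self._traces.get(domain, {})

    def total(self, domain):
        # Total number of bytes currently tracked in one domain.
        return sum(self._traces.get(domain, {}).values())


traces = DomainTraces()
traces.track("gpu", 0xDEADBEEF, 4096)
# Same numeric address, different domain: must not disturb the gpu entry.
traces.untrack("heap", 0xDEADBEEF)
```

After the mismatched untrack call, the "gpu" entry is still present, which is the behaviour argued for above. A table keyed by pointer alone would have dropped it.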