Author vstinner
Recipients njs, pitrou, python-dev, vstinner
Date 2016-03-22.12:54:13
Message-id <1458651254.45.0.191701463255.issue26530@psf.upfronthosting.co.za>
In-reply-to
Content
Ok, I added the following C functions:

  int _PyTraceMalloc_Track(_PyTraceMalloc_domain_t domain, Py_uintptr_t ptr, size_t size);
  int _PyTraceMalloc_Untrack(_PyTraceMalloc_domain_t domain, Py_uintptr_t ptr);

Antoine, Nathaniel: Please play with it, I'm waiting for your feedback on the API.


_PyTraceMalloc_Track() acquires the GIL for you if it was released.

I suggest not calling it from a C thread. If you want to use it from a C thread, it's better to manually initialize the Python thread state on that thread (see issue #20891) before using _PyTraceMalloc_Track().

--

_PyTraceMalloc_domain_t is an unsigned int. The type name is annoying (too long). Maybe "unsigned int" would be more readable :-) What do you think? Maybe an unsigned short is enough? _tracemalloc.c tries to use a packed structure to limit the memory footprint.

--

I chose to use the Py_uintptr_t type for the pointer instead of void*, but there is a problem with that: it gives a false hint about the expected type. In fact, the hashtable is really optimized for pointers. The hash function uses _Py_HashPointer():

    /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
       excessive hash collisions for dicts and sets */

It means that tracking file descriptors (int fd) may create a lot of hash collisions. Well, if you handle a few file descriptors (fewer than 100?), it's maybe ok. If you handle tons of file descriptors, we should maybe make the hash function more configurable. Maybe even per domain?

Do you think that void* would be less of a lie? :-) Which do you prefer?
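
To illustrate the collision concern, here is a rough Python simulation (not the C code itself). It assumes a 64-bit build and a power-of-two table indexed by the low bits of the hash. The helper names hash_pointer() and bucket_usage(), the table size and the sample addresses are made up for this sketch:

```python
from collections import Counter

POINTER_BITS = 64                  # assumption: 64-bit build
MASK = (1 << POINTER_BITS) - 1


def hash_pointer(value):
    """Rotate right by 4 bits, the idea described in the comment above."""
    return ((value >> 4) | (value << (POINTER_BITS - 4))) & MASK


def bucket_usage(keys, num_buckets):
    """Count keys per bucket, assuming the table keeps the low hash bits."""
    return Counter(hash_pointer(key) & (num_buckets - 1) for key in keys)


# 100 small integers (file descriptors) versus 100 16-byte aligned addresses.
fds = range(100)
pointers = [0x7F0000001000 + 16 * i for i in range(100)]

fd_buckets = bucket_usage(fds, 128)
ptr_buckets = bucket_usage(pointers, 128)

print("fds:      %d buckets used, worst bucket holds %d keys"
      % (len(fd_buckets), max(fd_buckets.values())))
print("pointers: %d buckets used, worst bucket holds %d keys"
      % (len(ptr_buckets), max(ptr_buckets.values())))
```

With these assumptions, the low 4 bits of each fd are rotated to the top of the hash, so 16 consecutive descriptors share a bucket, while the aligned addresses spread evenly.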

--

I also added a function to get the traceback where a memory block was allocated:

  PyObject* _PyTraceMalloc_GetTraceback(_PyTraceMalloc_domain_t domain, Py_uintptr_t ptr);

But you should not use it; it's written for unit tests. You have to wrap the result into a tracemalloc.Traceback object:

    def get_traceback(self):
        frames = _testcapi.tracemalloc_get_traceback(self.domain, self.ptr)
        if frames is not None:
            return tracemalloc.Traceback(frames)
        else:
            return None

If this feature is useful, it should be added to the official Python API, the tracemalloc.py module. Currently, there is a tracemalloc.get_object_traceback(obj) function which is restricted to the domain 0 and expects a Python object, not a raw pointer.
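
For comparison, a minimal sketch of the existing public function, tracemalloc.get_object_traceback(), which only works for Python objects. The frame limit and the buffer size are arbitrary choices for the example:

```python
import tracemalloc

# Keep up to 5 frames per traceback (arbitrary value for the example).
tracemalloc.start(5)

# Allocate an object while tracing is enabled.
data = bytearray(4096)

# Returns a tracemalloc.Traceback, or None if the object is not tracked.
tb = tracemalloc.get_object_traceback(data)
if tb is not None:
    for line in tb.format():
        print(line)
else:
    print("no traceback recorded for this object")

tracemalloc.stop()
```

It takes the object itself rather than a (domain, ptr) pair, so it cannot be used for a raw pointer tracked in another domain.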