Add *Calloc functions to CPython memory allocation API #65432

njsmith · 2014-04-15T08:56:02Z

BPO	21233
Nosy	@pitrou, @vstinner, @njsmith, @skrah, @MojoVampire
Files	calloc.patch calloc-2.patch calloc-3.patch bench_alloc.py test.c calloc-4.patch use_calloc.patch bench_alloc2.py calloc-5.patch calloc-6.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2014-06-02.20:29:45.268>
created_at = <Date 2014-04-15.08:56:01.630>
labels = ['interpreter-core', 'type-feature']
title = 'Add *Calloc functions to CPython memory allocation API'
updated_at = <Date 2014-06-02.20:29:45.267>
user = 'https://github.com/njsmith'

bugs.python.org fields:

activity = <Date 2014-06-02.20:29:45.267>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2014-06-02.20:29:45.268>
closer = 'vstinner'
components = ['Interpreter Core']
creation = <Date 2014-04-15.08:56:01.630>
creator = 'njs'
dependencies = []
files = ['34897', '34903', '34924', '35052', '35059', '35063', '35064', '35065', '35072', '35097']
hgrepos = []
issue_num = 21233
keywords = ['patch']
message_count = 95.0
messages = ['216281', '216390', '216394', '216399', '216403', '216404', '216422', '216425', '216431', '216433', '216444', '216451', '216452', '216455', '216515', '216567', '216668', '216671', '216681', '216682', '216686', '217228', '217242', '217246', '217251', '217252', '217253', '217254', '217255', '217256', '217257', '217262', '217274', '217276', '217282', '217283', '217284', '217291', '217294', '217295', '217297', '217298', '217302', '217303', '217304', '217305', '217306', '217307', '217308', '217309', '217310', '217323', '217324', '217325', '217326', '217330', '217331', '217333', '217346', '217348', '217349', '217351', '217357', '217375', '217380', '217382', '217423', '217436', '217445', '217549', '217553', '217594', '217617', '217619', '217785', '217786', '217794', '217797', '217826', '217829', '217831', '217832', '217838', '217839', '217840', '217841', '217844', '217866', '217972', '219627', '219628', '219630', '219631', '219634', '219635']
nosy_count = 8.0
nosy_names = ['pitrou', 'vstinner', 'njs', 'skrah', 'neologix', 'python-dev', 'jtaylor', 'josh.r']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue21233'
versions = ['Python 3.5']

njsmith · 2014-04-15T08:56:00Z

Numpy would like to switch to using the CPython allocator interface in order to take advantage of the new tracemalloc infrastructure in 3.4. But, numpy relies on the availability of calloc(), and the CPython allocator API does not expose calloc().
https://docs.python.org/3.5/c-api/memory.html#c.PyMemAllocator

So, we should add *Calloc variants. This met general approval on python-dev. Thread here:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

This would involve adding a new .calloc field to the PyMemAllocator struct, exposed through new API functions PyMem_RawCalloc, PyMem_Calloc, PyObject_Calloc. [It's not clear that all 3 would really be used, but since we have only one PyMemAllocator struct that they all share, it'd be hard to add support to only one or two of these domains and not the rest. And the higher-level calloc variants might well be used. Numpy array buffers are often small (e.g., holding only a single value), and these small buffers benefit from small-alloc optimizations; meanwhile, large buffers benefit from calloc optimizations. So it might be optimal to use a single allocator that has both.]

We might also have to rename the PyMemAllocator struct to ensure that compiling old code with the new headers doesn't silently leave garbage in the .calloc field and lead to crashes.

vstinner · 2014-04-15T21:27:55Z

Here is a first patch adding the following functions:

  void* PyMem_RawCalloc(size_t n);
  void* PyMem_Calloc(size_t n);
  void* PyObject_Calloc(size_t n);
  PyObject* _PyObject_GC_Calloc(size_t);

It adds the following field after malloc field to PyMemAllocator structure:

  void* (*calloc) (void *ctx, size_t size);

It changes the tracemalloc module to trace "calloc" allocations, add new tests and document new functions.

The patch also contains an important change: PyType_GenericAlloc() uses calloc instead of malloc+memset(0). It may be faster, I didn't check.

pitrou · 2014-04-15T21:39:09Z

So what is the point of _PyObject_GC_Calloc ?

MojoVampire · 2014-04-15T22:05:23Z

General comment on patch: For the flag value that toggles zero-ing, perhaps use a different name, e.g. setzero, clearmem, initzero or somesuch instead of calloc? calloc already gets used to refer to both the C standard function and the function pointer structure member; it's mildly confusing to have it *also* refer to a boolean flag as well.

MojoVampire · 2014-04-15T22:17:19Z

Additional comment on clarity: Might it make sense to make the calloc structure member take both the num and size arguments that the underlying calloc takes? That is, instead of:

void* (*calloc) (void *ctx, size_t size);

Declare it as:

void* (*calloc) (void *ctx, size_t num, size_t size);

Beyond potentially allowing more detailed tracing info at some later point (and much like the original calloc, potentially allowing us to verify that the components do not overflow on multiply, instead of assuming every caller must multiply and check for themselves), it also seems like it's a bit more friendly to have the prototype for the structure calloc to follow the same pattern as the other members for consistency (Principle of Least Surprise): A context pointer, plus the arguments expected by the equivalent C function.

MojoVampire · 2014-04-15T22:20:09Z

Sorry for breaking it up, but the same comment on consistent prototypes mirroring the C standard lib calloc would apply to all the API functions as well, e.g. PyMem_RawCalloc, PyMem_Calloc, PyObject_Calloc and _PyObject_GC_Calloc, not just the structure function pointer.

vstinner · 2014-04-16T02:40:56Z

So what is the point of _PyObject_GC_Calloc ?

It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

_PyObject_GC_Calloc() is used by PyType_GenericAlloc(). If I understand directly, it is the default allocator to allocate Python objects.

vstinner · 2014-04-16T02:49:46Z

In numpy, I found the two following functions:

/*NUMPY_API
 * Allocates memory for array data.
 */
void* PyDataMem_NEW(size_t size);

/*NUMPY_API
 * Allocates zeroed memory for array data.
 */
void* PyDataMem_NEW_ZEROED(size_t size, size_t elsize);

So it looks like it needs two size_t parameters. Prototype of the C function calloc():

void *calloc(size_t nmemb, size_t size);

I agree that it's better to provide the same prototype than calloc().

vstinner · 2014-04-16T04:21:24Z

New patch:

replace "size_t size" with "size_t nelem, size_t elsize" in the prototype of calloc functions (the parameter names come from the POSIX standard)
replace "int calloc" with "int zero" in helper functions

pitrou · 2014-04-16T05:34:58Z

Le 16/04/2014 04:40, STINNER Victor a écrit :

STINNER Victor added the comment:

> So what is the point of _PyObject_GC_Calloc ?

It calls calloc(size) instead of malloc(size)

No, the question is why you didn't simply change _PyObject_GC_Malloc
(which is a private function).

neologix · 2014-04-16T07:18:37Z

> So what is the point of _PyObject_GC_Calloc ?

It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
https://mail.python.org/pipermail/python-dev/2014-April/133985.html

It will only make a difference if the allocated region is large enough
to be allocated by mmap (so not for 90% of objects).

vstinner · 2014-04-16T08:04:32Z

>> So what is the point of _PyObject_GC_Calloc ?
>
> It calls calloc(size) instead of malloc(size)

No, the question is why you didn't simply change _PyObject_GC_Malloc
(which is a private function).

Oh ok, I didn't understand. I don't like changing the behaviour of
functions, but it's maybe fine if the function is private.

vstinner · 2014-04-16T08:06:13Z

2014-04-16 3:18 GMT-04:00 Charles-François Natali <report@bugs.python.org>:

> It calls calloc(size) instead of malloc(size), calloc() which can be faster than malloc()+memset(), see:
> https://mail.python.org/pipermail/python-dev/2014-April/133985.html

It will only make a difference if the allocated region is large enough
to be allocated by mmap (so not for 90% of objects).

Even if there are only 10% of cases where it may be faster, I think
that it's interesting to use calloc() to allocate Python objects. You
may create large Python objects ;-)

I didn't check which objects use (indirectly) _PyObject_GC_Calloc().

skrah · 2014-04-16T09:54:37Z

I left a Rietveld comment, which probably did not get mailed.

pitrou · 2014-04-16T17:47:13Z

On mer., 2014-04-16 at 08:06 +0000, STINNER Victor wrote:

I didn't check which objects use (indirectly) _PyObject_GC_Calloc().

I've checked: lists, tuples, dicts and sets at least seem to use it.
Obviously, objects which are not tracked by the GC (such as str and
bytes) won't use it.

vstinner · 2014-04-16T19:48:02Z

Patch version 3: remove _PyObject_GC_Calloc(), modify _PyObject_GC_Malloc() instead of use calloc() instead of malloc()+memset(0).

neologix · 2014-04-17T07:19:26Z

Do you have benchmarks?

(I'm not looking for an improvement, just no regression.)

jtaylor · 2014-04-17T08:04:36Z

won't replacing _PyObject_GC_Malloc with a calloc cause Var objects (PyObject_NewVar) to be completely zeroed which I think they didn't before?
Some numeric programs stuff a lot of data into var objects and could care about python suddenly setting them to zero when they don't need it.
An example would be tinyarray.

MojoVampire · 2014-04-17T10:35:42Z

Julian: No. See the diff: http://bugs.python.org/review/21233/diff/11644/Objects/typeobject.c

The original GC_Malloc was explicitly memset-ing after confirming that it received a non-NULL pointer from the underlying malloc call; that memset is removed in favor of using the calloc call.

MojoVampire · 2014-04-17T10:39:32Z

Well, to be more specific, PyType_GenericAlloc was originally calling one of two methods that didn't zero the memory (one of which was GC_Malloc), then memset-ing. Just realized you're talking about something else; not sure if you're correct about this now, but I have to get to work, will check later if no one else does.

jtaylor · 2014-04-17T11:35:08Z

I just tested it, PyObject_NewVar seems to use RawMalloc not the GC malloc so its probably fine.

vstinner · 2014-04-27T00:05:50Z

I read again some remarks about alignement, it was suggested to provide allocators providing an address aligned to a requested alignement. This topic was already discussed in bpo-18835.

If Python doesn't provide such memory allocators, it was suggested to provide a "trace" function which can be called on the result of a successful allocator to "trace" an allocation (and a similar function for free). But this is very different from the design of the PEP-445 (new malloc API). Basically, it requires to rewrite the PEP-445.

neologix · 2014-04-27T08:30:36Z

I read again some remarks about alignement, it was suggested to provide allocators providing an address aligned to a requested alignement. This topic was already discussed in bpo-18835.

The alignement issue is really orthogonal to the calloc one, so IMO
this shouldn't be discussed here (and FWIW I don't think we should
expose those: alignement only matters either for concurrency or SIMD
instructions, and I don't think we should try to standardize this kind
of API, it's way to special-purpose (then we'd have to think about
huge pages, etc...). Whereas calloc is a simple and immediately useful
addition, not only for Numpy but also CPython).

vstinner · 2014-04-27T09:51:46Z

2014-04-27 10:30 GMT+02:00 Charles-François Natali <report@bugs.python.org>:

> I read again some remarks about alignement, it was suggested to provide allocators providing an address aligned to a requested alignement. This topic was already discussed in bpo-18835.

The alignement issue is really orthogonal to the calloc one, so IMO
this shouldn't be discussed here (and FWIW I don't think we should
expose those: alignement only matters either for concurrency or SIMD
instructions, and I don't think we should try to standardize this kind
of API, it's way to special-purpose (then we'd have to think about
huge pages, etc...). Whereas calloc is a simple and immediately useful
addition, not only for Numpy but also CPython).

This issue was opened to be able to use tracemalloc on numpy. I would
like to make sure that calloc is enough for numpy. I would prefer to
change the malloc API only once.

neologix · 2014-04-27T10:20:59Z

This issue was opened to be able to use tracemalloc on numpy. I would
like to make sure that calloc is enough for numpy. I would prefer to
change the malloc API only once.

Then please at least rename the issue. Also, I don't see why
everything should be done at once: calloc support is a self-contained
change, which is useful outside of numpy. Enhanced tracemalloc support
for numpy certainly belongs to another issue.

Regarding the *Calloc functions: how about we provide a sane API
instead of reproducing the cumbersome C API?

I mean, why not expose:
PyAPI_FUNC(void *) PyMem_Calloc(size_t size);
insteaf of
PyAPI_FUNC(void *) PyMem_Calloc(size_t nelem, size_t elsize);

AFAICT, the two arguments are purely historical (it was used when
malloc() didn't guarantee suitable alignment, and has the advantage of
performing overflow check when doing the multiplication, but in our
code we always check for it anyway).
See
https://groups.google.com/forum/#!topic/comp.lang.c/jZbiyuYqjB4
http://stackoverflow.com/questions/4083916/two-arguments-to-calloc

And http://www.eglibc.org/cgi-bin/viewvc.cgi/trunk/libc/malloc/malloc.c?view=markup
to check that calloc(nelem, elsize) is implemented as calloc(nelem *
elsize)

I'm also concerned about the change to _PyObject_GC_Malloc(): it now
calls calloc() instead of malloc(): it can definitely be slower.

neologix · 2014-04-27T10:32:47Z

Note to numpy devs: it would be great if some of you followed the
python-dev mailing list (I know it can be quite volume intensive, but
maybe simple filters could help keep the noise down): you guys have
definitely both expertise and real-life applications that could be
very valuable in helping us design the best possible public/private
APIs. It's always great to have downstream experts/end-users!

vstinner · 2014-04-27T10:36:04Z

I wrote a short microbenchmark allocating objects using my benchmark.py script.

It looks like the operation "(None,) * N" is slower with calloc-3.patch, but it's unclear how much times slower it is. I don't understand why only this operation has different speed.

Do you have ideas for other benchmarks?

Using the timeit module:

$ ./python.orig -m timeit '(None,) * 10**5'
1000 loops, best of 3: 357 usec per loop
$ ./python.calloc -m timeit '(None,) * 10**5'
1000 loops, best of 3: 698 usec per loop

But with different parameters, the difference is lower:

$ ./python.orig -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 362 usec per loop
$ ./python.calloc -m timeit -r 20 -n '1000' '(None,) * 10**5'
1000 loops, best of 20: 392 usec per loop

Results of bench_alloc.py:

Common platform:
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Python unicode implementation: PEP-393
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Timer: time.perf_counter
SCM: hg revision=462470859e57+ branch=default date="2014-04-26 19:01 -0400"
Platform: Linux-3.13.8-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64

Platform of campaign orig:
Timer precision: 42 ns
Date: 2014-04-27 12:27:26
Python version: 3.5.0a0 (default:462470859e57, Apr 27 2014, 11:52:55) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

Platform of campaign calloc:
Timer precision: 45 ns
Date: 2014-04-27 12:29:10
Python version: 3.5.0a0 (default:462470859e57+, Apr 27 2014, 12:04:57) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

-----------------------------------+--------------+---------------
Tests                              |         orig |         calloc
-----------------------------------+--------------+---------------
object()                           |    61 ns () |          62 ns
b'A' * 10                          |    55 ns () |    51 ns (-7%)
b'A' * 10**3                       |    99 ns () |          94 ns
b'A' * 10**6                       | 37.5 us () |        36.6 us
'A' * 10                           |    62 ns () |    58 ns (-7%)
'A' * 10**3                        |   107 ns () |         104 ns
'A' * 10**6                        |    37 us () |        36.6 us
'A' * 10**8                        | 16.2 ms () |        16.4 ms
decode 10 null bytes from ASCII    |   253 ns () |         248 ns
decode 10**3 null bytes from ASCII |   359 ns () |         357 ns
decode 10**6 null bytes from ASCII | 78.8 us () |        78.7 us
decode 10**8 null bytes from ASCII | 26.2 ms () |        25.9 ms
(None,) * 10**0                    |    30 ns () |          30 ns
(None,) * 10**1                    |    78 ns () |          77 ns
(None,) * 10**2                    |   427 ns () |   460 ns (+8%)
(None,) * 10**3                    |   3.5 us () |   3.7 us (+6%)
(None,) * 10**4                    | 34.7 us () | 37.2 us (+7%)
(None,) * 10**5                    |   357 us () |   390 us (+9%)
(None,) * 10**6                    | 3.86 ms () | 4.43 ms (+15%)
(None,) * 10**7                    | 50.4 ms () |        50.3 ms
(None,) * 10**8                    |   505 ms () |         504 ms
([None] * 10)[1:-1]                |   121 ns () |         120 ns
([None] * 10**3)[1:-1]             | 3.57 us () |        3.57 us
([None] * 10**6)[1:-1]             | 4.61 ms () |        4.59 ms
([None] * 10**8)[1:-1]             |   585 ms () |         582 ms
-----------------------------------+--------------+---------------
Total                              | 1.19 sec () |       1.19 sec
-----------------------------------+--------------+---------------

pitrou · 2014-04-27T11:02:55Z

Regarding the *Calloc functions: how about we provide a sane API
instead of reproducing the cumbersome C API?

Isn't the point of reproducing the C API to allow quickly switching from calloc() to PyObject_Calloc()?
(besides, it seems the OpenBSD guys like the two-argument form :-))

skrah · 2014-04-28T22:48:12Z

The order of the nelem/elsize matters for readability. Otherwise it is
not intuitive what happens after the jump to redirect in _PyObject_Alloc().

Why would you assert that 'nelem' is one?

njsmith · 2014-04-28T23:33:05Z

It would be interesting to see some NumPy benchmarks (Nathaniel?).

What is it you want to see? NumPy already uses calloc; we benchmarked it when we added it and it made a huge difference to various realistic workloads :-). What NumPy gets out of this isn't calloc, it's access to tracemalloc.

vstinner · 2014-04-29T20:59:52Z

Patch version 6:

I renamed "int zero" parameter to "int use_calloc" and move the new parameter at the first position to avoid confusion with nelem. For example, _PyObject_Alloc(ctx, 1, nbytes, 0) becomes _PyObject_Alloc(0, ctx, 1, nbytes). It also more logical to put it in the first position. In bytesobject.c, I leaved it at the parameter at the end since its meaning is different (fill bytes with zero or not) IMO.
I removed my hack (premature optimization) "assert(nelem == 1); ... malloc(elsize);" and replaced it with a less surprising "... malloc(nelem * elsize);"

Stefan & Charles-François: I hope that the patch looks better to you.

neologix · 2014-04-29T21:14:30Z

LGTM!

vstinner · 2014-04-30T10:02:30Z

@Stefan: Can you please review calloc-6.patch? Charles-François wrote that the patch looks good, but for such critical operation (memory allocation), I would prefer a second review ;)

skrah · 2014-04-30T12:58:52Z

Victor, sure, maybe not right away. If you prefer to commit very soon,
I promise to do a post commit review.

vstinner · 2014-04-30T13:23:43Z

If you prefer to commit very soon,
I promise to do a post commit review.

There is no need to hurry.

python-dev · 2014-05-02T20:31:29Z

New changeset 5b0fda8f5718 by Victor Stinner in branch 'default':
Issue bpo-21233: Add new C functions: PyMem_RawCalloc(), PyMem_Calloc(),
http://hg.python.org/cpython/rev/5b0fda8f5718

vstinner · 2014-05-02T20:35:08Z

There is no need to hurry.

I changed my mind :-p It should be easier for numpy to test the development version of Python.

Let's wait for buildbots.

vstinner · 2014-05-02T21:13:20Z

Antoine Pitrou wrote:

The real use case I envision is with huge powers of two. If I write:
x = 2 ** 1000000

I created the issue bpo-21419 for this idea.

python-dev · 2014-05-02T21:26:17Z

New changeset 62438d1b11c7 by Victor Stinner in branch 'default':
Issue bpo-21233: Oops, Fix _PyObject_Alloc(): initialize nbytes before going to
http://hg.python.org/cpython/rev/62438d1b11c7

skrah · 2014-05-03T19:29:58Z

I did a post-commit review. A couple of things:

I think Victor and I have a different view of the calloc() parameters.

      calloc(size_t nmemb, size_t size)

If a memory region of bytes is allocated, IMO 'nbytes' should be in the
place of 'nmemb' and '1' should be in the place of 'size'. That is,
"allocate nbytes elements of size 1":

      calloc(nbytes, 1)

In the commit the parameters are reversed in many places, which confuses
me quite a bit, since it means "allocate one element of size nbytes".

      calloc(1, nbytes)

I'm not happy with the refactoring in bytearray_init(). I think it would
be safer to make focused minimal changes in PyByteArray_Resize() instead.
In fact, there is a behavior change which isn't correct:

Before:
=======

        >>> x = bytearray(0)
        >>> m = memoryview(x)
        >>> x.__init__(10)
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        BufferError: Existing exports of data: object cannot be re-sized

 Now:
 \====

        >>> x = bytearray(0)
        >>> m = memoryview(x)
        >>> x.__init__(10)
        >>> x[0]
        0
        >>> m[0]
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        IndexError: index out of bounds

Somewhat similarly, I wonder if it was necessary to refactor
PyBytes_FromStringAndSize(). I find the new version more difficult
to understand.
_PyObject_Alloc(): assert(nelem <= PY_SSIZE_T_MAX / elsize) can be called
with elsize = 0.

skrah · 2014-05-03T21:39:15Z

I forgot one thing:

If WITH_VALGRIND is defined, nbytes is uninitialized in _PyObject_Alloc().

skrah · 2014-05-03T21:57:18Z

Another thing:

We need some kind of prominent documentation that existing
programs need to be changed:

Python 3.5.0a0 (default:62438d1b11c7+, May  3 2014, 23:35:03) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import failmalloc
>>> failmalloc.enable()
>>> bytes(1)
Segmentation fault (core dumped)

njsmith · 2014-05-03T22:01:04Z

A simple solution would be to change the name of the struct, so that non-updated libraries will get a compile error instead of a runtime crash.

vstinner · 2014-05-03T22:49:18Z

We need some kind of prominent documentation that existing
programs need to be changed:

My final commit includes an addition to What's New in Python 3.5 doc,
including a notice in the porting section. It is not enough?

Even if the API is public, the PyMemAllocator thing is low level. It's not
part of the stable ABI. Except failmalloc, I don't know any user. I don't
expect a lot of complain and it's easy to port the code.

vstinner · 2014-05-03T22:51:33Z

If WITH_VALGRIND is defined, nbytes is uninitialized in
_PyObject_Alloc().

Did you see my second commit? It's nlt already fixed?

skrah · 2014-05-03T22:59:28Z

> 5) If WITH_VALGRIND is defined, nbytes is uninitialized in
_PyObject_Alloc().

Did you see my second commit? It's nlt already fixed?

I don't think so, I have revision 5d076506b3f5 here.

vstinner · 2014-05-03T23:00:51Z

"allocate nbytes elements of size 1"

PyObject_Malloc(100) asks to allocate one object of 100 bytes.

For PyMem_Malloc() and PyMem_RawMalloc(), it's more difficult to guess, but
IMO it's sane to bet that a single memory block of size bytes is requested.

I consider that char data[100] is a object of 100 bytes, but you call it
100 object of 1 byte.

I don't think that using nelem or elsize matters in practice.

skrah · 2014-05-03T23:25:45Z

STINNER Victor <report@bugs.python.org> wrote:

PyObject_Malloc(100) asks to allocate one object of 100 bytes.

Okay, then let's please call it:

_PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)

_PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)

skrah · 2014-05-04T11:12:04Z

STINNER Victor <report@bugs.python.org> wrote:

My final commit includes an addition to What's New in Python 3.5 doc,
including a notice in the porting section. It is not enough?

I'm not sure: The usual case with ABI changes is that extensions may segfault
if they are *not* recompiled [1]. In that case documenting it in What's New is
standard procedure.

Here the extension *is* recompiled and still segfaults.

Even if the API is public, the PyMemAllocator thing is low level. It's not
part of the stable ABI. Except failmalloc, I don't know any user. I don't
expect a lot of complain and it's easy to port the code.

Perhaps it's worth asking on python-dev. Nathaniel's suggestion isn't bad
either (e.g. name it PyMemAllocatorEx).

[1] I was told on python-dev that many people in fact do not recompile.

python-dev · 2014-05-06T09:32:48Z

New changeset 358a12f4d4bc by Victor Stinner in branch 'default':
Issue bpo-21233: Fix _PyObject_Alloc() when compiled with WITH_VALGRIND defined
http://hg.python.org/cpython/rev/358a12f4d4bc

python-dev · 2014-06-02T19:57:40Z

New changeset 6374c2d957a9 by Victor Stinner in branch 'default':
Issue bpo-21233: Rename the C structure "PyMemAllocator" to "PyMemAllocatorEx" to
http://hg.python.org/cpython/rev/6374c2d957a9

vstinner · 2014-06-02T20:13:25Z

I'm not sure: The usual case with ABI changes is that extensions may segfault if they are *not* recompiled [1].

Ok, I renamed the structure PyMemAllocator to PyMemAllocatorEx, so the compilation fails because PyMemAllocator name is not defined. Modules compiled for Python 3.4 will crash on Python 3.5 if they are not recompiled, but I hope that you recompile your modules when you don't use the stable ABI.

Using PyMemAllocator is now more complex because it depends on the Python version. See for example the patch for pyfailmalloc:
https://bitbucket.org/haypo/pyfailmalloc/commits/9db92f423ac5f060d6ff499ee4bb74ebc0cf4761

Using the C preprocessor, it's possible to limit the changes.

vstinner · 2014-06-02T20:16:21Z

"Okay, then let's please call it:
_PyObject_Calloc(void *ctx, size_t nobjs, size_t objsize)
_PyObject_Alloc(int use_calloc, void *ctx, size_t nobjs, size_t objsize)"

"void * PyMem_RawCalloc(size_t nelem, size_t elsize);" prototype comes from the POSIX standad:
http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html

I'm don't want to change the prototype in Python. Extract of Python documentation:

.. c:function:: void* PyMem_RawCalloc(size_t nelem, size_t elsize)

Allocates *nelem* elements each whose size in bytes is *elsize* (...)

python-dev · 2014-06-02T20:23:26Z

New changeset dff6b4b61cac by Victor Stinner in branch 'default':
Issue bpo-21233: Revert bytearray(int) optimization using calloc()
http://hg.python.org/cpython/rev/dff6b4b61cac

vstinner · 2014-06-02T20:28:08Z

"2) I'm not happy with the refactoring in bytearray_init(). (...)

Somewhat similarly, I wonder if it was necessary to refactor
PyBytes_FromStringAndSize(). (...)"

Ok, I reverted the change on bytearray(int) and opened the issue bpo-21644 to discuss these two optimizations.

vstinner · 2014-06-02T20:29:45Z

I reread the issue. I hope that I now addressed all issues. The remaining issue, bytearray(int) is now tracked by the new issue bpo-21644.

njsmith added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Apr 15, 2014

vstinner closed this as completed Jun 2, 2014

ezio-melotti transferred this issue from another repository Apr 10, 2022

Add *Calloc functions to CPython memory allocation API #65432

Add *Calloc functions to CPython memory allocation API #65432

Comments

njsmith commented Apr 15, 2014

njsmith commented Apr 15, 2014

vstinner commented Apr 15, 2014

pitrou commented Apr 15, 2014

MojoVampire mannequin commented Apr 15, 2014

MojoVampire mannequin commented Apr 15, 2014

MojoVampire mannequin commented Apr 15, 2014

vstinner commented Apr 16, 2014

vstinner commented Apr 16, 2014

vstinner commented Apr 16, 2014

pitrou commented Apr 16, 2014

neologix mannequin commented Apr 16, 2014

vstinner commented Apr 16, 2014

vstinner commented Apr 16, 2014

skrah mannequin commented Apr 16, 2014

pitrou commented Apr 16, 2014

vstinner commented Apr 16, 2014

neologix mannequin commented Apr 17, 2014

jtaylor mannequin commented Apr 17, 2014

MojoVampire mannequin commented Apr 17, 2014

MojoVampire mannequin commented Apr 17, 2014

jtaylor mannequin commented Apr 17, 2014

vstinner commented Apr 27, 2014

neologix mannequin commented Apr 27, 2014

vstinner commented Apr 27, 2014

neologix mannequin commented Apr 27, 2014

neologix mannequin commented Apr 27, 2014

vstinner commented Apr 27, 2014

pitrou commented Apr 27, 2014

skrah mannequin commented Apr 28, 2014

njsmith commented Apr 28, 2014

vstinner commented Apr 29, 2014

neologix mannequin commented Apr 29, 2014

vstinner commented Apr 30, 2014

skrah mannequin commented Apr 30, 2014

vstinner commented Apr 30, 2014

python-dev mannequin commented May 2, 2014

vstinner commented May 2, 2014

vstinner commented May 2, 2014

python-dev mannequin commented May 2, 2014

skrah mannequin commented May 3, 2014

skrah mannequin commented May 3, 2014

skrah mannequin commented May 3, 2014

njsmith commented May 3, 2014

vstinner commented May 3, 2014

vstinner commented May 3, 2014

skrah mannequin commented May 3, 2014

vstinner commented May 3, 2014

skrah mannequin commented May 3, 2014

skrah mannequin commented May 4, 2014

python-dev mannequin commented May 6, 2014

python-dev mannequin commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014

python-dev mannequin commented Jun 2, 2014

vstinner commented Jun 2, 2014

vstinner commented Jun 2, 2014