This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: pymalloc should align to max_align_t
Type: Stage:
Components: C API Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: methane, petr.viktorin, pitrou, ronaldoussoren, vstinner
Priority: normal Keywords:

Created on 2022-03-31 08:59 by petr.viktorin, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg416421 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2022-03-31 08:59
malloc() returns memory that's "suitably aligned for any built-in type".
All of Python's allocation functions should do the same. 

In bpo-27987 (PR-12850, PR-13336), the alignment was raised* to 16 bytes and `long double`. This is OK for current architectures, so there is no practical issue right now.

But, C11 defines a [max_align_t] type which sounds like the *correct* thing to use for determining pymalloc/PyGC_Head alignment.
At least we should assert that obmalloc's ALIGNMENT is a multiple of `alignof(max_align_t)`, and use max_align_t rather than `long double` in PyGC_Head.



See also this python-cffi issue: https://foss.heptapod.net/pypy/cffi/-/issues/531#note_181779

[max_align_t]: https://en.cppreference.com/w/c/types/max_align_t

* (on 64-bit arches)
msg416433 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-31 13:39
On my x86-64 Fedora 35, gcc says 32 bytes for sizeof(max_align_t). By the way, g++ also says 32 bytes for sizeof(std::max_align_t).

GCC 11.2.1 defines max_align_t as:
---
#if (defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L) \
  || (defined(__cplusplus) && __cplusplus >= 201103L)
#ifndef _GCC_MAX_ALIGN_T
#define _GCC_MAX_ALIGN_T
/* Type whose alignment is supported in every context and is at least
   as great as that of any standard type not using alignment
   specifiers.  */
typedef struct {
  long long __max_align_ll __attribute__((__aligned__(__alignof__(long long))));
  long double __max_align_ld __attribute__((__aligned__(__alignof__(long double))));
  /* _Float128 is defined as a basic type, so max_align_t must be
     sufficiently aligned for it.  This code must work in C++, so we
     use __float128 here; that is only available on some
     architectures, but only on i386 is extra alignment needed for
     __float128.  */
#ifdef __i386__
  __float128 __max_align_f128 __attribute__((__aligned__(__alignof(__float128))));
#endif
} max_align_t;
#endif
#endif /* C11 or C++11.  */
---
file: /usr/lib/gcc/x86_64-redhat-linux/11/include/stddef.h

It's not an union but a structure with 2 fields (1 long long, 1 long double). The __i386__ macro is not defined on Linux x86-64, so the struct does not have the 3rd 128-bit float field.

align.c:
---
#include <stddef.h>
int main() { return sizeof(max_align_t); }
---

Build and run (C):
---
$ gcc align.c -o align && (./align; echo $?)
32
---


align.cpp:
---
int main() { return sizeof(std::max_align_t); }
---

Build and run (C++):
---
$ g++ align.cpp -o align && (./align; echo $?)
32
---
msg416434 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-31 13:44
Objects/obmalloc.c currently relies on the SIZEOF_VOID_P macro:
---
#if SIZEOF_VOID_P > 4
#define ALIGNMENT              16               /* must be 2^N */
#define ALIGNMENT_SHIFT         4
#else
#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#endif
---

If we want to respect sizeof(max_align_t) alignment, we can compute sizeof(max_align_t) in configure and uses the result in obmalloc.c. I expect that it's either 16 or 32, so we can maybe just hardcode ALIGNMENT_SHIFT using something like: "if == 32 ... #elif == 16 ... #else #error ...".

On x86 (32-bit) Fedora 35, gcc says 48 for sizeof(max_align_t) which is way larger than the current alignment to 8 bytes!
msg416435 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-03-31 13:47
Oh, it seems like this issue is a duplicate of bpo-31912 created in 2017.
msg416939 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2022-04-07 19:36
> If we want to respect sizeof(max_align_t) alignment, we can compute sizeof(max_align_t) in configure and uses the result in obmalloc.c. I expect that it's either 16 or 32, so we can maybe just hardcode ALIGNMENT_SHIFT using something like: "if == 32 ... #elif == 16 ... #else #error ...".

This should be "alignof(max_align_t)" instead of "sizeof(...)". The size itself is not relevant.

BTW, on macOS/arm64 alignof(max_align_t) is 8, not 16 (as the code seems to expect given the pointer size). This is harmless of course.
History
Date User Action Args
2022-04-11 14:59:57adminsetgithub: 91335
2022-04-07 19:36:04ronaldoussorensetnosy: + ronaldoussoren
messages: + msg416939
2022-03-31 13:47:25vstinnersetmessages: + msg416435
2022-03-31 13:44:19vstinnersetmessages: + msg416434
2022-03-31 13:39:19vstinnersetmessages: + msg416433
2022-03-31 08:59:44petr.viktorincreate