msg274688 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2016-09-07 00:17 |
ubsan complains about unaligned access when structs include "long double". An example error:
runtime error: member access within misaligned address 0x7f77dbba9798 for type 'struct CDataObject', which requires 16 byte alignment
This is because (on x86 anyway) long double is 16 bytes long and requires that alignment, but obmalloc only gives an 8-byte alignment. (glibc malloc() gives 16-byte alignment.)
I'm attaching a POC patch. I don't know what the impact of increasing the alignment is on obmalloc's performance or memory usage. It's also unfortunate that this patch increases the size of PyGC_Head to 32 bytes from 24 bytes. One can imagine a more middle-ground solution to this by allowing types to specify their required alignment.
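To make the ubsan report concrete, here is a minimal sketch of the alignment check it implies. The struct `demo_object` and the helper `is_aligned` are hypothetical names for illustration and are not part of ctypes or CPython. The address from the error message is a multiple of 8 but not of 16.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct with a long double member;
   this is not the real ctypes CDataObject. */
struct demo_object {
    void *header;
    long double value;
};

/* True if addr is a multiple of align (align must be a power of two). */
static bool is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

A caller could test any pointer with `is_aligned((uintptr_t)p, alignof(struct demo_object))`; the required alignment itself is platform-dependent.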
|
msg304873 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2017-10-24 08:48 |
What do we do if at some point a C type requires a larger alignment (for example a vector type readable using AVX512 instructions)?
|
msg304874 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-10-24 09:00 |
> ubsan complains about unaligned access when structs include "long double". An example error:
> runtime error: member access within misaligned address 0x7f77dbba9798 for type 'struct CDataObject', which requires 16 byte alignment
Can we use memcpy() to prevent such issues?
|
msg304912 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2017-10-24 14:41 |
My suggestion would be to pass alignof(type) into the allocator via macro. Then the allocator could at least assert it's providing good enough alignment if not provide the correct alignment.
I believe 16-byte alignment is special because it's glibc's malloc's default. So "normal" code shouldn't really be expecting anything better than 16-byte alignment. Code with higher alignment requirements will have to use APIs like the one proposed in #18835.
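A rough sketch of the "pass alignof(type) via macro" idea, under stated assumptions: `demo_alloc_checked` and `DEMO_NEW` are hypothetical names, and plain malloc() stands in for the real allocator. The allocator only asserts that the alignment is good enough; it does not try to provide more.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator entry point: receives the caller's alignment
   requirement and asserts that the returned block satisfies it.
   malloc() is only a stand-in for the real allocator. */
static void *demo_alloc_checked(size_t size, size_t align)
{
    void *p = malloc(size);
    assert(align != 0);
    assert(p == NULL || (uintptr_t)p % align == 0);
    return p;
}

/* Hypothetical macro: forwards sizeof and alignof of the type. */
#define DEMO_NEW(type) \
    ((type *)demo_alloc_checked(sizeof(type), alignof(type)))
```

With this shape, `DEMO_NEW(long double)` would trip the assertion in a debug build if the underlying allocator only guaranteed 8-byte alignment on a platform where long double needs 16.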
|
msg304917 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2017-10-24 15:18 |
> My suggestion would be to pass alignof(type) into the allocator via macro.
Do you mean using some new PyMem_ function? Or as a new tp_ field on the type declaration?
|
msg304918 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-10-24 15:33 |
What matters when a Python object is allocated? The start of the PyObject structure, or the start of the PyGC_Head structure? Would it be possible to align the PyObject start?
The simplest option is to store data which needs to be aligned in a second memory block allocated by PyMem_AlignedAlloc().
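The "second memory block" option can be sketched as follows. PyMem_AlignedAlloc() is only a proposed name in this discussion, so the sketch uses C11 aligned_alloc() as a stand-in; `demo_vector`, `DEMO_ALIGN` and the helper functions are hypothetical, and aligned_alloc() is not available on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_ALIGN 64  /* assumed requirement, e.g. for wide vector loads */

/* Hypothetical object: the header lives in an ordinary block, the
   payload that needs strict alignment lives in a second block. */
struct demo_vector {
    size_t length;
    double *data;  /* separately allocated, aligned to DEMO_ALIGN */
};

static struct demo_vector *demo_vector_new(size_t length)
{
    struct demo_vector *v;
    size_t bytes;

    assert(length <= (SIZE_MAX - DEMO_ALIGN) / sizeof(double));
    v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    /* aligned_alloc() expects a size that is a multiple of the alignment. */
    bytes = (length * sizeof(double) + DEMO_ALIGN - 1) / DEMO_ALIGN * DEMO_ALIGN;
    if (bytes == 0)
        bytes = DEMO_ALIGN;
    v->data = aligned_alloc(DEMO_ALIGN, bytes);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    v->length = length;
    return v;
}

static void demo_vector_free(struct demo_vector *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}
```

The cost of this layout is one extra allocation and one extra pointer dereference per object, in exchange for leaving the object header's alignment untouched.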
|
msg304919 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-10-24 15:35 |
Change by Antoine Pitrou: "versions: -Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6"
The undefined behaviour exists and should be fixed in Python 2.7 and 3.6, no? Can we use memcpy()?
|
msg304922 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2017-10-24 16:02 |
> Can we use memcpy()?
Hmm, perhaps. Do you want to try it out (and measure any performance degradation)?
|
msg304925 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2017-10-24 16:24 |
Since we have "#define PYMEM_FUNCS PYOBJ_FUNCS", I think extensions that
use PyMem_Malloc() also won't get the glibc max_align_t alignment.
But I guess technically they should.
|
msg304959 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2017-10-25 05:36 |
Yes, we could memcpy things around to obtain the desired alignment. It would be nicer to have a builtin solution, though.
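For reference, the memcpy() approach looks roughly like this. The helper names are hypothetical; the point is that memcpy() has no alignment requirement, so the value is never accessed through a misaligned long double pointer.

```c
#include <assert.h>
#include <string.h>

/* Read a long double from a buffer position that may be misaligned. */
static long double load_long_double(const unsigned char *src)
{
    long double value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Write a long double to a buffer position that may be misaligned. */
static void store_long_double(unsigned char *dst, long double value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}
```

Compilers commonly turn such fixed-size copies into plain loads and stores, but whether that holds here would need to be measured, as suggested above.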
|
msg305847 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2017-11-08 14:14 |
alignment.patch: + long double dummy; /* force worst-case alignment */
Would it be possible to use max_align_t mentioned by Stefan, at least when this type is available?
What is the impact of the patch on objects size?
|
msg305939 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2017-11-09 06:25 |
On Wed, Nov 8, 2017, at 06:14, STINNER Victor wrote:
>
> STINNER Victor <victor.stinner@gmail.com> added the comment:
>
> alignment.patch: + long double dummy; /* force worst-case alignment
> */
>
> Would it be possible to use max_align_t mentioned by Stefan, at least
> when this type is available?
Yes, that would be the correct thing to do. I was looking for the quick
hack.
>
> What is the impact of the patch on objects size?
On 64-bit platforms, I believe it wastes a word for GC objects.
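A small sketch of why the dummy member costs space. The two unions below are hypothetical stand-ins for a GC header with three pointer-sized fields; they differ only in the dummy member. On a typical x86-64 Linux build one would expect 24 versus 32 bytes, which matches "wastes a word", but the exact figures are platform-dependent.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Header-like union aligned for double only. */
union demo_head_double {
    struct { void *next; void *prev; ptrdiff_t refs; } gc;
    double dummy;
};

/* Same fields, but aligned for long double. */
union demo_head_long_double {
    struct { void *next; void *prev; ptrdiff_t refs; } gc;
    long double dummy;
};

/* The size of a type is always a multiple of its alignment. */
static_assert(sizeof(union demo_head_long_double)
                  % alignof(union demo_head_long_double) == 0,
              "size must be a multiple of alignment");

/* Extra bytes paid per header for the stricter alignment. */
static size_t demo_head_overhead(void)
{
    return sizeof(union demo_head_long_double)
         - sizeof(union demo_head_double);
}
```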
|
msg305985 - (view) |
Author: Neil Schemenauer (nascheme) * |
Date: 2017-11-09 21:35 |
FYI, this would seem to be an incentive to get my "bitmaps for small GC objects" idea implemented. I.e.
https://mail.python.org/pipermail/python-dev/2017-September/149307.html
If implemented, the extra size of the PyGC_Head would only apply to "large" objects. In my prototype, I'm thinking of using 512 bytes as the size limit for small GC objects.
|
msg311323 - (view) |
Author: Florian Weimer (fweimer) |
Date: 2018-01-31 10:57 |
This bug causes miscompilation of Python 2.7 by GCC 8 on x86-64 (with no sanitizers enabled, just compiler optimization).
I think this is a fairly conservative way of papering over the issue:
https://mail.python.org/pipermail/python-dev/2018-January/152011.html
|
msg340120 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-04-12 22:32 |
While this issue looked purely theoretical to me 3 years ago, it is now very concrete: bpo-36618 "clang expects memory aligned on 16 bytes, but pymalloc aligns to 8 bytes".
|
msg340179 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-04-14 03:15 |
PyGC_Head is now 16 bytes on 64-bit platforms.
Maybe we should just change obmalloc in Python 3.8?
What about 32-bit platforms?
What can we do for Python 3.7 and 2.7?
|
msg340254 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-04-15 10:17 |
The PyGC_Head structure size depends on the Python version. Sizes on 64-bit:
* 2.7: 32 bytes
* 3.6, 3.7: 24 bytes
* 3.8 (master): 16 bytes
bpo-36618 "clang expects memory aligned on 16 bytes, but pymalloc aligns to 8 bytes" should be even worse on 3.7: 24 is not a multiple of 16. I don't understand why nobody saw this alignment issue previously. Maybe clang only became stricter about 16-byte alignment recently?
2.7:

    typedef union _gc_head {
        struct {
            union _gc_head *gc_next;
            union _gc_head *gc_prev;
            Py_ssize_t gc_refs;
        } gc;
        double dummy; /* Force at least 8-byte alignment. */
        char dummy_padding[sizeof(union _gc_head_old)];
    } PyGC_Head;

3.7:

    typedef union _gc_head {
        struct {
            union _gc_head *gc_next;
            union _gc_head *gc_prev;
            Py_ssize_t gc_refs;
        } gc;
        double dummy; /* force worst-case alignment */
    } PyGC_Head;

3.8:

    typedef struct {
        // Pointer to next object in the list.
        // 0 means the object is not tracked
        uintptr_t _gc_next;
        // Pointer to previous object in the list.
        // Lowest two bits are used for flags documented later.
        uintptr_t _gc_prev;
    } PyGC_Head;

In 3.8, the union used to ensure alignment on a C double is gone.
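The effect of the header size on the object that follows it can be checked with simple arithmetic. The sketch below assumes the allocated block itself starts on an `align`-byte boundary and that the object begins immediately after the header; `object_start_aligned` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Given a block that starts on an align-byte boundary, report whether
   the object placed right after a header of header_size bytes also
   starts on an align-byte boundary. */
static bool object_start_aligned(size_t header_size, size_t align)
{
    assert(align != 0);
    return header_size % align == 0;
}
```

With the sizes listed above, the 32-byte and 16-byte headers keep a 16-byte-aligned block aligned, while the 24-byte header shifts the object to an offset that is only 8-byte aligned.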
|
msg340264 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-04-15 11:21 |
I had not noticed bpo-33374 changed Python 2.7.
I don't know why it caused segv only for Python 2.7.
|
msg340265 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-04-15 11:24 |
The x86-64 ABI requires that memory allocated on the heap is aligned to 16 bytes.
On x86-64, glibc malloc(size) aligns to 16 bytes for size >= 16 and to 8 bytes otherwise, so glibc doesn't strictly respect the ABI. I understand that a compiler will not use instructions which require 16-byte alignment on a memory block smaller than 16 bytes, so 8-byte alignment for size < 16 should be fine *in practice*.
Python objects are at least 16 bytes because of the PyObject header. Moreover, objects tracked by the GC get an additional 16-byte header from PyGC_Head.
But pymalloc is also used for PyMem_Malloc() since Python 3.6, and PyMem_Malloc() is used to allocate things which are not PyObject.
|
msg340266 - (view) |
Author: Florian Weimer (fweimer) |
Date: 2019-04-15 11:40 |
Minor correction: glibc malloc follows ABI on x86-64 and always returns a 16-byte-aligned pointer, independently of allocation size.
However, other mallocs (such as jemalloc and tcmalloc) may return pointers with less alignment for allocation sizes less than 16 bytes, violating ABI. They still follow ABI for allocations of 16 bytes and more.
But as you said, the distinction should not matter for Python because of the object header. Furthermore, without LTO, the compiler will not be able to detect that a pointer returned from Py_NewObject is a top-level allocation, and therefore has to be more conservative about alignment, using information from the type definitions only.
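Claims about what a given malloc() returns are easy to probe empirically. The following sketch uses hypothetical helper names and reports the largest power of two (capped at 4096) that divides a pointer's address. A single observation does not prove a guarantee; it only shows what one call returned.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest power of two, capped at 4096, that divides the address. */
static size_t pointer_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    size_t align = 1;

    assert(p != NULL);
    while (align < 4096 && (addr & align) == 0)
        align <<= 1;
    return align;
}

/* Observed alignment of one malloc(size) result, or 0 on failure. */
static size_t malloc_alignment(size_t size)
{
    void *p = malloc(size);
    size_t align;

    if (p == NULL)
        return 0;
    align = pointer_alignment(p);
    free(p);
    return align;
}
```

Comparing `malloc_alignment(8)` with `malloc_alignment(64)` under different allocators is one way to observe the small-size behaviour described above.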
|
msg340330 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-04-16 10:14 |
> In 3.8, the union used to ensure alignment on a C double is gone.
Note that two uintptr_t members make the header 16 bytes on 64-bit platforms and 8 bytes on 32-bit platforms.
Python 3.7 is worse than 3.8.
It used "double dummy" to align to 8 bytes, not 16 bytes.
We should use "long double" to align to 16 bytes.
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
But that means +8 bytes for all tuples. If we also backport PR-12850 to 3.7, it is +8 bytes for half of the tuples and +16 bytes for the remaining tuples.
Any ideas for reducing the impact on Python 3.7?
For example, could we add an 8-byte dummy to PyGC_Head and let tuple use the dummy for its hash? That probably breaks the ABI, so it is not an option.
I wonder if we can add -fmax-type-align=8 for extension types...
|
msg340331 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-04-16 10:20 |
> I wonder if we can add -fmax-type-align=8 for extension types...
No, we cannot: it's a temporary fix. The flag causes a compilation error if it's passed to an old version of clang or to a C compiler other than clang.
> Any ideas for reducing the impact on Python 3.7?
I don't think that it's a matter of performance here. What matters the most here is correctness.
See Florian Weimer's message:
https://bugs.python.org/issue36618#msg340261
"This issue potentially affects all compilers, not just Clang."
|
msg342440 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-05-14 09:48 |
$ ./python -m perf compare_to master.json align16.json -G --min-speed=1
Slower (13):
- pickle_list: 4.40 us +- 0.03 us -> 4.59 us +- 0.04 us: 1.04x slower (+4%)
- xml_etree_iterparse: 129 ms +- 2 ms -> 133 ms +- 2 ms: 1.04x slower (+4%)
- regex_dna: 201 ms +- 2 ms -> 207 ms +- 2 ms: 1.03x slower (+3%)
- scimark_sparse_mat_mult: 5.75 ms +- 0.01 ms -> 5.90 ms +- 0.05 ms: 1.03x slower (+3%)
- float: 147 ms +- 1 ms -> 151 ms +- 1 ms: 1.02x slower (+2%)
- unpickle: 19.0 us +- 0.6 us -> 19.4 us +- 0.7 us: 1.02x slower (+2%)
- nbody: 173 ms +- 1 ms -> 176 ms +- 1 ms: 1.02x slower (+2%)
- pickle: 12.4 us +- 0.1 us -> 12.6 us +- 0.2 us: 1.02x slower (+2%)
- html5lib: 121 ms +- 3 ms -> 123 ms +- 4 ms: 1.02x slower (+2%)
- unpickle_list: 4.88 us +- 0.04 us -> 4.95 us +- 0.12 us: 1.02x slower (+2%)
- xml_etree_process: 107 ms +- 1 ms -> 109 ms +- 1 ms: 1.01x slower (+1%)
- regex_effbot: 3.60 ms +- 0.05 ms -> 3.65 ms +- 0.03 ms: 1.01x slower (+1%)
- xml_etree_parse: 185 ms +- 1 ms -> 187 ms +- 3 ms: 1.01x slower (+1%)
Faster (11):
- nqueens: 134 ms +- 1 ms -> 130 ms +- 1 ms: 1.03x faster (-2%)
- chameleon: 14.4 ms +- 0.2 ms -> 14.1 ms +- 0.1 ms: 1.02x faster (-2%)
- pathlib: 25.7 ms +- 0.3 ms -> 25.3 ms +- 0.2 ms: 1.02x faster (-2%)
- django_template: 170 ms +- 2 ms -> 167 ms +- 3 ms: 1.01x faster (-1%)
- sympy_expand: 549 ms +- 19 ms -> 542 ms +- 8 ms: 1.01x faster (-1%)
- dulwich_log: 90.0 ms +- 0.7 ms -> 88.9 ms +- 0.7 ms: 1.01x faster (-1%)
- richards: 111 ms +- 1 ms -> 110 ms +- 1 ms: 1.01x faster (-1%)
- json_dumps: 16.7 ms +- 0.3 ms -> 16.6 ms +- 0.2 ms: 1.01x faster (-1%)
- fannkuch: 572 ms +- 4 ms -> 566 ms +- 2 ms: 1.01x faster (-1%)
- meteor_contest: 130 ms +- 1 ms -> 129 ms +- 1 ms: 1.01x faster (-1%)
- logging_format: 14.6 us +- 0.2 us -> 14.4 us +- 0.2 us: 1.01x faster (-1%)
Benchmark hidden because not significant (33): 2to3, chaos, crypto_pyaes, deltablue, go, hexiom, json_loads, logging_silent, logging_simple, mako, pickle_dict, pickle_pure_python, pidigits, python_startup, python_startup_no_site, raytrace, regex_compile, regex_v8, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sor, spectral_norm, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_integrate, sympy_sum, sympy_str, telco, unpack_sequence, unpickle_pure_python, xml_etree_generate
|
msg342441 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-05-14 09:49 |
$ ./python -m perf compare_to master-mem.json align16-mem.json -G --min-speed=2
Slower (30):
- float: 20.6 MB +- 12.6 kB -> 23.8 MB +- 30.3 kB: 1.16x slower (+16%)
- mako: 14.3 MB +- 760.5 kB -> 15.1 MB +- 54.1 kB: 1.06x slower (+6%)
- xml_etree_iterparse: 11.1 MB +- 11.8 kB -> 11.6 MB +- 22.1 kB: 1.05x slower (+5%)
- html5lib: 19.0 MB +- 31.0 kB -> 19.8 MB +- 51.8 kB: 1.04x slower (+4%)
- dulwich_log: 10.9 MB +- 133.1 kB -> 11.3 MB +- 29.1 kB: 1.03x slower (+3%)
- json_dumps: 7907.6 kB +- 6242 bytes -> 8156.0 kB +- 23.9 kB: 1.03x slower (+3%)
- sympy_str: 33.5 MB +- 17.5 kB -> 34.5 MB +- 23.2 kB: 1.03x slower (+3%)
- deltablue: 8163.2 kB +- 9220 bytes -> 8391.2 kB +- 15.6 kB: 1.03x slower (+3%)
- pathlib: 8296.0 kB +- 15.0 kB -> 8526.4 kB +- 33.4 kB: 1.03x slower (+3%)
- xml_etree_generate: 11.8 MB +- 87.2 kB -> 12.2 MB +- 108.2 kB: 1.03x slower (+3%)
- sympy_expand: 32.7 MB +- 15.1 kB -> 33.6 MB +- 19.9 kB: 1.03x slower (+3%)
- richards: 7081.6 kB +- 16.2 kB -> 7270.8 kB +- 55.2 kB: 1.03x slower (+3%)
- pickle: 7244.4 kB +- 12.1 kB -> 7436.4 kB +- 57.1 kB: 1.03x slower (+3%)
- pickle_pure_python: 7267.2 kB +- 12.2 kB -> 7455.2 kB +- 48.2 kB: 1.03x slower (+3%)
- pickle_dict: 7258.8 kB +- 27.6 kB -> 7446.0 kB +- 36.7 kB: 1.03x slower (+3%)
- hexiom: 7168.8 kB +- 25.1 kB -> 7352.8 kB +- 59.8 kB: 1.03x slower (+3%)
- raytrace: 7373.6 kB +- 17.0 kB -> 7562.8 kB +- 44.3 kB: 1.03x slower (+3%)
- pickle_list: 7246.8 kB +- 9067 bytes -> 7431.2 kB +- 60.2 kB: 1.03x slower (+3%)
- spectral_norm: 6913.2 kB +- 5127 bytes -> 7087.2 kB +- 39.8 kB: 1.03x slower (+3%)
- sympy_integrate: 32.6 MB +- 24.9 kB -> 33.4 MB +- 36.0 kB: 1.02x slower (+2%)
- regex_compile: 8188.4 kB +- 10.9 kB -> 8388.8 kB +- 27.2 kB: 1.02x slower (+2%)
- nqueens: 7153.6 kB +- 17.2 kB -> 7328.4 kB +- 38.3 kB: 1.02x slower (+2%)
- sqlalchemy_declarative: 18.1 MB +- 40.9 kB -> 18.5 MB +- 50.0 kB: 1.02x slower (+2%)
- django_template: 18.4 MB +- 50.2 kB -> 18.8 MB +- 23.7 kB: 1.02x slower (+2%)
- sympy_sum: 52.1 MB +- 30.8 kB -> 53.4 MB +- 26.7 kB: 1.02x slower (+2%)
- regex_v8: 8208.0 kB +- 11.2 kB -> 8399.2 kB +- 43.2 kB: 1.02x slower (+2%)
- sqlalchemy_imperative: 17.4 MB +- 51.0 kB -> 17.8 MB +- 47.9 kB: 1.02x slower (+2%)
- json_loads: 7025.6 kB +- 71.0 kB -> 7173.6 kB +- 9098 bytes: 1.02x slower (+2%)
- xml_etree_process: 11.6 MB +- 160.1 kB -> 11.8 MB +- 141.1 kB: 1.02x slower (+2%)
- logging_silent: 7275.6 kB +- 37.5 kB -> 7425.2 kB +- 41.9 kB: 1.02x slower (+2%)
Faster (1):
- sqlite_synth: 8469.6 kB +- 44.7 kB -> 8197.6 kB +- 44.7 kB: 1.03x faster (-3%)
Benchmark hidden because not significant (26): 2to3, chameleon, chaos, crypto_pyaes, fannkuch, go, logging_format, logging_simple, meteor_contest, nbody, pidigits, python_startup, python_startup_no_site, regex_dna, regex_effbot, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, telco, unpack_sequence, unpickle, unpickle_list, unpickle_pure_python, xml_etree_parse
|
msg342443 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-05-14 09:51 |
New changeset f0be4bbb9b3cee876249c23f2ae6f38f43fa7495 by Inada Naoki in branch 'master':
bpo-27987: pymalloc: align by 16bytes on 64bit platform (GH-12850)
https://github.com/python/cpython/commit/f0be4bbb9b3cee876249c23f2ae6f38f43fa7495
|
msg342445 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2019-05-14 10:04 |
+16% for float seems pretty high though.
|
msg342447 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-05-14 10:18 |
Yes. sys.getsizeof(3.14) is 24, and it becomes 32 bytes with 16-byte-aligned pymalloc (+33%).
FYI, jemalloc also has 8, 16 and 32 byte size classes, but no 24.
http://jemalloc.net/jemalloc.3.html#size_classes
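The +33% figure follows from rounding the request up to the next size class. A minimal sketch, assuming size classes are simply multiples of the alignment (`round_up` is a hypothetical helper, not the pymalloc code):

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (a power of two). */
static size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

A 24-byte request stays at 24 bytes with 8-byte classes and becomes 32 bytes with 16-byte classes, and 32 / 24 is roughly 1.33.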
|
msg342575 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-05-15 14:31 |
New changeset f24a9f3bf42709fb97b954b6dd6f90853967712e by Victor Stinner in branch '2.7':
bpo-27987: pymalloc: align by 16bytes on 64bit platform (GH-12850) (GH-13319)
https://github.com/python/cpython/commit/f24a9f3bf42709fb97b954b6dd6f90853967712e
|
msg343133 - (view) |
Author: Neil Schemenauer (nascheme) * |
Date: 2019-05-22 00:02 |
> sys.getsizeof(3.14) is 24, and it becomes 32 bytes with 16-byte-aligned pymalloc (+33%).
I've been doing some reading and trying to understand this issue. My understanding is that malloc() needs to return pointers that are 16-byte aligned on AMD64 but, in general, pointers don't have to be aligned that way. If you have a structure that contains a "long double" then that member also has to be 16-byte aligned.
It seems to me that we don't need to have the PyObject structure containing a Python float to be 16-byte aligned. If so, could we introduce a new obmalloc API that returns memory with 8-byte alignment, for use by objects that know they don't require 16-byte alignment? floatobject.c could use this API to avoid the 33% overhead.
The new obmalloc API could initially be internal use only until we can come up with a design we know we can live with long term.
|
msg343134 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2019-05-22 00:09 |
> It seems to me that we don't need to have the PyObject structure containing a Python float to be 16-byte aligned. If so, could we introduce a new obmalloc API that returns memory with 8-byte alignment, for use by objects that know they don't require 16-byte alignment? floatobject.c could use this API to avoid the 33% overhead.
PyMem_Malloc / PyObject_Malloc only have one parameter: "size". It knows nothing about the allocated structure.
bpo-18835 discussed the idea of adding a new API which accepts an alignment parameter. The issue was closed because of the lack of a concrete use case.
In the clang crash bpo-36618 (which prompted us to fix this issue), the C alignof operator was discussed:
https://bugs.python.org/issue36618#msg340279
Copy of serge-sans-paille's comment:
"@vstinner: once you have a portable version of alignof, you can decide to *not* use the pool allocator if the required alignment is greater than 8B, or you could modify the pool allocator to take alignment information as an extra parameter?"
|
msg343218 - (view) |
Author: Neil Schemenauer (nascheme) * |
Date: 2019-05-22 16:43 |
We now have a concrete use case. ;-)
My idea was that we can introduce a new, CPython internal API that
aligns on 8-byte boundaries (or takes alignment as a parameter). The
API would be a stop-gap measure. We can use the API to reduce
the overhead for specific types.
E.g. for non-subclasses of float, we know the PyObject structure does
not need 16-byte alignment. We don't need a version of "alignof" to know
this. Inside floatobject.c, we could call the new obmalloc API that
gives new memory with 8-byte alignment. That would save the 33%
overhead.
E.g. in PyFloat_FromDouble, rather than:
PyObject_MALLOC(sizeof(PyFloatObject))
we could call something like:
_PyObject_MALLOC_ALIGNED(sizeof(PyFloatObject), 8)
This internal API would not be a permanent solution. Having to manually
fix each place that PyObjects are allocated and hard-coding the required
alignment is not the best solution. We can only fix specific types and
extension modules would always get the 16-byte alignment. Still, by
tweaking some of the most common types, we avoid much of the overhead
for the alignment change, at least for the average Python program.
In the long term, we would need a better solution. E.g. an API that can
take alignment requirements as a parameter. Or, a solution I like
better, have types use PyObject_New(). Then, add an alignment
specifier to the type object (e.g. tp_align to go along with tp_basicsize).
Then there does not have to be a new public API that takes alignment.
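To illustrate the tp_align idea, here is a toy sketch. `demo_type` and `demo_block_size` are hypothetical; CPython's type object has no such alignment field, and the real allocator is more involved. The sketch only shows how a per-type alignment could drive the block size a size-class allocator hands out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type descriptor. "align" plays the role of
   the tp_align field floated above. */
struct demo_type {
    size_t basicsize;
    size_t align;  /* power of two */
};

/* Block size if the allocator rounds up only to the alignment the
   type asks for, instead of a fixed 16 bytes for everyone. */
static size_t demo_block_size(const struct demo_type *type)
{
    size_t a = type->align;

    assert(a != 0 && (a & (a - 1)) == 0);
    return (type->basicsize + a - 1) & ~(a - 1);
}
```

Under these assumptions, a float-like type declaring 8-byte alignment keeps its 24-byte block, while a type that really contains a long double pays for the 32-byte block.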
|
msg343219 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-05-22 16:45 |
Neil, I don't see the point of having this discussion here.
|
msg343471 - (view) |
Author: Inada Naoki (methane) * |
Date: 2019-05-25 12:13 |
New changeset ea2b76bdc5f97f49701213d105b8ec2387ea2fa5 by Inada Naoki in branch '3.7':
bpo-27987: align PyGC_Head to alignof(long double) (GH-13335)
https://github.com/python/cpython/commit/ea2b76bdc5f97f49701213d105b8ec2387ea2fa5
|
msg343489 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2019-05-25 15:57 |
long double was changed to double seven years ago to avoid a different kind of undefined behavior... https://github.com/python/cpython/commit/e348c8d154cf6342c79d627ebfe89dfe9de23817#diff-fb41bdaf12f733cf6ab8a82677d03adc
We are going in circles here.
Submitting that PR to 3.7 caused the undefined behavior sanitizer buildbot to go back to reporting a ton more damage.
build before: https://buildbot.python.org/all/#/builders/137/builds/878
52k lines of test stdio
build after: https://buildbot.python.org/all/#/builders/137/builds/879
4900k lines of test stdio
|
msg343492 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2019-05-25 17:09 |
commit reverted in https://github.com/python/cpython/commit/2156fec1f7a8f9972e90cdbaf404e3fd9eaccb35
|
msg343493 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-05-25 17:18 |
New changeset 1b85f4ec45a5d63188ee3866bd55eb29fdec7fbf by Miss Islington (bot) in branch '3.7':
bpo-27987: pymalloc: align by 16bytes on 64bit platform (GH-12850)
https://github.com/python/cpython/commit/1b85f4ec45a5d63188ee3866bd55eb29fdec7fbf
|
msg343495 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2019-05-25 17:29 |
I'm not marking this bug as "Fixed" as the original complaint about obmalloc'd structs with a long double not being aligned is still going to be true on 32-bit platforms for 2.7 - 3.7. We've merely increased the obmalloc alignment to 16-bytes on 64-bit platforms.
So the problem should only remain for 32-bit users, which at this point are likely only ARM (rpi and similar low-end friends not using a 64-bit OS).
|
msg343498 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-05-25 17:40 |
Here is what I've found for (32-bit) ARM:
- "long double" is 8 bytes long, so it's probably the same as "double"
(see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0274b/index.html)
- the standard alignment for "double" is 8 bytes
(see http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042d/IHI0042D_aapcs.pdf)
And on (32-bit) x86, it looks like the standard alignment for "long double" is 4 bytes:
https://www.codesynthesis.com/~boris/blog/2009/04/06/cxx-data-alignment-portability/
So I don't think there's anything to change on 32-bit Python builds *if* we only really care about ARM and x86 (which is restrictive, but using "long double" in C extension types is a bit of an exotic issue).
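The figures above can be checked on any given target by asking the compiler directly. A minimal sketch (the function name is hypothetical) that formats the sizes and alignments for the platform it is compiled on:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Write the size and alignment of double and long double, plus the
   alignment of max_align_t, into buf. Returns snprintf()'s result. */
static int describe_alignment(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "double: size %zu align %zu; "
                    "long double: size %zu align %zu; "
                    "max_align_t: align %zu",
                    sizeof(double), alignof(double),
                    sizeof(long double), alignof(long double),
                    alignof(max_align_t));
}
```

The output depends on the compiler and ABI, so it is best read as a per-platform check rather than a general statement.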
|
msg343499 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-05-25 17:41 |
And of course, someone who has this issue can at worst recompile Python without pymalloc.
|
msg343500 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2019-05-25 18:25 |
If someone runs into an actual need for this on 32-bit builds, please provide details and feel free to reopen the issue. Closing, as I don't believe there is any more for us to do.
|
msg344364 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2019-06-03 01:51 |
New changeset 8766cb74e186d3820db0a855ccd780d6d84461f7 by Gregory P. Smith (Inada Naoki) in branch '3.7':
[3.7] bpo-27987: align PyGC_Head to alignof(long double) (GH-13335) (GH-13581)
https://github.com/python/cpython/commit/8766cb74e186d3820db0a855ccd780d6d84461f7
|
|
Date | User | Action | Args |
2022-04-11 14:58:35 | admin | set | github: 72174 |
2019-06-03 02:02:46 | vstinner | set | resolution: wont fix -> fixed; versions: - Python 2.7, Python 3.6, Python 3.9 |
2019-06-03 01:51:34 | gregory.p.smith | set | messages: + msg344364 |
2019-05-26 06:21:15 | methane | set | pull_requests: + pull_request13488 |
2019-05-25 18:25:09 | gregory.p.smith | set | status: open -> closed; resolution: wont fix; messages: + msg343500; stage: needs patch -> resolved |
2019-05-25 17:41:46 | pitrou | set | messages: + msg343499 |
2019-05-25 17:40:17 | pitrou | set | messages: + msg343498 |
2019-05-25 17:29:57 | gregory.p.smith | set | stage: patch review -> needs patch; messages: + msg343495; versions: + Python 2.7, Python 3.6, Python 3.8, Python 3.9 |
2019-05-25 17:18:38 | miss-islington | set | nosy: + miss-islington; messages: + msg343493 |
2019-05-25 17:10:52 | gregory.p.smith | set | versions: - Python 2.7, Python 3.8, Python 3.9 |
2019-05-25 17:09:01 | gregory.p.smith | set | messages: + msg343492 |
2019-05-25 16:18:59 | gregory.p.smith | set | stage: needs patch -> patch review; pull_requests: + pull_request13478 |
2019-05-25 15:57:23 | gregory.p.smith | set | status: closed -> open; resolution: fixed -> (no value); messages: + msg343489; stage: resolved -> needs patch |
2019-05-25 12:13:59 | methane | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
2019-05-25 12:13:36 | methane | set | messages: + msg343471 |
2019-05-22 16:45:26 | pitrou | set | messages: + msg343219 |
2019-05-22 16:43:43 | nascheme | set | messages: + msg343218 |
2019-05-22 00:09:17 | vstinner | set | messages: + msg343134 |
2019-05-22 00:02:56 | nascheme | set | messages: + msg343133 |
2019-05-15 14:31:34 | vstinner | set | messages: + msg342575 |
2019-05-15 10:41:09 | methane | set | pull_requests: + pull_request13248 |
2019-05-15 09:30:55 | methane | set | pull_requests: + pull_request13247 |
2019-05-14 16:41:09 | vstinner | set | pull_requests: + pull_request13230 |
2019-05-14 16:36:23 | miss-islington | set | pull_requests: + pull_request13229 |
2019-05-14 10:18:48 | methane | set | messages: + msg342447 |
2019-05-14 10:04:15 | skrah | set | messages: + msg342445 |
2019-05-14 09:51:20 | methane | set | messages: + msg342443 |
2019-05-14 09:49:10 | methane | set | messages: + msg342441 |
2019-05-14 09:48:20 | methane | set | messages: + msg342440 |
2019-04-16 10:20:56 | vstinner | set | messages: + msg340331 |
2019-04-16 10:14:54 | methane | set | messages: + msg340330 |
2019-04-16 01:54:37 | methane | set | stage: patch review; pull_requests: + pull_request12775 |
2019-04-15 11:40:27 | fweimer | set | messages: + msg340266 |
2019-04-15 11:24:14 | vstinner | set | messages: + msg340265 |
2019-04-15 11:21:31 | methane | set | messages: + msg340264 |
2019-04-15 10:17:06 | vstinner | set | messages: + msg340254 |
2019-04-14 03:15:46 | methane | set | messages: + msg340179 |
2019-04-13 01:08:48 | methane | set | nosy: + methane |
2019-04-12 22:40:53 | gregory.p.smith | set | versions: + Python 3.8, Python 3.9 |
2019-04-12 22:32:05 | vstinner | set | messages: + msg340120 |
2018-04-27 21:30:22 | tgrigg | set | nosy: + tgrigg |
2018-01-31 10:57:24 | fweimer | set | versions: + Python 2.7 |
2018-01-31 10:57:18 | fweimer | set | nosy: + fweimer; messages: + msg311323 |
2018-01-31 00:12:39 | gregory.p.smith | set | nosy: + twouters, gregory.p.smith |
2017-11-09 21:35:54 | nascheme | set | nosy: + nascheme; messages: + msg305985 |
2017-11-09 06:25:35 | benjamin.peterson | set | messages: + msg305939 |
2017-11-08 14:14:26 | vstinner | set | messages: + msg305847 |
2017-10-25 05:36:39 | benjamin.peterson | set | messages: + msg304959 |
2017-10-24 16:24:36 | skrah | set | nosy: + skrah; messages: + msg304925 |
2017-10-24 16:02:43 | pitrou | set | messages: + msg304922 |
2017-10-24 15:35:23 | vstinner | set | messages: + msg304919 |
2017-10-24 15:33:57 | vstinner | set | messages: + msg304918 |
2017-10-24 15:19:04 | pitrou | set | versions: - Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6 |
2017-10-24 15:18:43 | pitrou | set | messages: + msg304917 |
2017-10-24 14:41:45 | benjamin.peterson | set | messages: + msg304912 |
2017-10-24 09:00:09 | vstinner | set | nosy: + vstinner; messages: + msg304874 |
2017-10-24 08:48:14 | pitrou | set | nosy: + pitrou; messages: + msg304873 |
2016-09-07 00:17:10 | benjamin.peterson | create | |