> In 3.8, the union used to ensure alignment on a C double is gone.

Note that two uintptr_t fields take 16 bytes on 64-bit platforms and 8 bytes on 32-bit platforms.

Python 3.7 is worse than 3.8.
It uses "double dummy", which aligns to 8 bytes, not 16 bytes.
We should use "long double" to align to 16 bytes.

But that means +8 bytes for every tuple.  If we backport PR-12850 to 3.7, it is +8 bytes for half of the tuples and +16 bytes for the rest.

Any ideas on how to reduce the impact for Python 3.7?
For example, could we add an 8-byte dummy field to PyGC_Head and have tuple use it for the hash?  That probably breaks the ABI, though...  Not a chance...

I wonder if we can add -fmax-type-align=8 for extension types...
