Message156905
PyASCIIObject.state can be reduced from 32 bits to 8, moved to the end of the structure (swapping wstr and state), and the structure packed. This shrinks the structure by 3 bytes (the type of state changes from int to char).
I expect little or no performance overhead, because only the PyASCIIObject.state field is affected and that field is 8 bits wide.
See also issue #14419, which relies on memory alignment (of the ASCII string data) to optimize the ASCII decoder. If I understand correctly, my patch rules out that optimization.
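The layout change described above can be sketched with simplified stand-in structures. This is only an illustration: the `demo_*` names are hypothetical, the fields merely mirror the ones named in this message, and `__attribute__((packed))` is a GCC/Clang extension, so the real CPython definitions and the actual patch may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the unpatched layout (NOT the real CPython
   definition): a 32-bit state field sits before wstr. */
struct demo_unpatched {
    void *header[2];      /* stands in for the object header */
    ptrdiff_t length;
    ptrdiff_t hash;
    unsigned int state;   /* 32 bits */
    void *wstr;
};

/* Hypothetical stand-in for the proposed layout: state is shrunk to
   8 bits, swapped with wstr so it comes last, and the structure is
   packed so no tail padding is added (GCC/Clang extension). */
struct __attribute__((packed)) demo_patched {
    void *header[2];
    ptrdiff_t length;
    ptrdiff_t hash;
    void *wstr;
    unsigned char state;  /* 8 bits, last member */
};

/* Number of bytes saved by the packed layout on the current platform. */
size_t demo_bytes_saved(void)
{
    return sizeof(struct demo_unpatched) - sizeof(struct demo_patched);
}
```

On a 32-bit Linux build, where pointers and `ptrdiff_t` are 4 bytes, these stand-ins should measure 24 and 21 bytes, in line with the figures in the example below. On 64-bit platforms the saving is larger, because the unpacked layout also pads after the 32-bit field.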
--
Example on Linux 32 bits:
$ cat x.c
#include <Python.h>
int main(void)
{
    /* sizeof yields a size_t, so print it with %zu rather than %u */
    printf("sizeof(PyASCIIObject)=%zu bytes\n", sizeof(PyASCIIObject));
    printf("sizeof(PyCompactUnicodeObject)=%zu bytes\n", sizeof(PyCompactUnicodeObject));
    printf("sizeof(PyUnicodeObject)=%zu bytes\n", sizeof(PyUnicodeObject));
    return 0;
}
# unpatched
$ gcc -I Include/ -I . x.c -o x && ./x
sizeof(PyASCIIObject)=24 bytes
sizeof(PyCompactUnicodeObject)=36 bytes
sizeof(PyUnicodeObject)=40 bytes
# pack the 3 structures
$ gcc -I Include/ -I . x.c -o x && ./x
sizeof(PyASCIIObject)=21 bytes
sizeof(PyCompactUnicodeObject)=33 bytes
sizeof(PyUnicodeObject)=37 bytes
--
We could also pack PyCompactUnicodeObject and PyUnicodeObject, but that would hurt performance because utf8_length, utf8, wstr_length and data would no longer be aligned.
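The alignment concern can be illustrated with a rough sketch. The `demo_*` structures below are hypothetical stand-ins, not the real CPython types, and `__attribute__((packed))` is assumed to be available (GCC/Clang). The sketch compares where utf8_length lands when the extending structure is left unpacked and when it is packed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packed base whose size is odd because it ends in a
   1-byte state field (NOT the real CPython definition). */
struct __attribute__((packed)) demo_base {
    void *header[2];
    ptrdiff_t length;
    ptrdiff_t hash;
    void *wstr;
    unsigned char state;
};

/* Extension left unpacked: the compiler pads after the base so that
   utf8_length starts on a naturally aligned offset. */
struct demo_compact_aligned {
    struct demo_base base;
    ptrdiff_t utf8_length;
    char *utf8;
    ptrdiff_t wstr_length;
};

/* Extension packed as well: utf8_length starts right after the base,
   at an odd offset. */
struct __attribute__((packed)) demo_compact_packed {
    struct demo_base base;
    ptrdiff_t utf8_length;
    char *utf8;
    ptrdiff_t wstr_length;
};

/* Returns 1 if offset is a multiple of alignment, 0 otherwise. */
int demo_is_aligned(size_t offset, size_t alignment)
{
    return offset % alignment == 0;
}
```

Checking `offsetof(struct demo_compact_packed, utf8_length)` against `_Alignof(ptrdiff_t)` with `demo_is_aligned` shows the packed member at a misaligned offset, while the unpacked variant keeps it aligned at the cost of padding bytes.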
Date | User | Action | Args
2012-03-27 11:14:19 | vstinner | set | recipients: + vstinner, loewis, pitrou, serhiy.storchaka
2012-03-27 11:14:18 | vstinner | set | messageid: <1332846858.96.0.396992270129.issue14422@psf.upfronthosting.co.za>
2012-03-27 11:14:17 | vstinner | link | issue14422 messages
2012-03-27 11:14:17 | vstinner | create |