Author Ma Lin
Recipients Greg Price, Ma Lin, aeros, mark.dickinson, rhettinger, sir-sigurd
Date 2019-09-03.04:02:44
Message-id <1567483364.75.0.314519224234.issue38015@roundup.psfhosted.org>
In-reply-to
Content
Commit 5e63ab0 replaced the macro with this inline function:

    static inline int
    is_small_int(long long ival)
    {
        return -NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS;
    }

(by default, NSMALLNEGINTS is 5, NSMALLPOSINTS is 257)


However, when this function is called with a `value` for which `sizeof(value) < sizeof(long long)`, an unnecessary type conversion takes place.

For example, on a 32-bit platform, a `value` of type `Py_ssize_t` has to be converted to the 8-byte `long long` type.

The following assembly code is the beginning of the `PyLong_FromSsize_t(Py_ssize_t v)` function.
(32-bit x86 build generated by GCC 9.2 with the `-m32 -O2` options)

Using the macro, before commit 5e63ab0:
        mov     eax, DWORD PTR [esp+4]
        add     eax, 5
        cmp     eax, 261
        ja      .L2
        sal     eax, 4
        add     eax, OFFSET FLAT:small_ints
        add     DWORD PTR [eax], 1
        ret
.L2:    jmp     PyLong_FromSsize_t_rest(int)

Using the inline function:
        push    ebx
        mov     eax, DWORD PTR [esp+8]
        mov     edx, 261
        mov     ecx, eax
        mov     ebx, eax
        sar     ebx, 31
        add     ecx, 5
        adc     ebx, 0
        cmp     edx, ecx
        mov     edx, 0
        sbb     edx, ebx
        jc      .L7
        cwde
        sal     eax, 4
        add     eax, OFFSET FLAT:small_ints+80
        add     DWORD PTR [eax], 1
        pop     ebx
        ret
.L7:    pop     ebx
        jmp     PyLong_FromSsize_t_rest(int)

On the 32-bit x86 platform, an 8-byte `long long` is implemented using two registers, so the machine code is much longer than that of the macro version.
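
As a rough illustration (this is not code from CPython or from demo.c, and the helper names are hypothetical), both listings above perform the range test as "add 5, then one unsigned comparison against 261". Only the operand width differs:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: the test -5 <= v < 257 rewritten as
   "add 5, then one unsigned compare against 261". */

/* 32-bit operand: fits in a single x86 register. */
static int in_range_32(int32_t v)
{
    return (uint32_t)v + 5u <= 261u;
}

/* 64-bit operand: on 32-bit x86 this needs a register pair. */
static int in_range_64(int64_t v)
{
    return (uint64_t)v + 5u <= 261u;
}

/* Self-check: both widths agree at the edges of the small-int range. */
static void check_boundaries(void)
{
    assert(!in_range_32(-6) && !in_range_64(-6));
    assert(in_range_32(-5) && in_range_64(-5));
    assert(in_range_32(256) && in_range_64(256));
    assert(!in_range_32(257) && !in_range_64(257));
}
```

On a 32-bit x86 target the second helper has to carry the addition and the comparison across two registers, which is what the `adc` and `sbb` instructions in the inline-function listing do.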

At least these hot functions are affected:
  PyObject* PyLong_FromSsize_t(Py_ssize_t v)
  PyObject* PyLong_FromLong(long v)

Replacing the inline function with a macro version will fix this:
#define IS_SMALL_INT(ival) (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS)
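
A minimal sketch of the two forms side by side, assuming the default values of NSMALLNEGINTS and NSMALLPOSINTS. The `ssize32_t` typedef and the two wrapper functions are hypothetical stand-ins for `Py_ssize_t` on a 32-bit build and for callers such as `PyLong_FromSsize_t`:

```c
#include <assert.h>
#include <stdint.h>

#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

/* Function form: every argument is converted to long long first. */
static inline int
is_small_int(long long ival)
{
    return -NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS;
}

/* Macro form: the comparison is done in the argument's own type. */
#define IS_SMALL_INT(ival) (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS)

/* Hypothetical stand-in for Py_ssize_t on a 32-bit build. */
typedef int32_t ssize32_t;

static int small_via_function(ssize32_t v) { return is_small_int(v); }
static int small_via_macro(ssize32_t v)    { return IS_SMALL_INT(v); }

/* Both forms must classify every value identically. */
static void check_same_result(void)
{
    for (ssize32_t v = -1000; v <= 1000; v++)
        assert(small_via_function(v) == small_via_macro(v));
}
```

The results are the same for signed arguments; the difference is only in the width of the comparison. Note that the macro evaluates its argument twice and is only correct for signed types, so it should not be given an expression with side effects or an unsigned value.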

If you want to see the assembly code generated by major compilers, you can paste the attached file demo.c into https://godbolt.org/
- demo.c was originally written by Greg Price.
- Use `-m32 -O2` to generate a 32-bit build.