classification
Title: Microoptimize PyType_Lookup for cache hits
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: dino.viehland, methane, pablogsal, vstinner
Priority: normal Keywords: patch

Created on 2021-03-09 19:21 by dino.viehland, last changed 2021-03-27 00:02 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24804 merged dino.viehland, 2021-03-09 19:22
PR 24949 merged pablogsal, 2021-03-20 19:17
Messages (7)
msg388377 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2021-03-09 19:21
The common case going through _PyType_Lookup is to have a cache hit.  There's some small tweaks which can make this a little cheaper:

1) the name field identity is used for a cache hit, and is kept alive by the cache.  So there's no need to read the hash code of the name - instead the address can be used as the hash.

2) There's no need to check if the name is cachable on the lookup either, it probably is, and if it is, it'll be in the cache.

3) If we clear the version tag when invalidating a type then we don't actually need to check for a valid version tag bit.
msg388485 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2021-03-11 08:21
$ ./python -m pyperf compare_to -G --min-speed=3 master.json pytype.json
Slower (1):
- unpack_sequence: 62.2 ns +- 0.6 ns -> 66.1 ns +- 0.9 ns: 1.06x slower

Faster (29):
- nbody: 182 ms +- 1 ms -> 152 ms +- 2 ms: 1.19x faster
- regex_effbot: 4.00 ms +- 0.05 ms -> 3.45 ms +- 0.08 ms: 1.16x faster
- spectral_norm: 178 ms +- 1 ms -> 158 ms +- 2 ms: 1.13x faster
- crypto_pyaes: 140 ms +- 1 ms -> 125 ms +- 1 ms: 1.12x faster
- dulwich_log: 88.0 ms +- 11.8 ms -> 80.1 ms +- 0.5 ms: 1.10x faster
- scimark_fft: 522 ms +- 4 ms -> 477 ms +- 14 ms: 1.09x faster
- mako: 18.2 ms +- 0.1 ms -> 16.7 ms +- 0.1 ms: 1.09x faster
- raytrace: 542 ms +- 4 ms -> 497 ms +- 3 ms: 1.09x faster
- regex_dna: 251 ms +- 1 ms -> 230 ms +- 1 ms: 1.09x faster
- regex_v8: 29.6 ms +- 0.2 ms -> 27.2 ms +- 0.1 ms: 1.09x faster
- pidigits: 227 ms +- 0 ms -> 210 ms +- 0 ms: 1.08x faster
- fannkuch: 564 ms +- 7 ms -> 525 ms +- 4 ms: 1.08x faster
- telco: 7.70 ms +- 0.12 ms -> 7.17 ms +- 0.18 ms: 1.07x faster
- float: 130 ms +- 1 ms -> 122 ms +- 1 ms: 1.07x faster
- unpickle_pure_python: 350 us +- 4 us -> 327 us +- 3 us: 1.07x faster
- scimark_sparse_mat_mult: 6.86 ms +- 0.08 ms -> 6.47 ms +- 0.07 ms: 1.06x faster
- chaos: 122 ms +- 1 ms -> 115 ms +- 1 ms: 1.06x faster
- nqueens: 113 ms +- 1 ms -> 107 ms +- 1 ms: 1.06x faster
- json_dumps: 15.4 ms +- 0.1 ms -> 14.7 ms +- 0.1 ms: 1.05x faster
- pickle_pure_python: 505 us +- 6 us -> 480 us +- 5 us: 1.05x faster
- logging_simple: 9.88 us +- 0.17 us -> 9.40 us +- 0.18 us: 1.05x faster
- deltablue: 8.19 ms +- 0.13 ms -> 7.79 ms +- 0.15 ms: 1.05x faster
- scimark_lu: 190 ms +- 3 ms -> 182 ms +- 5 ms: 1.05x faster
- scimark_monte_carlo: 125 ms +- 2 ms -> 120 ms +- 2 ms: 1.05x faster
- pickle: 11.3 us +- 0.2 us -> 10.8 us +- 0.1 us: 1.05x faster
- logging_silent: 189 ns +- 4 ns -> 182 ns +- 4 ns: 1.04x faster
- pickle_dict: 31.7 us +- 0.2 us -> 30.5 us +- 0.1 us: 1.04x faster
- django_template: 51.6 ms +- 0.7 ms -> 49.8 ms +- 0.6 ms: 1.04x faster
- pyflate: 755 ms +- 7 ms -> 730 ms +- 7 ms: 1.03x faster

Benchmark hidden because not significant (30): 2to3, chameleon, genshi_text, genshi_xml, go, hexiom, json_loads, logging_format, meteor_contest, pathlib, pickle_list, python_startup, python_startup_no_site, regex_compile, richards, scimark_sor, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_expand, sympy_integrate, sympy_sum, sympy_str, tornado_http, unpickle, unpickle_list, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_process

Geometric mean: 1.04x faster
msg388506 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-11 14:16
> Geometric mean: 1.04x faster

Hey, it's good to see my new pyperf feature being used :-D IMO it helps to more easily compare a large set of benchmarks.
msg389134 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2021-03-20 01:16
Setup a micro-benchmark, foo.c:

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <string.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    wchar_t *program = Py_DecodeLocale(argv[0], NULL);
    if (program == NULL) {
        fprintf(stderr, "Fatal error: cannot decode argv[0]\n");
        exit(1);
    }
    Py_SetProgramName(program);  /* optional but recommended */
    Py_Initialize();
    PyObject *pName = PyUnicode_DecodeFSDefault("foo");
    if (pName == NULL) { printf("no foo\n"); PyErr_Print(); }
    PyObject *pModule = PyImport_Import(pName);
    if (pModule == NULL) { printf("no mod\n"); PyErr_Print(); return 0; }
    PyObject *cls = PyObject_GetAttrString(pModule, "C");
    if (cls == NULL) { printf("no cls\n"); }
    PyObject *fs[20];
    for(int i = 0; i<20; i++) {
         char buf[4];
         sprintf(buf, "f%d", i);
         fs[i] = PyUnicode_DecodeFSDefault(buf);
    }
    for(int i = 0; i<100000000; i++) {
         for(int j = 0; j<20; j++) {
             if(_PyType_Lookup(cls, fs[j])==NULL) {
		printf("Uh oh\n");
	     }
         }
    }

   if (Py_FinalizeEx() < 0) {
        exit(120);
    }
    PyMem_RawFree(program);
    return 0;
}


Lib/foo.py:
import time


class C:
    pass

for i in range(20):
    setattr(C, f"f{i}", lambda self: None)


obj hash: 0m6.222s
str hash: 0m6.327s
baseline: 0m6.784s
msg389173 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-20 19:12
New changeset ee48c7d54147ae906776b9f6f96e8920e097d0c4 by Dino Viehland in branch 'master':
bpo-43452: Micro-optimizations to PyType_Lookup (GH-24804)
https://github.com/python/cpython/commit/ee48c7d54147ae906776b9f6f96e8920e097d0c4
msg389418 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-24 00:04
New changeset a054f6b2b1d9f75edfb5ec2247bbf60f07491977 by Pablo Galindo in branch 'master':
bpo-43452: Document the PyType_Lookup optimizations in the What's New for 3.10 (GH-24949)
https://github.com/python/cpython/commit/a054f6b2b1d9f75edfb5ec2247bbf60f07491977
msg389577 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-03-27 00:02
Seems that commit ee48c7d54147ae906776b9f6f96e8920e097d0c4  has introduced a regression:

https://bugs.python.org/issue43636
History
Date User Action Args
2021-03-27 00:02:33pablogsalsetmessages: + msg389577
2021-03-24 00:04:59pablogsalsetmessages: + msg389418
2021-03-20 19:17:30pablogsalsetpull_requests: + pull_request23707
2021-03-20 19:13:03pablogsalsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-03-20 19:12:13pablogsalsetnosy: + pablogsal
messages: + msg389173
2021-03-20 01:16:25dino.viehlandsetmessages: + msg389134
2021-03-11 14:16:50vstinnersetnosy: + vstinner
messages: + msg388506
2021-03-11 08:21:25methanesetnosy: + methane
messages: + msg388485
2021-03-09 19:22:28dino.viehlandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request23572
2021-03-09 19:21:17dino.viehlandcreate