Author petdance
Recipients pablogsal, petdance
Date 2020-01-09.02:59:50
I did some experimenting with the lookup table vs. the switch.

The relevant diff (not including the patches to the code generator) is:

--- Parser/token.c
+++ Parser/token.c
@@ -77,31 +77,36 @@
 PyToken_OneChar(int c1)
-    switch (c1) {
-    case '%': return PERCENT;
-    case '&': return AMPER;
-    case '(': return LPAR;
-    case ')': return RPAR;
-    case '*': return STAR;
-    case '+': return PLUS;
-    case ',': return COMMA;
-    case '-': return MINUS;
-    case '.': return DOT;
-    case '/': return SLASH;
-    case ':': return COLON;
-    case ';': return SEMI;
-    case '<': return LESS;
-    case '=': return EQUAL;
-    case '>': return GREATER;
-    case '@': return AT;
-    case '[': return LSQB;
-    case ']': return RSQB;
-    case '^': return CIRCUMFLEX;
-    case '{': return LBRACE;
-    case '|': return VBAR;
-    case '}': return RBRACE;
-    case '~': return TILDE;
-    }
+    static char op_lookup[] = {
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        PERCENT,   AMPER,     OP,
+        LPAR,      RPAR,      STAR,      PLUS,      COMMA,
+        MINUS,     DOT,       SLASH,     OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        COLON,     SEMI,
+        LESS,      EQUAL,     GREATER,   OP,        AT,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        LSQB,      OP,        RSQB,      CIRCUMFLEX,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        OP,        OP,
+        OP,        OP,        OP,        LBRACE,    VBAR,
+        RBRACE,    TILDE
+    };
+    if (c1>=37 && c1<=126)
+        return op_lookup[c1];
     return OP;
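
A positional table like the one above is easy to get off by one, so it can help to check it against the original switch. The following is a minimal sketch of such a check, not the code from the patch: the token values in the enum are placeholders (the real ones come from CPython's generated token header), the names one_char_switch, one_char_table, op_table and check_equivalence are hypothetical, and the table is written with C99 designated initializers on the assumption of an ASCII execution character set.

```c
#include <assert.h>

/* Placeholder token codes for illustration only; the real values come
   from CPython's generated token header and will differ. */
enum {
    OP = 1, PERCENT, AMPER, LPAR, RPAR, STAR, PLUS, COMMA, MINUS, DOT,
    SLASH, COLON, SEMI, LESS, EQUAL, GREATER, AT, LSQB, RSQB,
    CIRCUMFLEX, LBRACE, VBAR, RBRACE, TILDE
};

/* Reference version: the switch from the "-" side of the diff. */
static int one_char_switch(int c1)
{
    switch (c1) {
    case '%': return PERCENT;
    case '&': return AMPER;
    case '(': return LPAR;
    case ')': return RPAR;
    case '*': return STAR;
    case '+': return PLUS;
    case ',': return COMMA;
    case '-': return MINUS;
    case '.': return DOT;
    case '/': return SLASH;
    case ':': return COLON;
    case ';': return SEMI;
    case '<': return LESS;
    case '=': return EQUAL;
    case '>': return GREATER;
    case '@': return AT;
    case '[': return LSQB;
    case ']': return RSQB;
    case '^': return CIRCUMFLEX;
    case '{': return LBRACE;
    case '|': return VBAR;
    case '}': return RBRACE;
    case '~': return TILDE;
    }
    return OP;
}

/* Table version. Designated initializers leave every unlisted slot 0,
   so 0 must not be a valid token code (OP starts at 1 above). */
static const unsigned char op_table[127] = {
    ['%'] = PERCENT, ['&'] = AMPER,  ['('] = LPAR,   [')'] = RPAR,
    ['*'] = STAR,    ['+'] = PLUS,   [','] = COMMA,  ['-'] = MINUS,
    ['.'] = DOT,     ['/'] = SLASH,  [':'] = COLON,  [';'] = SEMI,
    ['<'] = LESS,    ['='] = EQUAL,  ['>'] = GREATER, ['@'] = AT,
    ['['] = LSQB,    [']'] = RSQB,   ['^'] = CIRCUMFLEX,
    ['{'] = LBRACE,  ['|'] = VBAR,   ['}'] = RBRACE, ['~'] = TILDE
};

static int one_char_table(int c1)
{
    /* Bounds check first, then treat an empty (0) slot as "plain OP". */
    if (c1 >= 0 && c1 < (int)sizeof(op_table) && op_table[c1] != 0)
        return op_table[c1];
    return OP;
}

/* Compare both versions over a range wider than ASCII, including
   negative values; assert() aborts on the first mismatch. */
static void check_equivalence(void)
{
    for (int c = -256; c <= 512; c++)
        assert(one_char_switch(c) == one_char_table(c));
}
```

Calling check_equivalence() from a small test driver exercises every index of the table plus out-of-range inputs. Indexing the initializers by character constant trades a slightly unusual syntax for not having to count OP entries by hand.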

To test the speed change, I couldn't use pyperformance, because the only
thing I wanted to time was the actual compilation of the code.  My
solution for this was to find the 100 largest *.py files in the cpython
repo and compile them like so:

    python -m py_compile $(List-of-big-*.py-files)

The speedup was significant: My table-driven lookup ran the compile tests
about 10% faster than the existing switch approach.  That was without
--enable-optimizations in my configure.

However, as pablogsal suspected, with PGO enabled, the two approaches ran
the code at pretty much the same speed.

I do think that there may be merit in using a table-driven approach that
generates less code and doesn't rely on PGO speeding things up.

If anyone's interested, all my work is on branch Issue39150 in my fork