classification
Title: "Zero cost" exception handling
Type: performance Stage: patch review
Components: Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Mark.Shannon Nosy List: Dennis Sweeney, Guido.van.Rossum, Mark.Shannon, Yonatan Goldschmidt, ammar2, chris.jerdonek, corona10, erlendaasland, gvanrossum, hauntsaninja, pablogsal, petr.viktorin, rhettinger, scoder, serhiy.storchaka
Priority: release blocker Keywords: patch

Created on 2020-04-08 09:41 by Mark.Shannon, last changed 2021-05-12 10:25 by Mark.Shannon.

Pull Requests
URL Status Linked Edit
PR 25729 merged Mark.Shannon, 2021-04-30 13:27
PR 26012 merged Dennis Sweeney, 2021-05-10 06:39
PR 26021 merged Mark.Shannon, 2021-05-10 16:46
PR 26059 merged Mark.Shannon, 2021-05-12 10:25
Messages (28)
msg365974 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2020-04-08 09:41
C++ and Java support what is known as "zero cost" exception handling.
The "zero cost" refers to the cost when no exception is raised. There is still a cost when exceptions are thrown.

The basic principle is that the compiler generates tables indicating where control should be transferred to when an exception is raised. When no exception is raised, there is no runtime overhead.

(C)Python should support "zero cost" exceptions.


Now that the bytecodes for exception handling are regular (meaning that their stack effect can be statically determined) it is possible for the bytecode compiler to emit exception handling tables.

Doing so would have two main benefits.
1. "try" and "with" statements would be faster (and "async for", but that is an implementation detail).
2. Calls to Python functions would be faster as frame objects would be considerably smaller. Currently each frame carries 240 bytes of overhead for exception handling.
msg365975 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-08 10:23
+1! I was going to implement this, but first I wanted to implement support of line number ranges instead of just line numbers (co_lineno). We need to design some compact portable format for address to address mapping (or address range to address mapping if it is more efficient).

Are you already working on this Mark? I would be glad to make a review.
msg366976 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-22 05:38
This is an exciting prospect.  Am looking forward to it :-)
msg370612 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-06-02 15:28
+1
msg370642 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-06-03 06:07
To clarify, would there be any observable difference in behavior aside from speed? And would there be any limitations in when the speedup can be applied?
msg370645 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-06-03 08:39
The only observable changes will be changes in the code object: new attributes and constructor parameters, changed .pyc format, dis output, etc.
msg391509 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-21 10:23
The changes to pyc format aren't user visible so shouldn't matter,
but what about the dis output?

Consider this program:

def f():
    try:
        1/0
    except:
        return "fail"

Currently it compiles to:
  2           0 SETUP_FINALLY            7 (to 16)

  3           2 LOAD_CONST               1 (1)
              4 LOAD_CONST               2 (0)
              6 BINARY_TRUE_DIVIDE
              8 POP_TOP
             10 POP_BLOCK
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

  4     >>   16 POP_TOP
             18 POP_TOP
             20 POP_TOP

  5          22 POP_EXCEPT
             24 LOAD_CONST               3 ('fail')
             26 RETURN_VALUE

With zero-cost exception handling, it will compile to something like:  
  2           0 NOP
  3           2 LOAD_CONST               1 (1)
              4 LOAD_CONST               2 (0)
              6 BINARY_TRUE_DIVIDE
              8 POP_TOP
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE

  None       14 PUSH_EXCEPT

  4          16 POP_TOP
             18 POP_TOP
             20 POP_TOP

  5          22 POP_EXCEPT
             24 LOAD_CONST               3 ('fail')
             26 RETURN_VALUE

(There are additional optimizations that should be applied, but those are a separate issue)

The problem is that the exception handling flow is no longer visible.
Should we add it back in somehow, or just append the exception jump table?
msg391513 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-21 12:01
We can add a new column for the offset or the index of the error handler. Or add pseudo-instructions (which do not correspond to any bytecode) at boundaries of the code with some error handler.
msg391522 - (view) Author: Guido van Rossum (Guido.van.Rossum) Date: 2021-04-21 14:13
I like Serhiy’s idea.

BTW, what are the three POP_TOP op codes in a row popping?
msg392305 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-29 13:31
> BTW, what are the three POP_TOP op codes in a row popping?

When exceptions are pushed to the stack, they are pushed as a triple: (exc, type, traceback)
so when we pop them, we need three pops.
msg392306 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-29 13:37
Responding to Serhiy's suggestions:
1 Add another column:
   Adding another column makes for lots of repetition in larger try blocks, and pushes useful information further to the right.
2 Add pseudo-instructions
  I find those misleading, as they aren't really there, and probably won't even correspond to the original SETUP_XXX instructions.
msg392309 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-29 13:53
I've played around with a few formats, and what I've ended up with is this:

1. Use the >> marker for for exception targets, as well as normal branch targets.
2. Add a text version of the exception handler table at the end of the disassembly.

This has all the information, without too much visual clutter.

The function `f` above looks like this:
>>> dis.dis(f)
  2           0 NOP

  3           2 LOAD_CONST               1 (1)
              4 LOAD_CONST               2 (0)
              6 BINARY_TRUE_DIVIDE
              8 POP_TOP
             10 NOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE
        >>   16 NOP
             18 PUSH_EXC_INFO

  4          20 POP_TOP
             22 POP_TOP
             24 POP_TOP

  5          26 NOP
             28 POP_EXCEPT
             30 LOAD_CONST               3 ('fail')
             32 RETURN_VALUE
        >>   34 POP_EXCEPT_AND_RERAISE
ExceptionTable:
  2 to 8 -> 16 (depth 0)
  18 to 24 -> 34 (depth 3) lasti


The 'lasti' field indicates that the offset of the last instruction is pushed to the stack, which is needed for cleanup-then-reraise code.
msg393321 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-09 14:54
This seems to have broken the address sanitizer buildbot:

https://buildbot.python.org/all/#/builders/582/builds/116/steps/5/logs/stdio

Example error:

================================================================
==28597==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000011cbd at pc 0x55e7e3cceedc bp 0x7ffc74448490 sp 0x7ffc74448480
READ of size 1 at 0x604000011cbd thread T0
    #0 0x55e7e3cceedb in skip_to_next_entry Python/ceval.c:4798
    #1 0x55e7e3cceedb in get_exception_handler Python/ceval.c:4866
    #2 0x55e7e3cceedb in _PyEval_EvalFrameDefault Python/ceval.c:4465
    #3 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #4 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #5 0x55e7e3d1922e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #6 0x55e7e3d1922e in object_vacall Objects/call.c:734
    #7 0x55e7e3d1ec50 in _PyObject_CallMethodIdObjArgs Objects/call.c:825
    #8 0x55e7e3f49dd7 in import_find_and_load Python/import.c:1499
    #9 0x55e7e3f49dd7 in PyImport_ImportModuleLevelObject Python/import.c:1600
    #10 0x55e7e3cd839b in import_name Python/ceval.c:6101
    #11 0x55e7e3cd839b in _PyEval_EvalFrameDefault Python/ceval.c:3693
    #12 0x55e7e3ed09ea in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #13 0x55e7e3ed09ea in _PyEval_Vector Python/ceval.c:5160
    #14 0x55e7e3ed09ea in PyEval_EvalCode Python/ceval.c:1136
    #15 0x55e7e420b908 in builtin_exec_impl Python/bltinmodule.c:1065
    #16 0x55e7e420b908 in builtin_exec Python/clinic/bltinmodule.c.h:371
    #17 0x55e7e4196590 in cfunction_vectorcall_FASTCALL Objects/methodobject.c:426
    #18 0x55e7e3d1a592 in PyVectorcall_Call Objects/call.c:255
    #19 0x55e7e3cd15f4 in do_call_core Python/ceval.c:6028
    #20 0x55e7e3cd15f4 in _PyEval_EvalFrameDefault Python/ceval.c:4283
    #21 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #22 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #23 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #24 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #25 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #26 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #27 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #28 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #29 0x55e7e3cd4384 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #30 0x55e7e3cd4384 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #31 0x55e7e3cd4384 in call_function Python/ceval.c:5976
    #32 0x55e7e3cd4384 in _PyEval_EvalFrameDefault Python/ceval.c:4204
    #33 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #34 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #35 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #36 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #37 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #38 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #39 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #40 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #41 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #42 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #43 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #44 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #45 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #46 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #47 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #48 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #49 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #50 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #51 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #52 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #53 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #54 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #55 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #56 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #57 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #58 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #59 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #60 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #61 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #62 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #63 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #64 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #65 0x55e7e3cd71ad in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #66 0x55e7e3cd71ad in PyObject_Vectorcall Include/cpython/abstract.h:123
    #67 0x55e7e3cd71ad in call_function Python/ceval.c:5976
    #68 0x55e7e3cd71ad in _PyEval_EvalFrameDefault Python/ceval.c:4237
    #69 0x55e7e3ed09ea in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #70 0x55e7e3ed09ea in _PyEval_Vector Python/ceval.c:5160
    #71 0x55e7e3ed09ea in PyEval_EvalCode Python/ceval.c:1136
    #72 0x55e7e420b908 in builtin_exec_impl Python/bltinmodule.c:1065
    #73 0x55e7e420b908 in builtin_exec Python/clinic/bltinmodule.c.h:371
    #74 0x55e7e4196590 in cfunction_vectorcall_FASTCALL Objects/methodobject.c:426
    #75 0x55e7e3d1a592 in PyVectorcall_Call Objects/call.c:255
    #76 0x55e7e3cd15f4 in do_call_core Python/ceval.c:6028
    #77 0x55e7e3cd15f4 in _PyEval_EvalFrameDefault Python/ceval.c:4283
    #78 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #79 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #80 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #81 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #82 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #83 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #84 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #85 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #86 0x55e7e3cd4384 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #87 0x55e7e3cd4384 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #88 0x55e7e3cd4384 in call_function Python/ceval.c:5976
    #89 0x55e7e3cd4384 in _PyEval_EvalFrameDefault Python/ceval.c:4204
    #90 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #91 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #92 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #93 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #94 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #95 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #96 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #97 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #98 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #99 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #100 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #101 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #102 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #103 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #104 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #105 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #106 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #107 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #108 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #109 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #110 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #111 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #112 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #113 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #114 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #115 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #116 0x55e7e3cd424e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #117 0x55e7e3cd424e in PyObject_Vectorcall Include/cpython/abstract.h:123
    #118 0x55e7e3cd424e in call_function Python/ceval.c:5976
    #119 0x55e7e3cd424e in _PyEval_EvalFrameDefault Python/ceval.c:4187
    #120 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #121 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #122 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #123 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #124 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #125 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #126 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #127 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #128 0x55e7e3cd71ad in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #129 0x55e7e3cd71ad in PyObject_Vectorcall Include/cpython/abstract.h:123
    #130 0x55e7e3cd71ad in call_function Python/ceval.c:5976
    #131 0x55e7e3cd71ad in _PyEval_EvalFrameDefault Python/ceval.c:4237
    #132 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #133 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #134 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #135 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #136 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #137 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #138 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #139 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #140 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #141 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #142 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #143 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #144 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #145 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #146 0x55e7e3cd4384 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #147 0x55e7e3cd4384 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #148 0x55e7e3cd4384 in call_function Python/ceval.c:5976
    #149 0x55e7e3cd4384 in _PyEval_EvalFrameDefault Python/ceval.c:4204
    #150 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #151 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #152 0x55e7e3cd4384 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #153 0x55e7e3cd4384 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #154 0x55e7e3cd4384 in call_function Python/ceval.c:5976
    #155 0x55e7e3cd4384 in _PyEval_EvalFrameDefault Python/ceval.c:4204
    #156 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #157 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #158 0x55e7e4150719 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #159 0x55e7e4150719 in method_vectorcall Objects/classobject.c:53
    #160 0x55e7e3d1a663 in PyVectorcall_Call Objects/call.c:267
    #161 0x55e7e3cd15f4 in do_call_core Python/ceval.c:6028
    #162 0x55e7e3cd15f4 in _PyEval_EvalFrameDefault Python/ceval.c:4283
    #163 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #164 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #165 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #166 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #167 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #168 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #169 0x55e7e3ed09ea in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #170 0x55e7e3ed09ea in _PyEval_Vector Python/ceval.c:5160
    #171 0x55e7e3ed09ea in PyEval_EvalCode Python/ceval.c:1136
    #172 0x55e7e420b908 in builtin_exec_impl Python/bltinmodule.c:1065
    #173 0x55e7e420b908 in builtin_exec Python/clinic/bltinmodule.c.h:371
    #174 0x55e7e4196590 in cfunction_vectorcall_FASTCALL Objects/methodobject.c:426
    #175 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #176 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #177 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #178 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #179 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #180 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #181 0x55e7e3ce0934 in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #182 0x55e7e3ce0934 in PyObject_Vectorcall Include/cpython/abstract.h:123
    #183 0x55e7e3ce0934 in call_function Python/ceval.c:5976
    #184 0x55e7e3ce0934 in _PyEval_EvalFrameDefault Python/ceval.c:4219
    #185 0x55e7e3ed10d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #186 0x55e7e3ed10d7 in _PyEval_Vector Python/ceval.c:5160
    #187 0x55e7e3d1a592 in PyVectorcall_Call Objects/call.c:255
    #188 0x55e7e3ceddf2 in pymain_run_module Modules/main.c:293
    #189 0x55e7e3cef08b in pymain_run_python Modules/main.c:581
    #190 0x55e7e3cf10c4 in Py_RunMain Modules/main.c:666
    #191 0x55e7e3cf10c4 in pymain_main Modules/main.c:696
    #192 0x55e7e3cf10c4 in Py_BytesMain Modules/main.c:720
    #193 0x7f9d8dd5ab24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #194 0x55e7e3ced49d in _start (/buildbot/buildarea/3.x.pablogsal-arch-x86_64.asan/build/python+0x17649d)
0x604000011cbd is located 0 bytes to the right of 45-byte region [0x604000011c90,0x604000011cbd)
allocated by thread T0 here:
    #0 0x7f9d8e124459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55e7e3cf7770 in _PyBytes_FromSize Objects/bytesobject.c:126
    #2 0x55e7e3cf7770 in PyBytes_FromStringAndSize Objects/bytesobject.c:159
    #3 0x55e7e3f686f1 in r_object Python/marshal.c:1066
    #4 0x55e7e3f6960d in r_object Python/marshal.c:1375
    #5 0x55e7e3f67ee0 in r_object Python/marshal.c:1171
    #6 0x55e7e3f694c3 in r_object Python/marshal.c:1348
    #7 0x55e7e3f6efca in PyMarshal_ReadObjectFromString Python/marshal.c:1562
    #8 0x55e7e3f48b1d in PyImport_ImportFrozenModuleObject Python/import.c:1140
    #9 0x55e7e3f490cc in PyImport_ImportFrozenModule Python/import.c:1194
    #10 0x55e7e3f7e1e6 in init_importlib Python/pylifecycle.c:141
    #11 0x55e7e3f7e1e6 in pycore_interp_init Python/pylifecycle.c:811
    #12 0x55e7e3f84536 in pyinit_config Python/pylifecycle.c:840
    #13 0x55e7e3f84536 in pyinit_core Python/pylifecycle.c:1003
    #14 0x55e7e3f855fc in Py_InitializeFromConfig Python/pylifecycle.c:1188
    #15 0x55e7e3ced749 in pymain_init Modules/main.c:66
    #16 0x55e7e3cf107a in pymain_main Modules/main.c:687
    #17 0x55e7e3cf107a in Py_BytesMain Modules/main.c:720
    #18 0x7f9d8dd5ab24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
SUMMARY: AddressSanitizer: heap-buffer-overflow Python/ceval.c:4798 in skip_to_next_entry
Shadow bytes around the buggy address:
  0x0c087fffa340: fa fa 00 00 00 00 00 03 fa fa 00 00 00 00 00 00
  0x0c087fffa350: fa fa 00 00 00 00 03 fa fa fa 00 00 00 00 00 03
  0x0c087fffa360: fa fa 00 00 00 00 03 fa fa fa 00 00 00 00 00 05
  0x0c087fffa370: fa fa 00 00 00 00 03 fa fa fa fd fd fd fd fd fd
  0x0c087fffa380: fa fa 00 00 00 00 00 03 fa fa 00 00 00 00 00 05
=>0x0c087fffa390: fa fa 00 00 00 00 00[05]fa fa 00 00 00 00 00 01
  0x0c087fffa3a0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c087fffa3b0: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 07 fa
  0x0c087fffa3c0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 07 fa
  0x0c087fffa3d0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
  0x0c087fffa3e0: fa fa 00 00 00 00 00 00 fa fa 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==28597==ABORTING
make: *** [Makefile:1249: buildbottest] Error 1
msg393322 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-09 15:04
To reproduce with a modern gcc:

% export ASAN_OPTIONS=detect_leaks=0:allocator_may_return_null=1:handle_segv=0
% ./configure --with-address-sanitizer --without-pymalloc
% make -j -s
% ./python -m test test_statistics
=================================================================
==51490==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6040000113fd at pc 0x564ec89e0edc bp 0x7ffcffffba70 sp 0x7ffcffffba60
READ of size 1 at 0x6040000113fd thread T0
    #0 0x564ec89e0edb in skip_to_next_entry Python/ceval.c:4798
    #1 0x564ec89e0edb in get_exception_handler Python/ceval.c:4866
    #2 0x564ec89e0edb in _PyEval_EvalFrameDefault Python/ceval.c:4465
    #3 0x564ec8be30d7 in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #4 0x564ec8be30d7 in _PyEval_Vector Python/ceval.c:5160
    #5 0x564ec8a2b22e in _PyObject_VectorcallTstate Include/cpython/abstract.h:114
    #6 0x564ec8a2b22e in object_vacall Objects/call.c:734
    #7 0x564ec8a30c50 in _PyObject_CallMethodIdObjArgs Objects/call.c:825
    #8 0x564ec8c5bdd7 in import_find_and_load Python/import.c:1499
    #9 0x564ec8c5bdd7 in PyImport_ImportModuleLevelObject Python/import.c:1600
    #10 0x564ec89ea39b in import_name Python/ceval.c:6101
    #11 0x564ec89ea39b in _PyEval_EvalFrameDefault Python/ceval.c:3693
    #12 0x564ec8be29ea in _PyEval_EvalFrame Include/internal/pycore_ceval.h:46
    #13 0x564ec8be29ea in _PyEval_Vector Python/ceval.c:5160
    #14 0x564ec8be29ea in PyEval_EvalCode Python/ceval.c:1136
....
msg393324 - (view) Author: Ammar Askar (ammar2) * (Python triager) Date: 2021-05-09 16:37
Seconded, also seeing the same ASAN failure on the fuzzers with a blame for this commit.
msg393342 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2021-05-09 23:26
I tried some debugging code:

diff --git a/Python/ceval.c b/Python/ceval.c
index f745067069..a8668dbac2 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -4864,6 +4864,18 @@ get_exception_handler(PyCodeObject *code, int index)
             return res;
         }
         scan = skip_to_next_entry(scan);
+        if (scan
+            >= (unsigned char *)PyBytes_AS_STRING(code->co_exceptiontable)
+            + PyBytes_GET_SIZE(code->co_exceptiontable))
+        {
+            printf("co_name: --------------------------\n");
+            _PyObject_Dump(code->co_name);
+            printf("co_filename: ----------------------\n");
+            _PyObject_Dump(code->co_filename);
+            printf("co_exceptiontable: -------------\n");
+            _PyObject_Dump(code->co_exceptiontable);
+            printf("\n\n\n\n\n");
+        }
     }
     res.b_handler = -1;
     return res;


It output this:

Python 3.11.0a0 (heads/main-dirty:092f9ddb5e, May  9 2021, 18:45:56) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from test.test_statistics import *
co_name: --------------------------
object address  : 00000254B63EFB80
object refcount : 7
object type     : 00007FFA1C7E71C0
object type name: str
object repr     : '_find_and_load'
co_filename: ----------------------
object address  : 00000254B63967A0
object refcount : 76
object type     : 00007FFA1C7E71C0
object type name: str
object repr     : '<frozen importlib._bootstrap>'
co_exceptiontable: -------------
object address  : 00000254B63EB290
object refcount : 1
object type     : 00007FFA1C7C9A40
object type name: bytes
object repr     : b'\x84\x10"\x03\xa2\x04&\x0b\xa7\x03&\x0b'


>>> unittest.main()
............................................................................................................................................................................................................................................................................................................................................................................
----------------------------------------------------------------------
Ran 364 tests in 24.409s

OK


Here is the disassembly of the offending function:

>>> from dis import dis
>>> from importlib._bootstrap import _find_and_load
>>> dis(_find_and_load)
1024           0 LOAD_GLOBAL              0 (_ModuleLockManager)
               2 LOAD_FAST                0 (name)
               4 CALL_FUNCTION            1
               6 BEFORE_WITH
               8 POP_TOP

1025          10 LOAD_GLOBAL              1 (sys)
              12 LOAD_ATTR                2 (modules)
              14 LOAD_METHOD              3 (get)
              16 LOAD_FAST                0 (name)
              18 LOAD_GLOBAL              4 (_NEEDS_LOADING)
              20 CALL_METHOD              2
              22 STORE_FAST               2 (module)

1026          24 LOAD_FAST                2 (module)
              26 LOAD_GLOBAL              4 (_NEEDS_LOADING)
              28 IS_OP                    0
              30 POP_JUMP_IF_FALSE       27 (to 54)

1027          32 LOAD_GLOBAL              5 (_find_and_load_unlocked)
              34 LOAD_FAST                0 (name)
              36 LOAD_FAST                1 (import_)
              38 CALL_FUNCTION            2

1024          40 ROT_TWO
              42 LOAD_CONST               1 (None)
              44 DUP_TOP
              46 DUP_TOP
              48 CALL_FUNCTION            3
              50 POP_TOP

1027          52 RETURN_VALUE

1026     >>   54 NOP

1024          56 LOAD_CONST               1 (None)
              58 DUP_TOP
              60 DUP_TOP
              62 CALL_FUNCTION            3
              64 POP_TOP
              66 JUMP_FORWARD            11 (to 90)
         >>   68 PUSH_EXC_INFO
              70 WITH_EXCEPT_START
              72 POP_JUMP_IF_TRUE        39 (to 78)
              74 RERAISE                  4
         >>   76 POP_EXCEPT_AND_RERAISE
         >>   78 POP_TOP
              80 POP_TOP
              82 POP_TOP
              84 POP_EXCEPT
              86 POP_TOP
              88 POP_TOP

1029     >>   90 LOAD_FAST                2 (module)
              92 LOAD_CONST               1 (None)
              94 IS_OP                    0
              96 POP_JUMP_IF_FALSE       60 (to 120)

1030          98 LOAD_CONST               2 ('import of {} halted; None in sys.modules')

1031         100 LOAD_METHOD              6 (format)
             102 LOAD_FAST                0 (name)
             104 CALL_METHOD              1

1030         106 STORE_FAST               3 (message)

1032         108 LOAD_GLOBAL              7 (ModuleNotFoundError)
             110 LOAD_FAST                3 (message)
             112 LOAD_FAST                0 (name)
             114 LOAD_CONST               3 (('name',))
             116 CALL_FUNCTION_KW         2
             118 RAISE_VARARGS            1

1034     >>  120 LOAD_GLOBAL              8 (_lock_unlock_module)
             122 LOAD_FAST                0 (name)
             124 CALL_FUNCTION            1
             126 POP_TOP

1035         128 LOAD_FAST                2 (module)
             130 RETURN_VALUE
ExceptionTable:
  8 to 38 -> 68 [1] lasti
  68 to 74 -> 76 [5] lasti
  78 to 82 -> 76 [5] lasti


I don't know whether there just needs to be a sentinel 128 appended to all co_exceptiontable, or if there is a more subtle bug.
msg393372 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-10 09:24
Thanks everyone for the triaging and fixing.
msg393433 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-10 21:26
It seems that we have broken the stable ABI of PyCode_NewWithPosOnlyArgs as it has now another parameter. We need to either create a new private constructor or a new public constructor, but the ABI cannot change.
msg393448 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-11 09:13
I know PyCode_NewWithPosOnlyArgs is declared as "PyAPI_FUNC" but that can't make it part of the ABI unless it has stable behavior.
It can't have stable behavior because its inputs are complex, undefined, have altered semantics and are interlinked in complex ways.

Passing the same arguments to PyCode_NewWithPosOnlyArgs for both 3.9 and 3.10 will cause one or other version to crash (interpreter crash, not just program crash).


We need to stop adding "PyAPI_FUNC" to everything.
Adding a PyAPI_FUNC does not magically make for ABI compatibility, there is a lot more to it than that.

The only sane ways to construct a code object are to load it from disk, to compile an AST, or to use
codeobject.replace(). Any purported ABI compatibility claims are just misleading and a trap.


I can revert the API changes and add a new function, but I think that is dangerously misleading. A compilation error is preferable to an interpreter crash.
msg393455 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-11 12:53
I agree with you but we already went through this when I added positional-only arguments and everyone complained that Cython and other projects were broken and we changed a stable API function so I am just mentioning that we are here again.

Honestly, I think what you describe makes sense and this constructor should never be stable (as the Python one is not stable). If we add column offsets that's another parameter more than we would need to add. But in any case that's just my opinion and we should reach some conclusion collectively, that's why I mentioned this here.
msg393456 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-11 13:05
In any case, if we decide to let it stay, at the very least this behaviour (that these functions are not stable) needs to be documented everywhere: what's new, C-API docs...etc And probably we need to somehow add it to the future deprecations of 3.10 for visibility.

As I mentioned, I simphatise with your argument and I think it makes sense, but we cannot just do it in a BPO issue, I'm afraid.
msg393458 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-05-11 14:02
PyCode_NewWithPosOnlyArgs is not part of the stable ABI. It is OK to break its ABI in a minor version (i.e. 3.11).

The PyAPI_FUNC makes it part of the public *API*. It needs to be source- compatible; the number of arguments can't change. Could yo u add a new function?

I wouldn't remove PyCode_NewWithPosOnlyArgs from the public C API, which can be CPython-specific and used by projects like Cython that need some low-level access for performance. But PEP 387 applies, so if it is deprecated in 3.11, it can be removed in 3.13.
msg393460 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-11 14:12
> The PyAPI_FUNC makes it part of the public *API*. It needs to be source- compatible; the number of arguments can't change. Could yo u add a new function?

Unfortunately, no: new functions cannot be added easily because the new field that receives is needed and is a complicated field created by the compiler. The old API is not enough anymore and making a compatibility layer is a huge complexity.
msg393461 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-05-11 14:21
Then, according to PEP 387, "The steering council may grant exceptions to this policy."

I think API breaks like this do need coordination at the project level.
msg393462 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-11 14:30
> I think API breaks like this do need coordination at the project level.

Absolutely, that's why I said before:

> As I mentioned, I simphatise with your argument and I think it makes 
 sense, but we cannot just do it in a BPO issue, I'm afraid.
msg393463 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-05-11 14:37
It is very little effort to add back the old function, so that isn't the problem. It won't work properly, but it never did anyway. So I guess that's sort of compatible.

Maybe the best thing is to put a big red warning in the docs and hope that warns away people from using it?
msg393464 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-05-11 14:52
> It is very little effort to add back the old function, so that isn't the problem. It won't work properly, but it never did anyway. So I guess that's sort of compatible.

It won't work properly is an incompatible change. Before, if you extract all fields from a code object and pass it down to the constructor, everything will work.

> Maybe the best thing is to put a big red warning in the docs and hope that warns away people from using it?

I think code object constructors must be part of the private CAPI due to what we are experiencing. But again, this is something we cannot decide on this bpo issue. Either a python-dev thread needs to be open or a Steering Council request in the https://github.com/python/steering-council repo needs to be opened.
msg393470 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2021-05-11 20:17
Just a comment regarding the change to "PyCode_NewWithPosOnlyArgs()". As Pablo mentioned, this has happened before. And that's OK! Exactly because this has happened before, it's clearly not a part of the API that is meant to be stable.

I can easily adapt Cython to make this work in the next patch-level release of CPython 3.11 (or the current one, since alpha-1 seems not so close), but any adaptation will be patch-level dependent. Meaning, for each such change, there will be a couple of weeks or months until the C preprocessor makes the code compile again. And during that time, people won't be able to test their code to report issues.

So, I'd rather have compatibility broken and stay that way, than going one way now and changing it back later, thus going through the same adaptation period twice.

That being said, any such change means that maintainers will have to rebuild their packages with a new Cython release to adapt them to Py3.11. Many will, but some won't, for whatever reason.
History
Date User Action Args
2021-05-12 10:25:15Mark.Shannonsetpull_requests: + pull_request24699
2021-05-11 20:17:57scodersetnosy: + scoder
messages: + msg393470
2021-05-11 14:52:29pablogsalsetmessages: + msg393464
2021-05-11 14:37:24Mark.Shannonsetmessages: + msg393463
2021-05-11 14:30:20pablogsalsetmessages: + msg393462
2021-05-11 14:21:58petr.viktorinsetmessages: + msg393461
2021-05-11 14:12:18pablogsalsetmessages: + msg393460
2021-05-11 14:02:43petr.viktorinsetmessages: + msg393458
2021-05-11 13:05:37pablogsalsetmessages: + msg393456
2021-05-11 12:53:59pablogsalsetmessages: + msg393455
2021-05-11 09:13:58Mark.Shannonsetmessages: + msg393448
2021-05-10 21:27:21pablogsalsetnosy: + petr.viktorin
2021-05-10 21:26:59pablogsalsetmessages: + msg393433
2021-05-10 16:46:11Mark.Shannonsetpull_requests: + pull_request24671
2021-05-10 09:24:38Mark.Shannonsetmessages: + msg393372
2021-05-10 06:39:16Dennis Sweeneysetpull_requests: + pull_request24662
2021-05-09 23:26:26Dennis Sweeneysetnosy: + Dennis Sweeney
messages: + msg393342
2021-05-09 16:37:50ammar2setnosy: + ammar2
messages: + msg393324
2021-05-09 15:04:26pablogsalsetmessages: + msg393322
2021-05-09 15:02:20pablogsalsetpriority: normal -> release blocker
2021-05-09 14:54:47pablogsalsetnosy: + pablogsal
messages: + msg393321
2021-04-30 13:27:37Mark.Shannonsetkeywords: + patch
stage: patch review
pull_requests: + pull_request24440
2021-04-29 13:53:50Mark.Shannonsetmessages: + msg392309
2021-04-29 13:37:18Mark.Shannonsetmessages: + msg392306
2021-04-29 13:31:19Mark.Shannonsetmessages: + msg392305
2021-04-22 07:23:06erlendaaslandsetnosy: + erlendaasland
2021-04-21 14:13:57Guido.van.Rossumsetnosy: + Guido.van.Rossum
messages: + msg391522
2021-04-21 12:01:37serhiy.storchakasetmessages: + msg391513
2021-04-21 10:23:21Mark.Shannonsetmessages: + msg391509
2021-04-15 16:46:41gvanrossumsetnosy: + gvanrossum
2020-06-03 08:39:20serhiy.storchakasetmessages: + msg370645
2020-06-03 06:07:08chris.jerdoneksetnosy: + chris.jerdonek
messages: + msg370642
2020-06-02 15:28:04corona10setnosy: + corona10
messages: + msg370612
2020-06-02 15:02:50Yonatan Goldschmidtsetnosy: + Yonatan Goldschmidt
2020-04-22 05:38:29rhettingersetnosy: + rhettinger
messages: + msg366976
2020-04-22 04:59:47hauntsaninjasetnosy: + hauntsaninja
2020-04-08 10:23:55serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg365975
2020-04-08 09:41:03Mark.Shannoncreate