Issue 17611: Move unwinding of stack for "pseudo exceptions" from interpreter to compiler.

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/61811

classification

Title:	Move unwinding of stack for "pseudo exceptions" from interpreter to compiler.
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.8

process

Status:	closed	Resolution:	fixed
Dependencies:	24340 32416	Superseder:
Assigned To:		Nosy List:	Demur Rumed, Mark.Shannon, benjamin.peterson, christian.heimes, mark.dickinson, nascheme, ncoghlan, pitrou, rhettinger, serhiy.storchaka, trent
Priority:	normal	Keywords:	patch

Created on 2013-04-01 16:31 by Mark.Shannon, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
b16527f84774.diff	Mark.Shannon, 2013-04-01 16:31		review
unwinding.patch	pitrou, 2014-04-29 18:21		review
perf_unwind_stack.txt	nascheme, 2017-11-06 22:41

Pull Requests
URL	Status	Linked	Edit
PR 2827	closed	pitrou, 2017-07-23 19:20
PR 4682	closed	nascheme, 2017-12-02 22:16
PR 5006	merged	serhiy.storchaka, 2017-12-24 19:24
PR 5071	closed	Mark.Shannon, 2018-01-01 14:09
PR 5143	closed	serhiy.storchaka, 2018-01-09 21:46

Repositories containing patches
https://bitbucket.org/markshannon/cpython_simpler_bytecode

Messages (81)
msg185741 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2013-04-01 16:31
The handling of "pseudo exceptions" (return, break and continue) are currently handled in the interpreter. This make the interpreter loop more complex and slower than it needs to be. This change moves the handling of pseudo exceptions into the compiler. The net effects of this patch are: Simpler interpreter loop: no 'psuedo-exceptions', fewer bytecodes and some simplifed bytecodes. Eliminate the 'why_code' state variable in the interpreter. Execution is always in the 'normal' state except during explicit exception handling. Small increase in size and complexity of compiler. Speedup of 1.5% (Intel i7); this should be considered a happy side-effect rather than a motivation for the change.
msg185909 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2013-04-03 10:20
Simplifying the eval loop is a rather good thing. Could you post examples of bytecode disassembly before and after the patch?
msg185916 - (view)	Author: Stefan Krah (skrah) *	Date: 2013-04-03 11:23
The change seems to slow down the pi float benchmark. Run this and hit Ctrl-C after the first decimal result appears: ./python Modules/_decimal/tests/bench.py I've run the benchmark seven times and the average for float is something like 8-10% slower (64-bit Linux, Core 2 Duo). Decimal is not affected.
msg185946 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2013-04-03 19:39
Stefan, I tried running your pi_bench (increasing the numer of iterations by x10 to reduce jitter). The run to run variation exceed the difference between the two versions. Here are the fastest of 7 runs on my machine. i3-2370M CPU @ 2.40GHz × 4. linux 64bit. With patch: 1.154152s Without patch: 1.157730s I'm somewhat surprised by the slowdown on your machine. There should be no difference in the bytecodes executed or their implementation in the pi_float loop.
msg185950 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2013-04-03 20:25
Antoine, Two bytecode examples. For the following function: >>> def f(x): ... try: ... return x ... except: ... pass Without the patch: 2 0 SETUP_EXCEPT 8 (to 11) 3 3 LOAD_FAST 0 (x) 6 RETURN_VALUE 7 POP_BLOCK 8 JUMP_FORWARD 8 (to 19) 4 >> 11 POP_TOP 12 POP_TOP 13 POP_TOP 5 14 POP_EXCEPT 15 JUMP_FORWARD 1 (to 19) 18 END_FINALLY >> 19 LOAD_CONST 0 (None) 22 RETURN_VALUE With the patch: 2 0 SETUP_EXCEPT 9 (to 12) 3 3 LOAD_FAST 0 (x) 6 POP_BLOCK 7 RETURN_VALUE 8 POP_BLOCK 9 JUMP_FORWARD 8 (to 20) 4 >> 12 POP_TOP 13 POP_TOP 14 POP_TOP 5 15 POP_EXCEPT 16 JUMP_FORWARD 1 (to 20) 19 RERAISE >> 20 LOAD_CONST 0 (None) 23 RETURN_VALUE The difference is the 'POP_BLOCK' inserted at offset 7 before the 'RETURN_VALUE', so that the RETURN_VALUE bytecode does not have to do any unwinding of the block stack. For loops the SETUP_LOOP and the matching POP_BLOCK are eliminated: >>> def f(x): ... for y in x: ... g(x) Without the patch: 2 0 SETUP_LOOP 24 (to 27) 3 LOAD_FAST 0 (x) 6 GET_ITER >> 7 FOR_ITER 16 (to 26) 10 STORE_FAST 1 (y) 3 13 LOAD_GLOBAL 0 (g) 16 LOAD_FAST 1 (y) 19 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 22 POP_TOP 23 JUMP_ABSOLUTE 7 >> 26 POP_BLOCK >> 27 LOAD_CONST 0 (None) 30 RETURN_VALUE With the patch: 2 0 LOAD_FAST 0 (x) 3 GET_ITER >> 4 FOR_ITER 16 (to 23) 7 STORE_FAST 1 (y) 3 10 LOAD_GLOBAL 0 (g) 13 LOAD_FAST 0 (x) 16 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 19 POP_TOP 20 END_ITER 4 >> 23 LOAD_CONST 0 (None) 26 RETURN_VALUE END_ITER is a synonym for JUMP_ABSOLUTE. It is needed so that frame.set_lineno() can identify loop blocks.
msg185951 - (view)	Author: Stefan Krah (skrah) *	Date: 2013-04-03 20:32
I'm surprised, too, but after a couple of retries the results stay the same. On an i7 there's also a difference, but not quite as large. I'm using b16527f84774.diff, which applies cleanly apart from importlib.h (which I just regenerated). With an increased loop count the difference is also the same (10% on the Core2 Duo). Core2 Duo, 3.16GHz, Linux 64-bit ================================ patched: -------- (0.10855+0.10901+0.1168+0.1119+0.1095+0.1090+0.1084+0.1130+0.1235+0.1183) / 10 = 0.113 unpatched: ---------- (0.0995+0.1049+0.1009+0.1021+0.0986+0.1005+0.0988+0.10171+0.0987+0.0992) / 10 = 0.10 (i7, 2.80GHz, Linux 64-bit) ============================ patched: -------- (0.1021+0.1078+0.1072+0.1058+0.1091+0.1067+0.1064+0.1071+0.1069+0.1067) / 10 = 0.107 unpatched: ---------- (0.1029+0.1033+0.1023+0.1029+0.1050+0.1020+0.1031+0.1024+0.10149+0.10247) / 10 = 0.103
msg185952 - (view)	Author: Stefan Krah (skrah) *	Date: 2013-04-03 20:53
BTW, there's no general slowdown on the Core2 Duo: In the patched version, the telco benchmark is consistently 4% faster.
msg185954 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2013-04-03 20:57
> BTW, there's no general slowdown on the Core2 Duo: In the patched version, > the telco benchmark is consistently 4% faster. It's customary for even innocent changes in ceval to produce small changes in some benchmarks. It's only really important to consider the global average.
msg185955 - (view)	Author: Stefan Krah (skrah) *	Date: 2013-04-03 21:28
Antoine Pitrou <report@bugs.python.org> wrote: > It's customary for even innocent changes in ceval to produce small > changes in some benchmarks. It's only really important to consider the > global average. Yes, the float benchmark appears to be particularly sensitive. In the 3.3.0 release it also runs 8% slower than it does now (unpatched). I'm trying to keep track of changes because in 2.7 the (float) benchmark runs something like 25% faster than in 3.3.0.
msg217534 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-04-29 18:21
Here is an updated patch keeping up with recent changes in the source tree.
msg228816 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-08 20:53
> END_ITER is a synonym for JUMP_ABSOLUTE. It is needed so that frame.set_lineno() can identify loop blocks. I was thinking about this. Isn't it enough to consider all backwards edges as ending a loop block?
msg228817 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2014-10-08 20:53
Ah, no, I guess "continue" would also create a backwards edge. Nevermind, sorry.
msg231250 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2014-11-16 15:06
I like the idea. I have not make faultfinding review yet, but one thing is questionable to me. Is it necessary to get rid of the END_FINALLY opcode? Currently Python code try: stmt1 finally: stmt2 is compiled to: SETUP_FINALLY L1 // stmt1 POP_BLOCK LOAD_CONST None L1: // stmt2 END_FINALLY With the patch it is compiled to: SETUP_FINALLY L1 // stmt1 POP_BLOCK // stmt2 JUMP_ABSOLUTE L2 L1: // stmt2 RERAISE L2: Finally body is duplicated. In case of large finally body this increases the size of the bytecode. Line numbers are duplicated in disassembly listing, this is confused. And I afraid that due to increasing the size of the bytecode, some relative jumps can became too long. If it is absolutely necessary to remove the END_FINALLY opcode, I suggest to generate following bytecode: SETUP_FINALLY L1 // stmt1 POP_BLOCK LOAD_CONST None L1: // stmt2 POP_JUMP_IF_FALSE L2 RERAISE L2: But it looks to me that special END_FINALLY opcode which combines POP_JUMP_IF_FALSE/RERAISE would be better for performance and compiled code size.
msg258673 - (view)	Author: STINNER Victor (vstinner) *	Date: 2016-01-20 11:37
FYI The issue #26107 changed code.co_lnotab to support negative line number deltas.
msg267329 - (view)	Author: Philip Dubé (Demur Rumed) *	Date: 2016-06-04 22:05
Java duplicates finally bytecode too: http://cliffhacks.blogspot.ca/2008/02/java-6-tryfinally-compilation-without.html Tho they avoid jsr instruction because it causes non trivial bytecode validation. Still, they would've had to concluded that finally blocks are often small enough to be alright to inline
msg267418 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-06-05 15:03
The patch no longer applied cleanly. Mark, can you please update your patch?
msg268456 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2016-06-13 16:46
This looks to be a good idea and a good time to merge it now the bytecode has changed to 16-bit. The increase in complexity to compile.c is not bad and reducing the complexity of the eval loop is worth it, IMHO.
msg298910 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-23 19:22
I updated the patch for git master. Some cleanup is in order.
msg298934 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-24 08:09
So, the approach of duplicating finally blocks tends to lead to a non-trivial bytecode increase. There is even a potential combinatorial explosion with nested "try..finally" block: def f(): try: ... finally: try: ... finally: # etc. Such a chain of N nested "finally"s will emit O(2**N) opcodes. I'm not sure it's a pratical concern but I find it is detrimental to the readibility of bytecode (which is a nice thing to have when doing low-level debugging or tweaking). I think we can massage the PR to remove the "finally" block duplication while keeping the predictable stack effect.
msg298935 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-24 08:25
For example we could generate the following bytecode: SETUP_FINALLY L1 // ... stmt1 POP_BLOCK JUMP_FINALLY L1 L1: // ... stmt2 RERAISE JUMP_FORWARD L2 L2: JUMP_FINALLY would push 3 Nones on the stack. RERAISE would raise only if non-None values are popped.
msg298943 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-24 09:21
Ok, I've pushed an update with removes the finally block duplication.
msg298949 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-07-24 09:59
Great! I tried to update the patch myself, but there were too much conflicts. Thank you very much Antoine for taking this! Seems there are bugs in frame_setlineno() related to code duplications. And deduplicating the 'finally' blocks doesn't solve all of them. See comments on GitHub. I don't like the new END_ITER instruction. This complicates (and slows down) the evaluation loop and the peepholer. This also adds a limitation on the peepholer (END_ITER can't be optimized out). We can use other way for determaining the end of the 'for' block. The argument of the FOR_ITER instruction points to the end of the 'for' block. Just scan all addresses from 0 to max_addr and count all FOR_ITER instructions and their targets in the range between min_addr and max_addr.
msg298960 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-24 11:47
> I don't like the new END_ITER instruction. This complicates (and slows down) the evaluation loop and the peepholer. I don't think it slows down anything in the eval loop. I agree it adds a bit of complexity. > This also adds a limitation on the peepholer (END_ITER can't be optimized out). It's an unconditional backedge, how could you optimize it?
msg298972 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-07-24 14:05
Ok, I've now gotten rid of END_ITER.
msg305684 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-11-06 22:41
The attached pyperformance report compares cpython:master to nascheme:unwind_stack. If I've done the tests correctly, the unwind_stack version is slightly faster.
msg307477 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2017-12-03 00:39
I like this change, and think we should go ahead with it, but just wanted to note that I suspect it may make the "Badly timed signals may lead to __exit__ being skipped even after __enter__ succeeded" problem slightly easier to hit: https://bugs.python.org/issue29988 That's not a new problem though, and this change should make it easier to apply the conventional solutions (i.e. only checking for signals when execution jumps backwards within a function, as well as in function pre-or-post-ambles)
msg307478 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2017-12-03 00:49
At this point in the release cycle, it would be nice to get this PR applied soon so that there is more time see if there are any minor follow-on issues.
msg307479 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2017-12-03 00:50
While there are some good comments in line in the PR, I think it would be helpful if these changes were accompanied by some additions to https://devguide.python.org/compiler/ that explain how the various pieces of the solution work together to manage the execution stack.
msg307497 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-03 09:53
I would point out that Serhiy posted some comments in the followup PR which aren't addressed by Neil's followup PR. They relate to tracing.
msg307548 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-04 11:46
I plan to resurrect my original design over the Christmas break. The reason that I want to get back to the original design is its consistency and relative simplicity. Duplicating the finally block for every exit from the try body might sound expensive, but it only increases the size of the bytecode by an estimated 0.4%. Using statement counts as a proxy for bytecodes, I ran the code query https://lgtm.com/query/1505907426052/ over a range of open stack and other large projects. The size increase is in the range 0.26% to 0.44% (Note that the statement counts include dependencies) I plan to start from https://github.com/python/cpython/pull/2827/commits/693b9398b5fd202fa5864f9cc76fa1bc7f84f62e as that adheres to the original design, but cleans up the code. Antoine that is on your branch, are you OK with me appropriating it?
msg307549 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-04 11:50
Le 04/12/2017 à 12:46, Mark Shannon a écrit : > > I plan to start from > https://github.com/python/cpython/pull/2827/commits/693b9398b5fd202fa5864f9cc76fa1bc7f84f62e > as that adheres to the original design, but cleans up the code. > Antoine that is on your branch, are you OK with me appropriating it? Definitely.
msg307555 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-04 12:25
I started on Antoine's PR and work on different approach (https://github.com/serhiy-storchaka/cpython/pull/1) which don't duplicate the code for continue/break/return. Instead it uses some kind of subroutines. END_FINALLY expects the one of three cases: 1. NULL (or None). Normal execution thread in try/finally. Continue from the instruction following END_FINALLY. 2. An integer. This is an address of returning. Continue from the specified address. 3. An exception (6 items). Raises the specified exception. WITH_CLEANUP_FINISH behaves similarly. The statements continue/break/return insert the instruction CALL_FINALLY which pushes the address of the following instruction on the stack and jumps to the start of the finally (or with cleanup) block. There can be several CALL_FINALLY instructions if you need to execute several finally blocks. At the jump instruction is inserted for continue and break, and RETURN_VALUE for return. Currently I'm trying to simplify the code.
msg307558 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-04 12:33
Le 04/12/2017 à 13:25, Serhiy Storchaka a écrit : > > I started on Antoine's PR and work on different approach (https://github.com/serhiy-storchaka/cpython/pull/1) which don't duplicate the code for continue/break/return. Instead it uses some kind of subroutines. END_FINALLY expects the one of three cases: > > 1. NULL (or None). Normal execution thread in try/finally. Continue from the instruction following END_FINALLY. > > 2. An integer. This is an address of returning. Continue from the specified address. > > 3. An exception (6 items). Raises the specified exception. The problem is that makes the stack consumption of END_FINALLY variable. How about always consuming 6 items, even when most of them are unused padding?
msg307564 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-04 13:16
> The problem is that makes the stack consumption of END_FINALLY variable. > How about always consuming 6 items, even when most of them are unused > padding? Right. Though the value of the stack increment makes sense only for the case 1, because in other cases the next executed instruction is not the one following END_FINALLY, but we need to reserve the stack for the case 3, and counting additional "virtual" items in PUSH_NO_EXCEPT and END_FINALLY will help to calculate the maximal stack consumption, even when in runtime they push/pop only one item. This is yet one argument for special purposed opcode PUSH_NO_EXCEPT (maybe I'll rename it to START_FINALLY or BEGIN_FINALLY). Other reason -- this will help to determine bounds of finally blocks in the frame.f_lineno setter.
msg307570 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-04 15:21
Serhiy, I assume that you plan to use something like the JVM's JSR/RET instruction pair. https://docs.oracle.com/javase/specs/jvms/se9/html/jvms-6.html Is that right? My reasons for preferring the finally-block duplication approach is that it keeps the interpreter simpler and minimises the amount work done when no exception is raised. As demonstrated, the increase in the static size of the bytecode is negligible.
msg307572 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-04 15:49
Le 04/12/2017 à 16:21, Mark Shannon a écrit : > > My reasons for preferring the finally-block duplication approach is that it keeps the interpreter simpler and minimises the amount work done when no exception is raised. As demonstrated, the increase in the static size of the bytecode is negligible. Whichever approach wins at the end, I would like it to reuse the tests I wrote to check that no stack size explosion happens when combining or juxtaposing several control flow constructs.
msg307577 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-04 16:56
Right, this is similar to how the JSR/RET pair was used in for calling the finally block. It seems the main drawback of duplicating of the finally code is not increasing the code size (which can be exponential in degenerate cases), but the problem with determining the boundaries of the finally block. The main purpose of this issue (getting rid of "pseudo exceptions" in the interpreter) will be achieved in any case. This is the first step. In the second step I'm going to get rid of dynamic PyTryBlock and instructions like SETUP_EXCEPT and POP_BLOCK. I think all necessary information can be determined at compile time and saved in an auxiliary structure, similarly as line numbers are saved in the packed co_lnotab array instead of be set by SET_LINENO instructions. Boundaries of finally block could be stored in the same structure. This will make try/except/ finally virtually zero-overhead. In the third step we could generate duplicates of finally blocks. This will be easier if other changes already applied and tested. Currently there is other problem with Antoine's PR (I didn't know about them when made a review). There is a gap between calling the __enter__ method and the SETUP_FINALLY instruction. If the exception is raised in the gap, the __exit__ method will be never called. For example: a = [] with CM() as a[0]: # IndexError ... I still haven't fixed this problem. Maybe the simplest solution would be to return the SETUP_WITH instruction which calls __enter__ and creates the finally block prior to saving the result of __enter__.
msg307582 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-04 17:19
On 04/12/17 16:56, Serhiy Storchaka wrote: > > Serhiy Storchaka <storchaka+cpython@gmail.com> added the comment: > > Right, this is similar to how the JSR/RET pair was used in for calling the > finally block. > > It seems the main drawback of duplicating of the finally code is not increasing > the code size (which can be exponential in degenerate cases), but the problem > with determining the boundaries of the finally block. What exactly is the problem? Why do you need to determine the boundaries of finally blocks? ... > > This is the first step. In the second step I'm going to get rid of dynamic > PyTryBlock and instructions like SETUP_EXCEPT and POP_BLOCK. I think all > necessary information can be determined at compile time and saved in an > auxiliary structure, similarly as line numbers are saved in the packed > co_lnotab array instead of be set by SET_LINENO instructions. Boundaries of > finally block could be stored in the same structure. This will make try/except/ > finally virtually zero-overhead. Fine, but it isn't necessary for this issue. The overhead of try/except is already quite low. > > Currently there is other problem with Antoine's PR (I didn't know about them > when made a review). There is a gap between calling the __enter__ method and > the SETUP_FINALLY instruction. That wasn't a problem with the original patch.
msg307584 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-04 17:41
Actually looking back at the original patch, the gap between __enter__ and SETUP_FINALLY was in the original patch. My mistake. The fix would to be restore SETUP_WITH. The complexity we really want to get rid of is at the end of the with statement not the start.
msg307585 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-04 17:53
After studying the patch and doing some reading, I prefer the finally-block duplication approach as well. Java does it that way as well and it works for them. It would be be interesting to compile a large body of packages and see what the increase in bytecode size actually is with the duplication. My gut feeling is that it will not be a big deal. There is a bug with the PR regarding the final bodies. Exits from the final body cause the whole fblock stack to unwind, not just the blocks enclosing the final body. Unwind looks at 'c' to get the fblock stack. I think we need to allocate fblockinfo on the C stack and then use a back pointer to enclosing block. When you get into a final body that creates its own fblockinfo (e.g. a try/except inside the finally), the current code doesn't work. The fact that the whole test suite passes with these issues tells me that the amount of stuff happening in final bodies must be pretty simple in most code. I spent a good part of Sunday trying to understand how the PR works. It seems to me that the ceval/bytecode changes are pretty solid. The compiler needs some bug fixes. Further optimisations could be done at a later time. I'm curious to see Serhiy's approach though.
msg307656 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-05 11:11
On 04/12/17 17:53, Neil Schemenauer wrote: > There is a bug with the PR regarding the final bodies. Exits from the final body cause the whole fblock stack to unwind, not just the blocks enclosing the final body. Unwind looks at 'c' to get the fblock stack. I think we need to allocate fblockinfo on the C stack and then use a back pointer to enclosing block. When you get into a final body that creates its own fblockinfo (e.g. a try/except inside the finally), the current code doesn't work. I don't really follow you. Could you provide a code example that will expose this error?
msg307671 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-05 17:21
There is some more explanation in the PR and sample code. We unwind, if we hit a finally fblock, we emit code of the body of it. If inside the block, there is another return statement, we unwind again. That causes an infinite loop in the compiler. The commit 0610860 is a fix, I think. I have a cleaner version but haven't pushed it yet. There are still some remaining bugs though, caused by "return" in the finally body. The problem as I see it is that "unwind_exception" in ceval pushes a EXCEPT_HANDLER fblock if we are inside a SETUP_EXCEPT or SETUP_FINALLY block. If there is a return in the finally body, the compiler doesn't know if it should POP_BLOCK or not. At least, that is my understanding now. I think the best way forward is to split this patch into multiple parts. I have a simple patch that changes fblockinfo to be a singly linked list (like blockinfo). Next, I think we could change the exception opcodes to have a fixed stack effect (don't think that requires unwind to be done by compiler). Rather than pushing three values for an exception, could we just build a tuple and push that? Seems simpler to me vs having ROT_FOUR. Finally, we can tackle the compiler based unwind.
msg307680 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-05 20:16
0610860 doesn't include any tests. What is it fixing? 3794016 passes the same set of tests. Do have an actual code example that produces erroneous behaviour?
msg307687 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-05 21:24
def func(): try: try: raise RuntimeError except: return 1 finally: return 3 func() Sorry, I've been collecting a whole slew of code snippets that cause issues with the "unwind_stack" branch. I haven't organized them and made unit tests out of them though. Two packages I found hairy try/except/finally code are "tox" and "gevent".
msg307827 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-07 23:04
Thanks, that example shows the essence of the problem. I understand you perfectly now.
msg308928 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2017-12-22 08:55
Are one of the variants of this patch ready to apply? I'm hoping that the window for 3.7 isn't missed.
msg308939 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-22 18:24
Hello Raymond, I don't have anything ready to go. PR 4682 has some major outstanding issues. Specifically, exits from finally bodies did not cleanup the fblock stack correctly. I still think this patch is a good idea and that it can work. My goal is to have something that performs better but, more importantly is more understandable from a code maintenance point of view. I expect it will not be ready for 3.7 however.
msg308969 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-23 21:21
Ho ho ho! I spent some time to try to resolve the remaining issues with PR 4682. I think everything now works. Returns from a final body of a try/finally was the remaining issue. I've solved it by introducing the POP_NO_EXCEPT opcode. I added a number of comments to help explain to future maintainers what is going on. The PR still could use some polishing. I have some test cases to add, to trigger the previously buggy try/finally + return case. Also, I think the fblock_unwind_<block type> functions in the compiler could be done more simply. I don't think we need a separate function for each type, just use a case statement on the fblock type. That's a fairly minor detail though. I would also like to add a bunch more comments. I've spent many hours figuring out how all this stuff works, not that I totally understand everthing. The test case test_contextlib test_exit_exception_with_existing_context() was especially brain-busting. I'm a bit horrified that Python has become so complex to support such things. Anyhow, I'm off for holidays so no time for further polish until after Christmas.
msg309015 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-24 19:41
The main problem was with the frame.f_lineno setter. I have finally solved it in PR 5006. It solves also the regression in the with statement mentioned in msg307577 (by just restoring the original SETUP_WITH), and some other minor issues which I forgot. Tests for the frame.f_lineno setter are provided in issue32416. I want to backport these changes and add new tests in PR 5006 (which allows to catch more illegal jumps than the current code), thus a separate issue. PR 2827 and PR 4682 crash with these tests. Additionally PR 5006 may fix several issues with stack size calculation. I'll add tests later if there are reproducers.
msg309138 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-28 19:39
Results of microbenchmarks: $ ./python -m perf timeit -s 'a = list(range(1000))' -- 'for i in a: pass' Mean +- std dev: 6.31 us +- 0.09 us $ ./python -m perf timeit -s 'a = list(range(1000))' -- ' for i in a: try: pass finally: pass ' Unpatched: Mean +- std dev: 16.3 us +- 0.2 us PR 2827: Mean +- std dev: 16.2 us +- 0.2 us PR 4682: Mean +- std dev: 16.2 us +- 0.2 us PR 5006: Mean +- std dev: 14.5 us +- 0.4 us $ ./python -m perf timeit -s 'a = list(range(1000))' -- ' for i in a: try: continue finally: pass ' Unpatched: Mean +- std dev: 24.0 us +- 0.5 us PR 2827: Mean +- std dev: 11.9 us +- 0.1 us PR 4682: Mean +- std dev: 12.0 us +- 0.1 us PR 5006: Mean +- std dev: 19.0 us +- 0.3 us $ ./python -m perf timeit -s 'a = list(range(1000))' -- ' for i in a: while True: try: break finally: pass ' Unpatched: Mean +- std dev: 25.9 us +- 0.5 us PR 2827: Mean +- std dev: 11.9 us +- 0.1 us PR 4682: Mean +- std dev: 12.0 us +- 0.1 us PR 5006: Mean +- std dev: 18.9 us +- 0.1 us PR 2827 and PR 4682 have the same performance. The overhead of the finally block is smaller in PR 5006, perhaps because BEGIN_FINALLY pushes 1 NULL instead of 6 NULLs. CALL_FINALLY adds 4.5 ns in the latter too examples. This overhead could be decreased by using special cache for Python integers that represent return addresses or using separate stack for return addresses. But this looks as an overkill to me now. 4.5 ns is pretty small overhead, the simple `i = i` have the same timing.
msg309139 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-28 19:41
Yes, those benchmarks look fine. Have you tried with the benchmarks suite?
msg309141 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-28 19:53
I don't expect any differences in macrobenchamrks. try/except/finally are rarely used in tight loops.
msg309161 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-29 05:26
I wonder if I should drop PR 4682. I spent some more time working on it today. I switched to the same scheme as Serhiy for the no-exception case, i.e. push a single NULL value, rather than six NULLs. In ceval, we need to handle the non-exception case specially anyhow so I think it better to avoid the "stack churn". With the micro-benchmarks that Serhiy posted, PR 4682 is as fast or slightly faster than PR 5006. Doesn't really matter in real code though. I did not push my latest changes as I did not fix the frame.f_lineno issue yet. I think it is fixable without major changes. Should I bother though? Is there some other reason to prefer duplicating the final bodies rather than using a subroutine jump (as in 5006)? A very minor speedup is not worth it I think because the compiler is a bit more complicated. The settrace logic is more complicated too.
msg309163 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-29 08:23
Le 29/12/2017 à 06:26, Neil Schemenauer a écrit : > > I wonder if I should drop PR 4682. My main criterion (apart from performance issues which don't seem to exist) for choosing between the two PRs would be implementation complexity. I find Serhiy's PR pleasantly uncomplicated (it even simplifies the code in frameobject.c). Your PR seems to add more complexity, so I would be inclined to choose Serhiy's PR.
msg309169 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-29 11:48
Why all these competing pull requests? It does feel like my original patch has been hijacked. Also, before any more PRs, we need to decide whether to use subroutines or code duplication for finally blocks. Here is my attempt at an objective comparison of the two approaches. JSR style Code duplication Speed Slower Faster Interpreter More complex Simpler Bytecode generation Simpler More complex Bytecode size ~ +0.1% ~ +0.4% The increase in bytecode size is insignificant in terms of overall memory use, but the performance increase is significant (although small). Unless I have missed anything important, this says to me that the code duplication approach is better.
msg309170 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-29 12:00
Le 29/12/2017 à 12:48, Mark Shannon a écrit : > > Why all these competing pull requests? It does feel like my original patch has been hijacked. The original patch had a number of issues which needed to be fixed (in addition to forward-port it to the latest git master), such as not fixing some cases of stack consumption calculation. > Also, before any more PRs, we need to decide whether to use subroutines or code duplication for finally blocks. > > Here is my attempt at an objective comparison of the two approaches. > > JSR style Code duplication > Speed Slower Faster > Interpreter More complex Simpler > Bytecode generation Simpler More complex > Bytecode size ~ +0.1% ~ +0.4% Actually, when looking at your original patch (https://bugs.python.org/review/17611/) vs. Serhiy's https://github.com/python/cpython/pull/5006/, it looks like interpreter complexity is the same. Speed also seems to not be significantly different. So I would rephrase it as (compared to legacy bytecode generation): Subroutine Code duplication Speed Same Insignificantly faster Interpreter Simpler Simpler Bytecode generation Slightly more complex Slightly more complex Bytecode size ~ +0.1% ~ +0.4% That said, if you want to submit an updated version of the code duplication approach (with the same amount of added tests and debugging), we could compare on more equal grounds.
msg309181 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2017-12-29 16:45
When I said "significant", I meant from a statistically, not as a judgement meaning "useful" or "worthwhile". The code duplication approach is significantly faster in the tests. Whether the small speed difference is worth worrying about is a different matter. Also, the comparisons were with each other, not the current interpreter. Both approaches are better than the current implementation. One point I didn't cover is jumping to a new line in the debugger. Implementing that reliably for finally blocks with code duplication is tricky and would mean adding a number of marker bytecodes. Which is a big point in favour of the JSR style. If we are going to use the JSR approach, we should do it properly. PR 5006 still has elements of the current design such as the overly complex END_FINALLY, WITH_CLEANUP_START and WITH_CLEANUP_FINISH bytecodes. None of those should be necessary.
msg309182 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2017-12-29 17:40
> Whether the small speed difference is worth worrying about is a different matter. I don't think so. Though one could run the benchmarks suite to get an idea of the impact on real-world code.
msg309185 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2017-12-29 18:08
I apologize if my extra PR is causing confusion. My original intention was to merely forward port Antoine changes so they could compile with the 'master' version of Python. Over time I have made some fixes to it. I have kept it open because I'm not sure which approach is better (JSR or duplication). I did a quick test on the .pyc code size increase. It is actually tiny for the Python code in Lib/. PR 5006: 21259894 bytes PR 4682: 21263081 bytes (+0.015%) Running Serhiy's microbenchmarks: for i in a: try: pass finally: pass PR 4682: Mean +- std dev: 11.0 us +- 0.1 us PR 5006: Mean +- std dev: 10.9 us +- 0.6 us for i in a: try: continue finally: pass PR 4682: Mean +- std dev: 9.46 us +- 0.85 us PR 5006: Mean +- std dev: 14.3 us +- 0.3 us for i in a: while True: try: break finally: pass PR 4682: Mean +- std dev: 9.20 us +- 0.09 us PR 5006: Mean +- std dev: 14.3 us +- 0.6 us
msg309202 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-12-29 22:05
I consider PR 5006 as just a first step. I had two goals: 1) Fix all regressions in PR 2827. First at all the regression in the f_lineno setter. 2) Simplify the patch. This is critically important for getting the review. And it decreases the chance of regressions. All changes not directly related to the main goal (declared in the title of this issue) were removed. Other nice improvements can be made in separate PRs. Maybe finally we will came to the duplication of the code. But this will require making other small steps which can have their own value.
msg309211 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2017-12-30 01:03
Another point in favour of the JSR approach is that it should make it easier for tools like coverage.py to continue mapping opcode coverage to line coverage. I also like Serhiy's proposed strategy of separating the initial introduction of the compiler based exception handling from the subsequent simplification of the exception handling opcodes.
msg309333 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-01 16:18
So we now have three competing PRs... each of which is not trivial. It would be nice if somehow there could be some conciliation, or at least a synthetic comparison. I don't think I want to review all three proposals (and I already spent time reviewing previous versions of Serhiy's).
msg309335 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2018-01-01 17:02
It shouldn't be a surprise that I have submitted a PR. The original patch was my work and a month ago I said I would update it over the Christmas break. To avoid "competing" PRs in future, simply asking before appropriating other peoples work would help. I've signed the CLA, so I don't think there are any legal issues with Serhiy's or Neil's PRs containing my code, but it does seem a bit presumptive. Having said that, if someone can convince me that Serhiy's PR is better, then I have no problem with it being merged.
msg309339 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-01-01 17:38
New opcodes in PR 5071 look interesting, but I think it is better to leave this experiment for a separate issue. Complexities of PRs (ignoring tests, documentation, and generated files): PR 4682: Include/opcode.h \| 12 +- Lib/opcode.py \| 21 +- Objects/frameobject.c \| 241 ++++++++----- Objects/genobject.c \| 10 +- Python/ceval.c \| 426 +++++++++++------------ Python/compile.c \| 884 +++++++++++++++++++++++++++++++++--------------- Python/peephole.c \| 18 +- 7 files changed, 979 insertions(+), 633 deletions(-) PR 5006: Include/opcode.h \| 8 +- Lib/opcode.py \| 11 +- Objects/frameobject.c \| 203 ++++++++------------- Objects/genobject.c \| 10 +- Python/ceval.c \| 344 ++++++++++++++++------------------- Python/compile.c \| 474 +++++++++++++++++++++++++++++------------------- Python/peephole.c \| 33 ++-- 7 files changed, 561 insertions(+), 522 deletions(-) PR 5071: Include/frameobject.h \| 5 +- Include/opcode.h \| 18 +- Lib/opcode.py \| 25 +- Objects/frameobject.c \| 224 ++++++--------- Objects/genobject.c \| 10 +- Python/ceval.c \| 507 ++++++++++++--------------------- Python/compile.c \| 681 +++++++++++++++++++++++++++++++------------- Python/peephole.c \| 49 +--- Python/wordcode_helpers.h \| 33 ++- 9 files changed, 827 insertions(+), 725 deletions(-) PR 4682: +4 opcodes, -4 opcodes, and changed numerical values of 2 opcodes PR 5006: +4 opcodes, -4 opcodes PR 5071: +8 opcodes, -8 opcodes, and changed numerical value of 1 opcode It looks to me that PR 5006 has the lowest complexity and is the most easy for reviewing. I'm working on it since taking Antoine's PR 5 months ago.
msg309341 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-01 17:54
I took a quick look at Mark's PR. The main thing going for it IMHO is that it produces simpler bytecode: it removes the complicated with/finally-related opcodes, while Serhiy's has non-trivial END_FINALLY and POP_FINALLY. It would be nice to know if it's able to fix the stack size issues as in the following tests: https://github.com/python/cpython/pull/5006/files?diff=unified#diff-dae68b96e8fdcb924e1ea46c31f51aec
msg309348 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-01-01 20:13
> It would be nice to know if it's able to fix the stack size issues as in the following tests: I just worked on it. See issue24340.
msg309349 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-01 21:11
> I just worked on it. I meant Mark's PR :-)
msg309363 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2018-01-02 09:32
On 01/01/18 17:54, Antoine Pitrou wrote: > > Antoine Pitrou <pitrou@free.fr> added the comment: > > I took a quick look at Mark's PR. The main thing going for it IMHO is that it produces simpler bytecode: it removes the complicated with/finally-related opcodes, while Serhiy's has non-trivial END_FINALLY and POP_FINALLY. > > It would be nice to know if it's able to fix the stack size issues as in the following tests: > https://github.com/python/cpython/pull/5006/files?diff=unified#diff-dae68b96e8fdcb924e1ea46c31f51aec Yes, it can. The key to determining a correct stack size is that each bytecode has a fixed stack delta for each exiting edge. For example, FOR_ITER is good because it has a delta of 0 when entering the body and a delta of -1 when leaving the loop. But (the old) WITH_CLEANUP_START is bad because it has a variable delta. With fixed deltas, it is not too hard to construct an algorithm to determine the correct stack size. The presence of JSR-style finally blocks complicates things a little, but not that much. An outline algorithm would be: Find all "finally" subgraphs, so as to distinguish them from the main CFG - It might be easiest to do this by marking the basic blocks where generating the code. For each subgraph compute max-depth, by keeping a running total of depth. Jumps to finally blocks should not alter the depth, but would potentially increase the max-depth. The max-depth for the main CFG is the stack depth for the function. > > ---------- > > _______________________________________ > Python tracker <report@bugs.python.org> > <https://bugs.python.org/issue17611> > _______________________________________ >
msg309366 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-02 11:24
> With fixed deltas, it is not too hard to construct an algorithm to determine the correct stack size. The presence of JSR-style finally blocks complicates things a little, but not that much. Any chance you can put that patch somewhere to get an idea of the additional complexity? I'd like to gauge the total complexity of each approach including the stack size computation fix (since fixing that issue is one of the end goals). Also, that would help compare your PR and Serhiy's on more equal terms.
msg309372 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2018-01-02 14:11
I've added a commit to compute the stack size. It should compute the exact stack size in all cases, and thus fix bpo-24340.
msg309385 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2018-01-02 17:13
> It looks like the build core dumped on Travis-CI I failed to account for the extra stuff left on the stack when handling exceptions and jumping out of a finally block due to a `break`. Caught by Serhiy's tests in https://github.com/python/cpython/pull/5078. I could add a work around, but it would be a bit hacky. I'm now convinced that code duplication is the only way to do this properly. The only real advantage of JSR-style finally blocks is the ease of implementation of `frame.set_lineno()`. Rather than go around in circles any more, I suggest that we merge Serhiy's PR (PR 5006). I can then cherry-pick my commits to clean up the implementation of the `with` statement (as a new PR). I then plan to implement `try-finally` using code duplication, but that might not make it into 3.7 and we shouldn't hold up this issue any longer.
msg309388 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-02 17:32
Le 02/01/2018 à 18:13, Mark Shannon a écrit : > > Rather than go around in circles any more, I suggest that we merge Serhiy's PR (PR 5006). I can then cherry-pick my commits to clean up the implementation of the `with` statement (as a new PR). Fair enough. I'm making a final review round on Serhiy's PR and hopefully this will be in. Also looking forward to the cleanup followup.
msg309424 - (view)	Author: Neil Schemenauer (nascheme) *	Date: 2018-01-03 17:54
On 2017-12-29, Mark Shannon wrote: > One point I didn't cover is jumping to a new line in the debugger. > Implementing that reliably for finally blocks with code > duplication is tricky and would mean adding a number of marker > bytecodes. Which is a big point in favour of the JSR style. Could we virtually execute the code, forwards or backwards, starting from f_lasti? For conditional jumps, we would follow both branches. Stop when we find the line number we want. If we hit an opcode that causes a fblock push or pop, abort. I started implementing this as a proof of concept but didn't finish. It seems to fit most naturally as an exported function of Python/peephole.c. That file already does a lot of similar stuff.
msg309740 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-01-09 21:50
PR 5143 is yet one alternative. It is a variant of PR 5006 that restores SETUP_EXCEPT and gets rid of the new opcode BEGIN_FINALLY. SETUP_FINALLY, SETUP_WITH and SETUP_ASYNC_WITH now push NULL on the stack instead. PR 5143 removes and add fewer opcodes than PR 5006, but changes the behavior of more opcodes. I don't know what of them looks simpler.
msg309742 - (view)	Author: Mark Shannon (Mark.Shannon) *	Date: 2018-01-09 22:14
Can we stick to the one PR, please? Once it is merged then we can improve things. Also, I don't like leaving NULLs on the stack, they are an invitation to seg-fault. PR 5143 leaves more NULLs on the stack for longer.
msg309771 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-10 15:57
Please let's stick with PR 5006.
msg310043 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-01-16 08:21
What should be done for PR 5006? Changes of this PR have been minimized for easier review. But many interesting things can made after merging it. For example simplify 'with'-related opcodes. Or implement static blocks (this will make the 'try' statement zero-overhead, make checks in frame.f_lineno setter more reliable, and unblock some bytecode optimizations). It would be better to have more time before a feature freeze.
msg310044 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-16 08:34
Apparently PR 5006 has a conflict. Otherwise, is it ready for review?
msg310046 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2018-01-16 08:37
Simplifying "with"-related opcodes (and perhaps "finally" ones) sounds nice. "try" blocks are already zero-overhead for practical purposes (i.e. I don't think I've met any workload where moving a "try" around did improve performance significantly).
msg312596 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-02-22 21:33
New changeset 520b7ae27e39d1c77ea74ccd1b184d7cb43f9dcb by Serhiy Storchaka in branch 'master': bpo-17611. Move unwinding of stack for "pseudo exceptions" from interpreter to compiler. (GH-5006) https://github.com/python/cpython/commit/520b7ae27e39d1c77ea74ccd1b184d7cb43f9dcb
msg312815 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-02-25 14:02
Finally it is merged, and seems successfully. Thank you all involved, and first at all Mark, the original author. This unblocked further changes, issue32489 and issue32949. And we can try implement other ideas. Changes in the f_lineno setter fixed crashes in issue17288 and issue28883. But while writing tests for it I found other crashers, will open new issues soon.

History
Date	User	Action	Args
2022-04-11 14:57:43	admin	set	github: 61811
2018-02-25 14:02:36	serhiy.storchaka	set	status: open -> closed resolution: fixed messages: + msg312815 stage: patch review -> resolved
2018-02-22 21:33:32	serhiy.storchaka	set	messages: + msg312596
2018-02-01 17:10:36	serhiy.storchaka	set	versions: + Python 3.8, - Python 3.7
2018-01-16 08:37:24	pitrou	set	messages: + msg310046
2018-01-16 08:34:37	pitrou	set	messages: + msg310044
2018-01-16 08:21:49	serhiy.storchaka	set	messages: + msg310043
2018-01-10 21:04:28	serhiy.storchaka	set	pull_requests: - pull_request4951
2018-01-10 15:57:19	pitrou	set	messages: + msg309771
2018-01-09 22:14:45	Mark.Shannon	set	messages: + msg309742
2018-01-09 21:50:57	serhiy.storchaka	set	messages: + msg309740
2018-01-09 21:46:04	serhiy.storchaka	set	pull_requests: + pull_request5002
2018-01-03 17:54:21	nascheme	set	messages: + msg309424
2018-01-02 17:32:12	pitrou	set	messages: + msg309388
2018-01-02 17:13:35	Mark.Shannon	set	messages: + msg309385
2018-01-02 14:11:53	Mark.Shannon	set	messages: + msg309372
2018-01-02 11:24:27	pitrou	set	messages: + msg309366
2018-01-02 09:32:38	Mark.Shannon	set	messages: + msg309363
2018-01-01 21:40:27	serhiy.storchaka	set	pull_requests: + pull_request4951
2018-01-01 21:11:19	pitrou	set	messages: + msg309349
2018-01-01 20:13:02	serhiy.storchaka	set	dependencies: + co_stacksize estimate can be highly off messages: + msg309348
2018-01-01 17:54:11	pitrou	set	messages: + msg309341
2018-01-01 17:38:44	serhiy.storchaka	set	messages: + msg309339
2018-01-01 17:02:25	Mark.Shannon	set	messages: + msg309335
2018-01-01 16:18:52	pitrou	set	messages: + msg309333
2018-01-01 14:09:51	Mark.Shannon	set	pull_requests: + pull_request4945
2017-12-30 01:03:30	ncoghlan	set	messages: + msg309211
2017-12-29 22:05:45	serhiy.storchaka	set	messages: + msg309202
2017-12-29 18:08:53	nascheme	set	messages: + msg309185
2017-12-29 17:40:01	pitrou	set	messages: + msg309182
2017-12-29 16:45:55	Mark.Shannon	set	messages: + msg309181
2017-12-29 12:00:08	pitrou	set	messages: + msg309170
2017-12-29 11:48:50	Mark.Shannon	set	messages: + msg309169
2017-12-29 08:23:24	pitrou	set	messages: + msg309163
2017-12-29 05:26:45	nascheme	set	messages: + msg309161
2017-12-28 19:53:12	serhiy.storchaka	set	messages: + msg309141
2017-12-28 19:41:50	pitrou	set	messages: + msg309139
2017-12-28 19:39:21	serhiy.storchaka	set	messages: + msg309138
2017-12-24 19:41:34	serhiy.storchaka	set	messages: + msg309015
2017-12-24 19:24:53	serhiy.storchaka	set	pull_requests: + pull_request4895
2017-12-23 21:21:03	nascheme	set	messages: + msg308969
2017-12-23 17:19:27	serhiy.storchaka	set	dependencies: + Refactor and add new tests for the f_lineno setter
2017-12-22 18:24:42	nascheme	set	messages: + msg308939
2017-12-22 08:55:32	rhettinger	set	messages: + msg308928
2017-12-07 23:04:31	Mark.Shannon	set	messages: + msg307827
2017-12-05 21:24:42	nascheme	set	messages: + msg307687
2017-12-05 20:16:25	Mark.Shannon	set	messages: + msg307680
2017-12-05 17:21:11	nascheme	set	messages: + msg307671
2017-12-05 11:11:58	Mark.Shannon	set	messages: + msg307656
2017-12-04 17:53:08	nascheme	set	messages: + msg307585
2017-12-04 17:41:52	Mark.Shannon	set	messages: + msg307584
2017-12-04 17:19:48	Mark.Shannon	set	messages: + msg307582
2017-12-04 16:56:25	serhiy.storchaka	set	messages: + msg307577
2017-12-04 15:49:17	pitrou	set	messages: + msg307572
2017-12-04 15:21:48	Mark.Shannon	set	messages: + msg307570
2017-12-04 13:16:14	serhiy.storchaka	set	messages: + msg307564
2017-12-04 12:33:36	pitrou	set	messages: + msg307558
2017-12-04 12:25:34	serhiy.storchaka	set	messages: + msg307555
2017-12-04 11:50:46	pitrou	set	messages: + msg307549
2017-12-04 11:46:34	Mark.Shannon	set	messages: + msg307548
2017-12-03 09:53:28	pitrou	set	messages: + msg307497
2017-12-03 00:50:37	ncoghlan	set	messages: + msg307479
2017-12-03 00:49:28	rhettinger	set	messages: + msg307478
2017-12-03 00:39:57	ncoghlan	set	nosy: + ncoghlan messages: + msg307477
2017-12-02 22:16:49	nascheme	set	pull_requests: + pull_request4594
2017-11-06 22:41:04	nascheme	set	files: + perf_unwind_stack.txt messages: + msg305684
2017-11-06 19:51:38	nascheme	set	nosy: nascheme, rhettinger, mark.dickinson, pitrou, christian.heimes, benjamin.peterson, trent, Mark.Shannon, serhiy.storchaka, Demur Rumed
2017-08-30 12:36:10	serhiy.storchaka	set	pull_requests: - pull_request3290
2017-08-30 12:32:26	serhiy.storchaka	set	pull_requests: + pull_request3290
2017-08-30 12:27:05	serhiy.storchaka	set	pull_requests: - pull_request3288
2017-08-30 12:18:41	serhiy.storchaka	set	pull_requests: + pull_request3288
2017-07-25 13:25:22	rhettinger	set	messages: - msg185785
2017-07-24 14:05:13	pitrou	set	messages: + msg298972
2017-07-24 11:47:57	pitrou	set	messages: + msg298960
2017-07-24 10:06:42	vstinner	set	nosy: - vstinner
2017-07-24 09:59:41	serhiy.storchaka	set	messages: + msg298949
2017-07-24 09:21:42	pitrou	set	messages: + msg298943
2017-07-24 08:25:51	pitrou	set	messages: + msg298935
2017-07-24 08:09:05	pitrou	set	messages: + msg298934
2017-07-23 19:49:42	serhiy.storchaka	set	stage: needs patch -> patch review
2017-07-23 19:22:39	pitrou	set	messages: + msg298910
2017-07-23 19:20:32	pitrou	set	pull_requests: + pull_request2879
2017-07-22 21:12:35	pitrou	set	versions: + Python 3.7, - Python 3.6
2016-06-13 16:46:19	nascheme	set	nosy: + nascheme messages: + msg268456
2016-06-09 12:16:36	serhiy.storchaka	set	stage: patch review -> needs patch
2016-06-05 15:03:23	serhiy.storchaka	set	messages: + msg267418
2016-06-05 04:00:43	christian.heimes	set	nosy: + christian.heimes
2016-06-04 22:05:07	Demur Rumed	set	nosy: + Demur Rumed messages: + msg267329
2016-06-04 21:16:35	serhiy.storchaka	set	versions: + Python 3.6, - Python 3.5
2016-01-20 11:37:31	vstinner	set	nosy: + vstinner messages: + msg258673
2014-11-16 15:06:13	serhiy.storchaka	set	versions: + Python 3.5 messages: + msg231250 components: + Interpreter Core type: enhancement stage: patch review
2014-10-14 16:58:12	skrah	set	nosy: - skrah
2014-10-08 20:54:15	pitrou	set	nosy: + serhiy.storchaka
2014-10-08 20:53:51	pitrou	set	messages: + msg228817
2014-10-08 20:53:25	pitrou	set	messages: + msg228816
2014-04-29 18:21:23	pitrou	set	files: + unwinding.patch messages: + msg217534
2013-04-03 21:46:09	trent	set	nosy: + trent
2013-04-03 21:28:58	skrah	set	messages: + msg185955
2013-04-03 20:58:02	mark.dickinson	set	nosy: + mark.dickinson
2013-04-03 20:57:22	pitrou	set	messages: + msg185954
2013-04-03 20:53:15	skrah	set	messages: + msg185952
2013-04-03 20:32:27	skrah	set	messages: + msg185951
2013-04-03 20:27:10	benjamin.peterson	set	nosy: + benjamin.peterson
2013-04-03 20:25:54	Mark.Shannon	set	messages: + msg185950
2013-04-03 19:39:33	Mark.Shannon	set	messages: + msg185946
2013-04-03 11:23:13	skrah	set	nosy: + skrah messages: + msg185916
2013-04-03 10:20:13	pitrou	set	nosy: + pitrou messages: + msg185909
2013-04-02 01:21:43	rhettinger	set	nosy: + rhettinger messages: + msg185785
2013-04-01 16:32:02	Mark.Shannon	set	files: + b16527f84774.diff keywords: + patch
2013-04-01 16:31:20	Mark.Shannon	create