Created on 2011-04-10 20:25 by rhettinger, last changed 2017-04-17 04:42 by ncoghlan.
|issue11822.diff||torsten, 2011-12-10 23:34|
|issue11822_nested_disassembly.diff||ncoghlan, 2014-09-07 01:01||Updated and enhanced for 3.5||review|
|issue11822_nested_disassembly_with_off_switch.diff||ncoghlan, 2014-09-07 05:30||"nested" levels below 0 now turn off the new feature||review|
|PR 1155||open||serhiy.storchaka, 2017-04-15 14:22|
|msg133478 - (view)||Author: Raymond Hettinger (rhettinger) *||Date: 2011-04-10 20:25|
Now that list comprehensions mask run their internals in code objects (the same way that genexps do), it is getting harder to use dis() to see what code is generated. For example, the pow() call isn't shown in the following disassembly: >>> dis('[x**2 for x in range(3)]') 1 0 LOAD_CONST 0 (<code object <listcomp> at 0x1005d1e88, file "<dis>", line 1>) 3 MAKE_FUNCTION 0 6 LOAD_NAME 0 (range) 9 LOAD_CONST 1 (3) 12 CALL_FUNCTION 1 15 GET_ITER 16 CALL_FUNCTION 1 19 RETURN_VALUE I propose that dis() build-up a queue undisplayed code objects and then disassemble each of those after the main disassembly is done (effectively making it recursive and displaying code objects in the order that they are first seen in the disassembly). For example, the output shown above would be followed by a disassembly of its internal code object: <code object <listcomp> at 0x1005d1e88, file "<dis>", line 1>: 1 0 BUILD_LIST 0 3 LOAD_FAST 0 (.0) >> 6 FOR_ITER 16 (to 25) 9 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 15 LOAD_CONST 0 (2) 18 BINARY_POWER 19 LIST_APPEND 2 22 JUMP_ABSOLUTE 6 >> 25 RETURN_VALUE
|msg133524 - (view)||Author: Alexander Belopolsky (belopolsky) *||Date: 2011-04-11 14:42|
Would you like to display lambdas as well? >>> dis('lambda x: x**2') 1 0 LOAD_CONST 0 (<code object <lambda> at 0x1005c9ad0, file "<dis>", line 1>) 3 MAKE_FUNCTION 0 6 RETURN_VALUE <code object <lambda> at 0x1005cb140, file "<dis>", line 1>: 1 0 LOAD_FAST 0 (x) 3 LOAD_CONST 1 (2) 6 BINARY_POWER 7 RETURN_VALUE I like the idea, but would rather see code objects expanded in-line, possibly indented rather than at the end.
|msg133542 - (view)||Author: Antoine Pitrou (pitrou) *||Date: 2011-04-11 21:08|
I think it should be enabled with an optional argument. Otherwise in some cases you'll get lots of additional output while you're only interested in the top-level code.
|msg133543 - (view)||Author: Raymond Hettinger (rhettinger) *||Date: 2011-04-11 21:13|
If you disassemble a function, you typically want to see all the code in that function. This isn't like pdb where you're choosing to step over or into another function outside the one being viewed.
|msg133544 - (view)||Author: Antoine Pitrou (pitrou) *||Date: 2011-04-11 21:21|
> If you disassemble a function, you typically want to see all the code > in that function. That depends on the function. If you do event-driven programming (say, Twisted deferreds with addCallback()), you don't necessarily want to see the disassembly of the callbacks that are passed to the various framework functions. Also, if you do so recursively, it might become *very* unwieldy. So I don't think there's anything "typical" here. It depends on what you intend to focus on.
|msg133545 - (view)||Author: Alexander Belopolsky (belopolsky) *||Date: 2011-04-11 21:48|
On Mon, Apr 11, 2011 at 5:21 PM, Antoine Pitrou <firstname.lastname@example.org> wrote: .. Raymond>> If you disassemble a function, you typically want to see all the code Raymond>> [defined] in that function. +1 (with clarification in ) If the function calls a function defined elsewhere, I don't want to see the called function disassembly when I disassemble the caller. In this case it is very easy to disassemble interesting functions with separate dis() calls. In the case like the following, however: def f(): def g(x): return x**2 dis(f) 2 0 LOAD_CONST 1 (<code object g at 0x10055ce88, file "x.py", line 2>) 3 MAKE_FUNCTION 0 6 STORE_FAST 0 (g) ... when I see '<code object g at 0x10055ce88, ..>', I have to do something unwieldy such as 3 0 LOAD_FAST 0 (x) 3 LOAD_CONST 1 (2) 6 BINARY_POWER 7 RETURN_VALUE > > That depends on the function. If you do event-driven programming (say, > Twisted deferreds with addCallback()), you don't necessarily want to see > the disassembly of the callbacks that are passed to the various > framework functions. Also, if you do so recursively, it might become > *very* unwieldy. Can you provide some examples of this? Nested functions are typically short and even if they are long, the size disassembly would be proportional to the line count of the function being disassembled, which is expected.
|msg137415 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2011-06-01 05:10|
Note that Yaniv Aknin (author of the Python's Innards series of blog posts) has a recursive dis variant that may be useful for inspiration: https://bitbucket.org/yaniv_aknin/pynards/src/c4b61c7a1798/common/blog.py As shown there, this recursive behaviour can also be useful for code_info/show_code.
|msg149198 - (view)||Author: Torsten Landschoff (torsten) *||Date: 2011-12-10 23:34|
I offer the attached patch as a starting point to fulfill this feature request. The patch changes the output to insert the disassembly of local code objects on the referencing line. As that made the output unreadable to me, I added indentation for the nested code (by 4 spaces, hoping that nobody will nest code 10 levels deep :-) This results in the following output for the original example: >>> from dis import dis >>> dis('[x**2 for x in range(3)]') 1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f24a67dde40, file "<dis>", line 1>) 1 0 BUILD_LIST 0 3 LOAD_FAST 0 (.0) >> 6 FOR_ITER 16 (to 25) 9 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 15 LOAD_CONST 0 (2) 18 BINARY_POWER 19 LIST_APPEND 2 22 JUMP_ABSOLUTE 6 >> 25 RETURN_VALUE 3 LOAD_CONST 1 ('<listcomp>') 6 MAKE_FUNCTION 0 9 LOAD_NAME 0 (range) 12 LOAD_CONST 2 (3) 15 CALL_FUNCTION 1 18 GET_ITER 19 CALL_FUNCTION 1 22 RETURN_VALUE
|msg149199 - (view)||Author: Raymond Hettinger (rhettinger) *||Date: 2011-12-10 23:44|
Thank you :-)
|msg226525 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2014-09-07 01:01|
Sorry for the long delay in doing anything with this patch. Unfortunately, trunk has moved on quite a bit since this patch was submitted, and it's no longer directly applicable. However, the basic principle is sound, so this is a new patch that aligns with the changes made in 3.4 to provide an iterator based bytecode introspection API. It also changes the indenting to be based on the structure of the bytecode disassembly - nested lines start aligned with the opcode *name* on the preceding line. This will get unreadable with more than two or three levels of nesting, but at that point, hard to read disassembly for the top level function is the least of your worries. (A potentially useful option may to be add a flag to turn off the implicit recursion, easily restoring the old single level behaviour. I'd like the recursive version to be the default though, since it's far more useful given that Python 3 comprehensions all involve a nested code object) A descriptive header makes the new output more self-explanatory. Note that I did try repeating the code object repr from the LOAD_CONST opcode in the new header - it was pretty unreadable, and redundant given the preceding line of disassembly. Two examples, one showing Torsten's list comprehension from above, and another showing that the nested line numbers work properly. This can't be applied as is - it's still missing tests, docs, and fixes to disassembly output tests that assume the old behaviour. >>> dis.dis('[x**2 for x in range(3)]') 1 0 LOAD_CONST 0 (<code object <listcomp> at 0x7f459ec4a0c0, file "<dis>", line 1>) Disassembly for nested code object 1 0 BUILD_LIST 0 3 LOAD_FAST 0 (.0) >> 6 FOR_ITER 16 (to 25) 9 STORE_FAST 1 (x) 12 LOAD_FAST 1 (x) 15 LOAD_CONST 0 (2) 18 BINARY_POWER 19 LIST_APPEND 2 22 JUMP_ABSOLUTE 6 >> 25 RETURN_VALUE 3 LOAD_CONST 1 ('<listcomp>') 6 MAKE_FUNCTION 0 9 LOAD_NAME 0 (range) 12 LOAD_CONST 2 (3) 15 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 18 GET_ITER 19 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 22 RETURN_VALUE >>> def f(): ... print("Hello") ... def g(): ... for x in range(10): ... yield x ... return g ... >>> dis.dis(f) 2 0 LOAD_GLOBAL 0 (print) 3 LOAD_CONST 1 ('Hello') 6 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 9 POP_TOP 3 10 LOAD_CONST 2 (<code object g at 0x7f459ec4a540, file "<stdin>", line 3>) Disassembly for nested code object 4 0 SETUP_LOOP 25 (to 28) 3 LOAD_GLOBAL 0 (range) 6 LOAD_CONST 1 (10) 9 CALL_FUNCTION 1 (1 positional, 0 keyword pair) 12 GET_ITER >> 13 FOR_ITER 11 (to 27) 16 STORE_FAST 0 (x) 5 19 LOAD_FAST 0 (x) 22 YIELD_VALUE 23 POP_TOP 24 JUMP_ABSOLUTE 13 >> 27 POP_BLOCK >> 28 LOAD_CONST 0 (None) 31 RETURN_VALUE 13 LOAD_CONST 3 ('f.<locals>.g') 16 MAKE_FUNCTION 0 19 STORE_FAST 0 (g) 6 22 LOAD_FAST 0 (g) 25 RETURN_VALUE
|msg226531 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2014-09-07 05:30|
I didn't want to add a second argument to turn off the new behaviour, so I changed it such that passing a value < 0 for "nested" turns off the new feature entirely. Levels >= 0 enable it, defining which level to start with. The default level is "0" so there's no implied prefix, and nested code objects are displayed by default. This picks up at least comprehensions, lambda expressions and nested functions. I haven't checked how it handles nested classes yet. I used this feature to get the old tests passing again by turning off the recursion feature. New tests for the new behaviour are still needed. I also tweaked the header to show the *name* of the code object. The full repr is to noisy, but the generic message was hard to read when there were multiple nested code objects.
|msg254220 - (view)||Author: Stéphane Wirtel (matrixise) *||Date: 2015-11-06 21:16|
Hi all, For this feature, I have an other output: stephane@sg1 /tmp> python3 dump_bytecode.py <module> -------- 3 0 LOAD_BUILD_CLASS 1 LOAD_CONST 0 (<code object User at 0x10b830270, file "<show>", line 3>) 4 LOAD_CONST 1 ('User') 7 MAKE_FUNCTION 0 10 LOAD_CONST 1 ('User') 13 CALL_FUNCTION 2 (2 positional, 0 keyword pair) 16 STORE_NAME 0 (User) 8 19 LOAD_NAME 0 (User) 22 LOAD_CONST 2 ('user') 25 LOAD_CONST 3 ('password') 28 CALL_FUNCTION 2 (2 positional, 0 keyword pair) 31 STORE_NAME 1 (user) 34 LOAD_CONST 4 (None) 37 RETURN_VALUE <module>.User ------------- 3 0 LOAD_NAME 0 (__name__) 3 STORE_NAME 1 (__module__) 6 LOAD_CONST 0 ('User') 9 STORE_NAME 2 (__qualname__) 4 12 LOAD_CONST 1 (<code object __init__ at 0x10b824270, file "<show>", line 4>) 15 LOAD_CONST 2 ('User.__init__') 18 MAKE_FUNCTION 0 21 STORE_NAME 3 (__init__) 24 LOAD_CONST 3 (None) 27 RETURN_VALUE <module>.User.__init__ ---------------------- 5 0 LOAD_FAST 1 (email) 3 LOAD_FAST 0 (self) 6 STORE_ATTR 0 (email) 6 9 LOAD_FAST 2 (password) 12 LOAD_FAST 0 (self) 15 STORE_ATTR 1 (password) 18 LOAD_CONST 0 (None) 21 RETURN_VALUE
|msg267425 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2016-06-05 16:27|
I like Stéphane's idea about placing the output for nested code object at the same level after the output for the main code object.
|msg270751 - (view)||Author: Stéphane Wirtel (matrixise) *||Date: 2016-07-18 13:15|
hello, we can continue the discussion on this issue ?
|msg291715 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2017-04-15 14:30|
The issue was open 6 years ago. The feature could be added in 3.3. But it still not implemented. Since there are problems with outputting the disassembly of internal code objects expanded in-line, proposed patch just outputs them after the disassembly of the main code object (similar to original Raymond's proposition). This is simpler and doesn't make the output too wide.
|msg291718 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2017-04-15 15:50|
+1 for listing the nested code objects after the original one. In reviewing Serhiy's patch, the core technical implementation looks OK to me, but I think we may want to go with a "depth" argument rather than a simple "recursive" flag. My rationale for that relates to directly disassembling module and class source code: - dis(module_source, depth=1) # Module, class bodies, function bodies - dis(class_source, depth=1) # Class and method bodies (with the default depth being 0, to disable recursive descent entirely) Only if you set a higher depth than 1 would you start seeing things like closures, comprehension bodies, and nested classes. With a simple all-or-nothing flag, I think module level recursive disassembly would be too noisy to be useful. The bounded depth approach would also avoid a problem with invalid bytecode manipulations that manage to create a loop between two bytecode objects. While the *compiler* won't do that, there's no guarantee that the disassembler is being fed valid bytecode, so we should avoid exposing ourselves to any infinite loops in the display code.
|msg291720 - (view)||Author: Serhiy Storchaka (serhiy.storchaka) *||Date: 2017-04-15 16:36|
The problem with the *depth* parameter is that it adds a burden of choosing the value for the end user. "Oh, there are more deeper code objects, I must increase the depth and rerun dis()!" I think in most cases when that parameter is specified it would be set to some large value like 999 because you don't want to set it too small. Compare for example with the usage of the attribute maxDiff in unittests. The single depth parameter doesn't adds too much control. You can't enable disassembling functions and method bodies but disable disassembling comprehensions in functions. If you need more control, you should use non-recursive dis() and manually walk the tree of code objects. How much output adds unlimited recursion in comparison with the recursion limited by the first level? As for supporting invalid bytecode, currently the dis module doesn't support it (see issue26694).
|msg291721 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2017-04-15 17:02|
The problem I see is that we have conflicting requirements for the default behaviour: - if we modify dis() instead of adding a new function, then the default behaviour needs to be non-recursive for backwards compatibility reasons - if we allow the depth to be configurable, then we'd like the default behaviour to be to show everything One potential resolution to that would be to define this as a new function, `distree`, rather than modifying `dis` and `disassemble` to natively support recursion. Then the new function could accept a `depth` argument (addressing my concerns), but have `depth=None` as the default (addressing your concerns). If we wanted to allow even more control than that, then I think os.walk provides a useful precedent, where we'd add a new `dis.walk` helper that just looked at `co_consts` to find nested code objects without doing any of the other disassembly work. The dis.walk() helper would produce an iterable of (depth, code, nested) 3-tuples, where: - the first item is the current depth in the compile tree - the second is the code object itself - the third is a list of nested code objects Similar to os.walk(), editing the list of nested objects in place would let you control whether or not any further recursion took place.
|msg291765 - (view)||Author: Raymond Hettinger (rhettinger) *||Date: 2017-04-16 21:22|
> if we modify dis() instead of adding a new function, then > the default behaviour needs to be non-recursive for > backwards compatibility reasons I don't see how we have any backward compatibility issues. The dis() function is purely informational like help(). The problem is it doesn't show important information, list comprehensions are now effectively hidden from everyone who isn't clever and persistent. I use dis() as a teaching aid in my Python courses and as a debugging tool when doing consulting. From my point of view, it is effectively broken in Python 3.
|msg291780 - (view)||Author: Nick Coghlan (ncoghlan) *||Date: 2017-04-17 04:42|
Yeah, I was mixing this up with getargspec (et al), which get used by IDEs and similar tools. While third party tools do use the disassembler, they typically won't use its display logic directly unless they're just dumping the output to a terminal equivalent. Given that, a "depth=None" parameter on `dis` and `disassemble` would provide the default behaviour of rendering the entire compilation tree, while still allowing turning off recursion entirely ("depth=0"), or limiting it to a desired number of levels ("depth=1", etc).
|2017-04-17 04:42:59||ncoghlan||set||messages: + msg291780|
|2017-04-16 21:22:06||rhettinger||set||messages: + msg291765|
|2017-04-15 17:02:03||ncoghlan||set||messages: + msg291721|
|2017-04-15 16:36:04||serhiy.storchaka||set||messages: + msg291720|
|2017-04-15 15:50:09||ncoghlan||set||messages: + msg291718|
versions: + Python 3.7, - Python 3.6
|2017-04-15 14:22:00||serhiy.storchaka||set||pull_requests: + pull_request1286|
|2016-07-18 13:15:21||matrixise||set||messages: + msg270751|
messages: + msg267425
|2015-12-03 17:17:52||Todd Dembrey||set||nosy:
+ Todd Dembrey|
|2015-11-10 07:27:13||rhettinger||set||assignee: rhettinger ->|
messages: + msg254220
versions: + Python 3.6, - Python 3.5
messages: + msg226531
messages: + msg226525
|2014-08-12 19:49:09||rhettinger||set||stage: needs patch -> patch review|
versions: + Python 3.5, - Python 3.4
|2013-07-12 07:35:54||rhettinger||set||stage: needs patch|
versions: + Python 3.4, - Python 3.3
|2011-12-10 23:44:24||rhettinger||set||messages: + msg149199|
nosy: + torsten
messages: + msg149198
keywords: + patch
messages: + msg137415
|2011-04-13 02:28:27||rhettinger||set||assignee: rhettinger|
|2011-04-11 21:48:24||belopolsky||set||messages: + msg133545|
|2011-04-11 21:21:03||pitrou||set||messages: + msg133544|
|2011-04-11 21:13:10||rhettinger||set||messages: + msg133543|
messages: + msg133542
|2011-04-11 14:42:04||belopolsky||set||messages: + msg133524|