classification
Title: in debug mode, compile(ast) fails with an assertion error if an AST node has no line number information
Type: crash Stage: needs patch
Components: Interpreter Core Versions: Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder: PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
View: 26107
Assigned To: Nosy List: Claudiu.Popa, benjamin.peterson, r.david.murray, terry.reedy, vstinner, ztane
Priority: normal Keywords:

Created on 2014-04-29 09:00 by ztane, last changed 2016-01-20 11:35 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
astlinenotest.py ztane, 2014-04-29 09:00 an example of how to trigger the assertion failure using ASTs and compile
Messages (7)
msg217502 - (view) Author: Antti Haapala (ztane) * Date: 2014-04-29 09:00
We had problems with our web service occasionally hanging and performing poorly, and as we had little idea what caused them, we decided to run our staging build continuously under a debug-enabled Python 3.4 and attach gdb as needed. To our dismay, we found that our code-generation code, which builds ASTs and then compiles them to modules, dumps core on the debug build.

The assertion is the much discussed "linenumbers must grow monotonically" at http://hg.python.org/cpython/file/04f714765c13/Python/compile.c#l3969

In our case, the AST is generated from an HTML template with embedded Python parts; as we could approximately locate much of the corresponding code in the source template, we decided to reuse its line numbers in the AST. This seemed to work quite nicely, and we could usually get usable tracebacks too.

Under a debug build, however, because the ordering of some constructs in the source language is different from Python, we need to discard *all* line numbers and only then use fix_missing_locations. As a result we get completely unusable tracebacks from these parts of the code, with everything reported on line 1. Just using fix_missing_locations does not work. Likewise, the rules for which parts of the tree should come in which order in the lnotab are quite hard to deduce.

It seems to me that when the lnotab was created, no one even had in mind that there would be an actually useful AST module that would be used for code generation. Considering that there have been other calls for breaking the correspondence of bytecode addresses to monotonically growing linenumbers, I want to reopen the discussion about changing the lnotab structures now to allow arbitrary mapping of source code locations to bytecode, and especially about the need for this assertion in the debug builds at all.

Attached is an example of code that appends a function to an existing module syntax tree. Run under python*-dbg, it dumps core with "Python/compile.c:nnnn: assemble_lnotab: Assertion `d_lineno >= 0' failed." Of course, in this simple case it is easy to just modify the line numbers so that function "bar" comes after "foo"; however, in some cases it is hard to know the actual rules, and fix_missing_locations does not do this right at all.
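
The attached astlinenotest.py is not reproduced on this page. As a rough sketch of the kind of tree the paragraph above describes, the following builds a module whose second function carries a smaller line number than the first. The source strings, the "<generated>" filename and the choice of lines 10 and 1 are illustrative assumptions, not taken from the attachment. Per this report, a tree of this shape is what trips the assertion on the affected debug builds; on interpreters that include the fix from issue #26107 it is expected to compile normally.

```python
import ast

# foo is parsed from "template" source in which it sits on line 10.
source = "\n" * 9 + "def foo():\n    return 'foo'\n"
tree = ast.parse(source)

# bar stands in for generated code; parsed on its own, it starts at line 1.
bar = ast.parse("def bar():\n    return 'bar'\n").body[0]

# Appending bar after foo makes the statement line numbers go 10, then 1.
tree.body.append(bar)

code = compile(tree, "<generated>", "exec")
namespace = {}
exec(code, namespace)

# Each function keeps the line number it was given in the tree.
foo_line = namespace["foo"].__code__.co_firstlineno
bar_line = namespace["bar"].__code__.co_firstlineno
print(foo_line, bar_line)
```

Parsing the generated function from a string, rather than constructing ast.FunctionDef by hand, keeps the sketch independent of the exact node fields required by a given Python version.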

I am also pretty sure that most existing code that combines parsed and generated ASTs and then compiles the resulting trees would also fail that assert, but no one ever runs their code under debug builds.
msg217806 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-05-02 22:49
3.1 to 3.3 only get internet security patches, and I don't believe this is such an issue.

It is not clear from the title (a statement, not a request) and your text what specific patch you would like. If you want to 'reopen a discussion', please post on python-ideas. It appears that your idea is to abandon a rule that does not work.  But it is not clear from a first reading exactly what alternative you want.
msg218383 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-12 23:45
With the development version (Python 3.5), I reproduce the crash when Python is compiled in debug mode:

$ ./python astlinenotest.py 
python: Python/compile.c:3975: assemble_lnotab: Assertion `d_lineno >= 0' failed.
Abandon (core dumped)

The problem is that astlinenotest.py creates an AST node without lineno information, which causes an assertion to fail in the compiler.

In my astoptimizer project, I use this function so that I do not have to worry about lineno:

import ast

def copy_lineno(node, new_node):
    # Fill in any missing locations first, then take the location of node.
    ast.fix_missing_locations(new_node)
    ast.copy_location(new_node, node)
    return new_node
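
For illustration only, here is how a helper like copy_lineno might be used when a transformer replaces one node with a freshly built one. The FoldAdd class and the sample source are hypothetical and are not taken from astoptimizer; the sketch only shows that the replacement node ends up with the location of the node it replaces.

```python
import ast


def copy_lineno(node, new_node):
    # Same helper as quoted above.
    ast.fix_missing_locations(new_node)
    ast.copy_location(new_node, node)
    return new_node


class FoldAdd(ast.NodeTransformer):
    """Hypothetical transformer: fold `int + int` into one constant."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            total = node.left.value + node.right.value
            # The new Constant has no location until copy_lineno sets one.
            return copy_lineno(node, ast.Constant(value=total))
        return node


tree = FoldAdd().visit(ast.parse("\n\nx = 1 + 2\n"))
folded = tree.body[0].value

namespace = {}
exec(compile(tree, "<folded>", "exec"), namespace)
print(folded.lineno, namespace["x"])
```

Note that only the top-level replacement node takes the old location; any children that fix_missing_locations had to fill in keep the default it assigned.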
msg218412 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-05-13 08:13
We can maybe modify the compiler to use the line number 1 if the line information is missing?
msg218700 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-05-17 13:18
Victor: in the production code discussed in the original posting, there *are* line numbers, and they are meaningful; they just aren't monotonically increasing.

I believe the request here is to simply remove the assert.  (If we did that, we'd have to also add tests that Python itself was generating monotonically increasing line numbers, and make some doc changes.)
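
The property under discussion can be observed from Python code with dis.findlinestarts. The sketch below is not one of the tests proposed above; the helper name line_numbers_are_monotonic and both sample inputs are assumptions made for illustration. It checks a single code object only (nested code objects are not visited) and assumes an interpreter that accepts non-monotonic line numbers.

```python
import ast
import dis


def line_numbers_are_monotonic(code):
    """Return True if the line numbers of `code` never decrease.

    Hypothetical helper: inspects only this code object, not nested ones.
    """
    lines = [line for _, line in dis.findlinestarts(code) if line is not None]
    return all(a <= b for a, b in zip(lines, lines[1:]))


# Ordinary source text compiled directly.
plain = compile("a = 1\nb = 2\n", "<plain>", "exec")

# A spliced tree: a statement on line 10 followed by one on line 1.
tree = ast.parse("\n" * 9 + "a = 1\n")
tree.body.extend(ast.parse("b = 2\n").body)
spliced = compile(tree, "<spliced>", "exec")

print(line_numbers_are_monotonic(plain))
print(line_numbers_are_monotonic(spliced))
```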
msg218723 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-05-17 23:49
Summary of this post: compile currently checks user input with assert; this is a bug that should be changed.

I re-read astlinenotest.py and realized that FunctionDef is included in '*' and not some omitted import. (The latter is common for code posted on python-list, if not here. This illustrates why PEP 8 deprecates 'import *'; import ast ... ast.FunctionDef would have been clear on first reading.) So I ran the module in installed 3.4 and repository debug 3.5 and got the assert with the latter.

"It seems to me that when the lnotab was created, no one even had in mind that there would be an actually useful AST module that would be used for code generation." I am pretty sure this is correct. I suspect that the assert in question was originally intended to test the logic of the internal syntax tree line number generation. But now it also tests user input, which I and many others think is a bad idea.

A solution between removing the assert (and the internal check) and converting it to, say, ValueError("decreasing line no in input ast"), and thereby stopping code that now normally and legitimately works, would be to skip the asserts when the compile input is an ast instead of code.
msg258670 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-20 11:35
Good news: this issue has been fixed by the commit of the issue #26107.
History
Date User Action Args
2016-01-20 11:35:39 vstinner set status: open -> closed
superseder: PEP 511: code.co_lnotab: use signed line number delta to support moving instructions in an optimizer
resolution: fixed
messages: + msg258670
2015-03-09 14:46:03 Claudiu.Popa set nosy: + Claudiu.Popa
2014-05-17 23:49:50 terry.reedy set messages: + msg218723
stage: needs patch
2014-05-17 13:18:52 r.david.murray set nosy: + r.david.murray
messages: + msg218700
2014-05-13 08:13:59 vstinner set nosy: + benjamin.peterson
messages: + msg218412
2014-05-12 23:45:25 vstinner set nosy: + vstinner
messages: + msg218383
title: Compiling modified AST crashes on debug build unless linenumbering discarded -> in debug mode, compile(ast) fails with an assertion error if an AST node has no line number information
2014-05-02 22:49:15 terry.reedy set nosy: + terry.reedy
messages: + msg217806
versions: - Python 3.1, Python 3.2, Python 3.3
2014-04-29 09:00:10 ztane create