Patch: some changes to AST to make it more useful for static language analysis #60999

Closed
scummos mannequin opened this issue Dec 27, 2012 · 44 comments
@scummos
Mannequin

scummos mannequin commented Dec 27, 2012

BPO 16795
Nosy @brettcannon, @benjaminp, @ezio-melotti, @merwok, @meadori, @ericsnowcurrently
Files
  • python.diff
  • python2.diff
  • readable.diff
  • full.diff
  • full2.diff
  • full3.diff
  • 81299-extend-asdl.diff
  • 81300-change-var-kwargs.diff
  • 81301-change-attr-ranges.diff
  • 81300-change-var-kwargs-new.diff
  • 81302-adjust-unparser.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/benjaminp'
    closed_at = <Date 2013-03-18.17:50:17.629>
    created_at = <Date 2012-12-27.21:44:43.601>
    labels = []
    title = 'Patch: some changes to AST to make it more useful for static language analysis'
    updated_at = <Date 2015-02-02.15:53:56.247>
    user = 'https://bugs.python.org/scummos'

    bugs.python.org fields:

    activity = <Date 2015-02-02.15:53:56.247>
    actor = 'benjamin.peterson'
    assignee = 'benjamin.peterson'
    closed = True
    closed_date = <Date 2013-03-18.17:50:17.629>
    closer = 'python-dev'
    components = ['None']
    creation = <Date 2012-12-27.21:44:43.601>
    creator = 'scummos'
    dependencies = []
    files = ['28462', '28596', '28679', '28680', '28681', '28786', '28787', '28788', '28789', '28905', '29445']
    hgrepos = []
    issue_num = 16795
    keywords = ['patch']
    message_count = 44.0
    messages = ['178339', '179094', '179150', '179151', '179154', '179220', '179600', '179601', '179604', '179605', '179606', '179607', '179608', '179621', '179737', '179748', '180116', '180163', '180261', '180262', '180263', '180266', '180267', '180268', '180537', '180542', '180611', '180612', '180953', '181609', '181612', '181613', '181614', '181615', '181616', '182127', '182128', '184472', '184475', '184480', '184481', '184485', '235267', '235268']
    nosy_count = 8.0
    nosy_names = ['brett.cannon', 'benjamin.peterson', 'ezio.melotti', 'eric.araujo', 'meador.inge', 'scummos', 'python-dev', 'eric.snow']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue16795'
    versions = ['Python 3.4']


    Here's a patch doing some adjustments to the AST to make it more useful for static language analysis, as discussed in http://mail.python.org/pipermail/python-dev/2012-December/123320.html.

    Changes done:

    • the described fix to attribute ranges
    • add location information for var / kwargs and arguments

    Interestingly, this even fixes a bug; compare the locations of the error in the following situation:

    >>> l = [1, 2, 3]
    >>> l[
    ... 
    ... 2
    ... 
    ... ].Foo
    
    Old error message:
    Traceback (most recent call last):
      File "<stdin>", line 3, in <module>
    AttributeError: 'int' object has no attribute 'Foo'
    
    New error message:
    Traceback (most recent call last):
      File "<stdin>", line 5, in <module>
    AttributeError: 'int' object has no attribute 'Foo'

    The new message is obviously more accurate (one could even go as far as saying that the first one does not make any sense at all -- what does the expression in the slice have to do with the error?).
    The same thing happens in similar situations, e.g. with line continuation characters, function calls, ... anything multi-line with an error related to attribute access.
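The node positions behind these tracebacks can be inspected from Python. The following is a minimal sketch, assuming only the standard `ast` module. It parses the multi-line expression from above and prints where the `Attribute` node is reported to start. No expected numbers are given, because the reported position depends on the interpreter version.

```python
import ast

# The expression from the report: a subscript spread over several lines,
# followed by an attribute access on the last line.
source = "l = [1, 2, 3]\nl[\n\n2\n\n].Foo\n"

tree = ast.parse(source)
attr = next(n for n in ast.walk(tree) if isinstance(n, ast.Attribute))

# Where the node is reported to start depends on the interpreter version.
print(attr.attr, attr.lineno, attr.col_offset)
```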

    I hope the patch is okay; if not, please let me know what to change. I also hope I managed to include all important changes in the patch ;)

    @benjaminp
    Contributor

    It would be good if
    a) the patch was against hg default
    b) it had tests

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 5, 2013

    While writing tests, I noticed that the additional fields (lineno, col_offset for vararg, kwarg, and other arguments) currently are mandatory. Is that a problem?
    It doesn't seem trivial to change that, since apparently only attributes (not fields) can be optional, but those are not allowed by the syntax of python.asdl at this point.
    In case the fields need to be optional, what would be the correct approach to achieve that?

    Thanks.

    @benjaminp
    Contributor

    A question mark after the type name in the ASDL definition makes the field optional.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 5, 2013

    Thanks. I had seen and tried this before, but the "ast" module in Python, which is used in the tests, still requires the additional arguments. Perhaps this only applies to the C API?

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 6, 2013

    Here's a new proposal: I adjusted the AST tests and fixed some small problems I encountered along the way. It contains all the diffs for generated files; should I remove those for easier review?
    A remaining problem is that AST_Tests::_assertTrueorder now fails, I think because the condition it checks simply is not met any more (by design of the change). What's the correct way to deal with that?

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 10, 2013

    Here's another version, which I think could be applied as it is. All tests have been adjusted. I'll attach two patches: one containing just the changes to the parser for ease of review, and one full diff which also contains the changes to the generated files and the test adjustments.

    Please point out any remaining problems you see with this so I can fix them.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 10, 2013

    Attached is the full diff this time.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 10, 2013

    The patch review tool currently throws errors on submitting any form (http://pastie.org/pastes/5665048/text) so please forgive me for answering here once more. I'll copy this information (patch + message) to the review as soon as the website is working again.

    In ast.c, use the LINENO macro for n_lineno.
    Done.

    http://bugs.python.org/review/16795/diff/7080/Lib/test/test_ast.py#newcode183
    Lib/test/test_ast.py:183: def _assertTrueorder(self, ast_node, parent_pos, reverse_check = False):
    Wrap everything here by 80 chars.
    Done.

    http://bugs.python.org/review/16795/diff/7080/Lib/test/test_ast.py#newcode198
    Lib/test/test_ast.py:198: self.assertTrue(node_pos <= parent_pos if reverse_check else node_pos >= parent_pos)
    Lift the condition out of the assert call.
    Done.
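For readers following the review, here is one way the condition could be lifted out of the assert call. This is only a sketch: the helper name `in_expected_order` is hypothetical and is not taken from the patch, and positions are assumed to be `(lineno, col_offset)` tuples.

```python
def in_expected_order(node_pos, parent_pos, reverse_check=False):
    """Return True if a node's (lineno, col_offset) is ordered as expected.

    Hypothetical helper: with reverse_check the node may not start after
    its parent, otherwise it may not start before its parent.
    """
    if reverse_check:
        return node_pos <= parent_pos
    return node_pos >= parent_pos
```

Inside the test method, the assertion would then read `self.assertTrue(in_expected_order(node_pos, parent_pos, reverse_check))`.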

    http://bugs.python.org/review/16795/diff/7080/Lib/test/test_ast.py#newcode467
    Lib/test/test_ast.py:467: self.maxDiff = None
    A comment explaining what this is for would be nice.
    Sorry, this was for testing purposes only, and I forgot to remove it.

    http://bugs.python.org/review/16795/diff/7080/Lib/test/test_ast.py#newcode589
    Lib/test/test_ast.py:589: 0, 0, 0, 0)
    These extra parameters are optional now, right? They needn't be passed then.
    Unfortunately not: although the question mark is present in the asdl file and I made fairly sure to regenerate all the derived files, the parameters are still mandatory.

    URL of the patch review: http://bugs.python.org/review/16795/

    @benjaminp
    Contributor

    Could you post an example of the error, please?

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 11, 2013

    The above post has an example for trying to add a patch, here's what happens when I try to post a reply: http://pastie.org/pastes/5665144/text
    I also tried with another web browser, so it's unlikely that it's the browser's fault (but maybe the user's? ;)

    @benjaminp
    Contributor

    Ah, sorry, I was talking about the failure of optional arguments.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 11, 2013

    Ah, whoops, I misunderstood that.

    The error is rather generic:
    Traceback (most recent call last):
      File "./Lib/test/test_ast.py", line 796, in test_lambda
        self._check_arguments(fac, self.expr)
      File "./Lib/test/test_ast.py", line 596, in _check_arguments
        check(arguments(args=args), "must have Load context")
      File "./Lib/test/test_ast.py", line 593, in arguments
        kwarg, kwargannotation, defaults, kw_defaults)
    TypeError: arguments constructor takes either 0 or 12 positional arguments

    It's very generic in C too:
    Python/ast.c:1571:42: error: macro "arguments" requires 13 arguments, but only 9 given
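The positional-count error above comes from passing fields by position. Passing them by keyword sidesteps the count check. The sketch below is not the test code from the patch. It assumes a recent Python 3 (3.8 or later), where `arguments` also has a `posonlyargs` field, and builds `lambda x: x` by hand.

```python
import ast

# Build the argument list by keyword; optional fields are given as None.
args = ast.arguments(
    posonlyargs=[],
    args=[ast.arg(arg="x", annotation=None)],
    vararg=None,
    kwonlyargs=[],
    kw_defaults=[],
    kwarg=None,
    defaults=[],
)
body = ast.Name(id="x", ctx=ast.Load())
expr = ast.Expression(body=ast.Lambda(args=args, body=body))

# Fill in lineno/col_offset on every node that needs them.
ast.fix_missing_locations(expr)

f = eval(compile(expr, "<ast>", "eval"))
print(f(3))
```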

    @benjaminp
    Contributor

    Ah, yes. This is part of the annoying inconsistency in our asdl framework. Here's what I think should happen:

    • on arguments, vararg and kwarg should get the "arg" type, killing some of the numerous fields on arguments
    • asdl needs to be hacked, so "arg" can have lineno and col_offset attributes like the "expr" type.
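The distinction between fields and attributes that this plan relies on is visible from Python through the `_fields` and `_attributes` tuples on each node class. A minimal sketch, assuming an interpreter that already contains the change (Python 3.4 or later); the exact tuple contents vary by version, so none are shown.

```python
import ast

# Fields are the node's children; attributes carry source positions.
print(ast.arguments._fields)
print(ast.arg._fields)
print(ast.arg._attributes)
```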

    Sorry this is getting so painful and involved.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 11, 2013

    Not an issue, having this thing resolved upstream would save a huge lot of pain elsewhere. ;)

    So, to make sure... I'll go to the asdl file, make arguments have two arg attributes which store the data for the var and kwarg which they can contain, then I adjust ast.c to reflect that new structure. Then I go to asdl.py and hack it so we can have attributes ( ... ) on arguments. Does that sound correct?

    @benjaminp
    Contributor

    Yes. Feel free to do that in separate patches as needed.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 16, 2013

    I think I got it mostly working now (it was quite simple in fact), but there's one issue which I can't seem to solve. This fails:

    >>> compile(ast.parse("def fun(): pass"), "<file>", "exec")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: required field "arg" missing from arg
    
    However, this succeeds:
    >>> compile(ast.parse("def fun(*va, **kwa): pass"), "<file>", "exec")
    <code object <module> at 0x7fb390323780, file "<file>", line 1>

    The reason is quite simple: vararg and kwarg are optional in arguments, but they're of type arg, and arg has a mandatory field ("arg", the name of the argument). Still, when calling ast.parse(), Python creates attributes called vararg and kwarg on the "arguments" object, which are set to None:

    >>> ast.parse('def fun(): pass').body[0].args.vararg.__repr__()
    'None'

    Thus, when in compile(), the code in Python-ast.c does "if ( _PyObject_HasAttrId(obj, &PyId_vararg) ) { ... }", this check says "yes, there's a vararg" although there really is none, which leads to the above error message.

    I checked the asdl file, and in fact I think this is a general issue, which was not noticed so far, since only things without mandatory attributes are used in conjunction with the question mark "?" operator there (expr and the integral types identifier, int...). Is this correct?

    An easy way to solve this problem would be to check whether the attribute is None in Python-ast.c, but I'm anything but sure this is the correct way to fix it. Alternatively, one could skip creating the attributes on the ast objects when they're not present in the parsed code (i.e. leave the "vararg" attribute nonexistent instead of setting it to None). What should I do about this?

    @benjaminp
    Contributor

    I think "None" should be treated as meaning not present for an optional argument.
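The difference between the two checks can be sketched in Python. The helper `is_present` is hypothetical and only mirrors the idea. It is not the C code in Python-ast.c. The sketch assumes an interpreter that includes the fix, so the `compile()` call succeeds.

```python
import ast


def is_present(node, name):
    """Hypothetical helper: treat a missing attribute and None alike."""
    return getattr(node, name, None) is not None


tree = ast.parse("def fun(): pass")
args = tree.body[0].args

# The attribute exists on the node, but it holds None.
print(hasattr(args, "vararg"), is_present(args, "vararg"))

# With None treated as "not present", compiling the tree works.
code = compile(tree, "<file>", "exec")
```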

    By the way, it would be good if we could get you to sign a contributor agreement. http://www.python.org/psf/contrib/

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 19, 2013

    Here's the next version which I hope to be somewhat complete now.

    vararg and kwarg are now of type arg, and I did all the changes which are required to make this possible. The ast tests pass.
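A quick way to look at the resulting structure, assuming an interpreter that contains this change (Python 3.4 or later): both `vararg` and `kwarg` are `arg` nodes that carry a name and their own source position.

```python
import ast

func = ast.parse("def fun(*va, **kwa): pass").body[0]

# Each of the two is an ast.arg node with a name and a position.
for node in (func.args.vararg, func.args.kwarg):
    print(type(node).__name__, node.arg, node.lineno, node.col_offset)
```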

    Do you prefer to have this as one large patch all together, or would you rather like to review (and apply) 3..4 patches split into the individual features I implemented?

    @benjaminp
    Contributor

    Individual patches would be great.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 19, 2013

    Alright, I'll be back with those shortly (as soon as I find out how best to do this with hg -- I'm used to git ;). I'll also sign the contributor agreement, that's no problem of course.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 19, 2013

    Okay, here they are. I'm not sure how to make hg include a commit message in the patch...

    81299-extend-asdl.diff: Changes required to the ASDL framework, in order to allow attributes ( ... ) on a product
    81300-change-var-kwargs.diff: Makes var/kwarg be instances of arg, and adds the lineno / col_offset attributes to arg.
    81301-change-attr-ranges.diff: Changes power ranges as described in the first post of this report.

    All three patches include the corresponding changes to the unit tests, and hopefully the correct set of changes to the generated (Python-ast.h/.c) files.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 19, 2013

    second patch file

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 19, 2013

    third patch file

    (... is there a better way to upload three files?)

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 24, 2013

    I have signed the contributor agreement and sent a scan to the specified mail address (received no reply so far, but I guess that's okay).

    Did anyone happen to find the time to look at the patches yet?

    Greetings,
    Sven

    @benjaminp
    Contributor

    Thanks for signing the agreement. I'll try to look at the patches by the end of this weekend. Sorry for the delay.

    @merwok
    Member

    merwok commented Jan 25, 2013

    I'm not sure how to make hg include a commit message in the patch...
    See hg help export.

    (In Mercurial, the only objects are changesets; hg log trawls through commit messages (with options to see short text, full text, diff), hg diff only shows diff, and hg export is the command to output full changesets.)

    @ezio-melotti
    Member

    It's not necessary to include the commit message and/or use "hg export" though, since we don't import patches directly and we write the message ourselves when we commit.

    (... is there a better way to upload three files?)

    You need to upload them individually and "submit changes" 3 times, but it's not necessary to write a comment every time. Also, unless the 3 patches are independent, it's usually better to upload a single diff that includes all the necessary changes.

    @scummos
    Mannequin Author

    scummos mannequin commented Jan 29, 2013

    Hm, I'm still getting the same error messages from the review tool which I described earlier; I can neither comment nor add patches.

    So, I'll have to abuse the bug report again:
    Thanks for the review. Is it possible you selected the wrong patch file for the second patch (Patch Set 5)? It seems to include all changes instead of just those from the second of the three patches I submitted. Also I had already removed the traceback.print_stack() call.

    I fixed the other two issues and I will attach a corrected version of the second patch for review. I hope I got everything right ;)

    Please take an extra close look at the changes to symtable.c and compile.c (since I'm not very familiar with that code), so that we don't break anything with this.

    Cheers,
    Sven

    @scummos
    Mannequin Author

    scummos mannequin commented Feb 7, 2013

    Any news on this yet? ;)

    Unfortunately, I'm still having no luck in adding the patch to the review tool (same error message).

    @benjaminp
    Contributor

    Are you attaching files directly on Rietveld or on this issue?

    @scummos
    Mannequin Author

    scummos mannequin commented Feb 7, 2013

    Attaching files to this bug report here works fine (see corrected patch above), but when I add the file to http://bugs.python.org/review/16795/ under "Add another patchset", I get the error message I described. I tried with firefox, chromium and konqueror.

    @benjaminp
    Contributor

    Yeah, I think that's broken. It's best just to attach them here.

    @ezio-melotti
    Member

    It's just not supported -- the "Add another patchset" link should be removed from rietveld.

    @scummos
    Mannequin Author

    scummos mannequin commented Feb 7, 2013

    Oh, alright, that explains things. In this case, the file I attached on Jan 29 (http://bugs.python.org/file28905/81300-change-var-kwargs-new.diff) should contain all the requested changes.

    Greetings

    @benjaminp benjaminp self-assigned this Feb 8, 2013
    @scummos
    Mannequin Author

    scummos mannequin commented Feb 15, 2013

    I don't want to push anything, but did you find time to review this yet? It would be great to have it in the next release.

    @brettcannon
    Member

    Python 3.4.0a1 isn't due until August so you have no worries about missing the next release. =)

    @benjaminp
    Contributor

    Hi Sven, I was about to apply this (sorry for the delay), and I realized there's one more thing. We have an example AST unparser in Tools/parser that needs to be updated for AST changes. You can run its test suite by running test_tools in the main test suite.
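A round-trip check of the kind an unparser test performs can be sketched as follows. This does not use the Tools/parser script itself. It assumes Python 3.9 or later, where the standard library offers `ast.unparse`, and compares the trees before and after regenerating the source.

```python
import ast

source = "def fun(a, *va, b=1, **kwa):\n    return a\n"

# Turn the tree back into source text.
regenerated = ast.unparse(ast.parse(source))
print(regenerated)

# Round-trip check: the regenerated source parses to the same tree.
same_tree = ast.dump(ast.parse(regenerated)) == ast.dump(ast.parse(source))
print(same_tree)
```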

    @scummos
    Mannequin Author

    scummos mannequin commented Mar 18, 2013

    Hi Benjamin,

    the delay is not a problem -- as long as this is in time for Python 3.4, everything is fine.

    Attached is a patch which adjusts the unparser to the changes. According to the tests, this is all that needs to be updated.

    Cheers,
    Sven

    @python-dev
    Mannequin

    python-dev mannequin commented Mar 18, 2013

    New changeset 7c5c678e4164 by Benjamin Peterson in branch 'default':
    unify some ast.argument's attrs; change Attribute column offset (closes bpo-16795)
    http://hg.python.org/cpython/rev/7c5c678e4164

    @python-dev python-dev mannequin closed this as completed Mar 18, 2013
    @python-dev
    Mannequin

    python-dev mannequin commented Mar 18, 2013

    New changeset 219c997b880b by Benjamin Peterson in branch 'default':
    add Sven Brauch for his bpo-16795 contribution
    http://hg.python.org/cpython/rev/219c997b880b

    @scummos
    Mannequin Author

    scummos mannequin commented Mar 18, 2013

    Thanks for reviewing this, and thanks for guiding me through the implementation process. ;)

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 2, 2015

    New changeset 7d1c32ddc432 by Benjamin Peterson in branch '3.4':
    revert lineno and col_offset changes from bpo-16795 (closes bpo-21295)
    https://hg.python.org/cpython/rev/7d1c32ddc432

    @benjaminp
    Contributor

    People pointed out in bpo-21295 that this made some things that were possible before impossible, so the lineno and col_offset changes of this have been reverted.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022