Underscores in numeric literals not supported in lib2to3. #74055

Closed
nevsan mannequin opened this issue Mar 22, 2017 · 14 comments
Labels
3.7 (EOL) end of life topic-2to3 type-bug An unexpected behavior, bug, or error

Comments

@nevsan
Mannequin

nevsan mannequin commented Mar 22, 2017

BPO 29869
Nosy @birkenfeld, @serhiy-storchaka, @Mariatta, @nevsan
PRs
  • bpo-29869: Allow underscores in numeric literals in lib2to3. #752
  • Revert bpo-29869: Allow underscores in numeric literals in lib2to3. #1109
  • bpo-29869: Allow underscores in numeric literals in lib2to3. #1119
  • [3.6] bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-1119) #1122
  • bpo-29869: Add Nevada Sanchez to Misc/ACKS #1125
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2017-04-14.01:31:24.462>
    created_at = <Date 2017-03-22.00:12:25.319>
    labels = ['3.7', 'type-bug', 'expert-2to3']
    title = 'Underscores in numeric literals not supported in lib2to3.'
    updated_at = <Date 2017-04-14.01:31:24.461>
    user = 'https://github.com/nevsan'

    bugs.python.org fields:

    activity = <Date 2017-04-14.01:31:24.461>
    actor = 'Mariatta'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-04-14.01:31:24.462>
    closer = 'Mariatta'
    components = ['2to3 (2.x to 3.x conversion tool)']
    creation = <Date 2017-03-22.00:12:25.319>
    creator = 'nevsan'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29869
    keywords = []
    message_count = 14.0
    messages = ['289951', '289982', '289988', '290001', '290007', '290008', '290023', '291598', '291600', '291618', '291623', '291634', '291635', '291636']
    nosy_count = 4.0
    nosy_names = ['georg.brandl', 'serhiy.storchaka', 'Mariatta', 'nevsan']
    pr_nums = ['752', '1109', '1119', '1122', '1125']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue29869'
    versions = ['Python 3.6', 'Python 3.7']

    @nevsan
    Mannequin Author

    nevsan mannequin commented Mar 22, 2017

    The following should work in Python 3.6:

    from lib2to3.pgen2 import driver
    from lib2to3 import pytree
    from lib2to3 import pygram
    
    _GRAMMAR_FOR_PY3 = pygram.python_grammar_no_print_statement.copy()
    parser_driver = driver.Driver(_GRAMMAR_FOR_PY3, convert=pytree.convert)
    tree = parser_driver.parse_string('100_1\n', debug=False)
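
    For comparison, here is a minimal sketch that uses the standard library's `tokenize` module instead of lib2to3. It is not part of the original report, and the helper name `number_tokens` is hypothetical. On Python 3.6 or later, `100_1` should come back as a single NUMBER token, which is the behaviour the report expects from lib2to3 as well.

```python
import io
import tokenize


def number_tokens(source):
    """Return the text of every NUMBER token found in `source`."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NUMBER
    ]


# The literal from the report, tokenized by the interpreter's own tokenizer.
tokens = number_tokens('100_1\n')
print(tokens)

# int() accepts the same underscore grouping on Python 3.6+.
value = int('100_1')
print(value)
```

    If the tokenizer splits the literal into several tokens instead, the interpreter in use predates underscore support in numeric literals.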
    

    @nevsan nevsan mannequin added topic-2to3 type-bug An unexpected behavior, bug, or error labels Mar 22, 2017
    @Mariatta Mariatta added the 3.7 (EOL) end of life label Mar 22, 2017
    @serhiy-storchaka
    Member

    Python uses stricter rules for underscores in numeric literals. Underscores are acceptable between digits and between the base prefix and the first digit. For example, the regular expression for hexadecimals is r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?'.

    Underscores are also acceptable in the exponent of float literals:

    Exponent = r'[eE][-+]?\d+(?:_\d+)*'
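
    The rules described above can be checked with a short sketch. It assumes the hexadecimal pattern is meant to carry explicit underscores (`_?` after the prefix and `_` before each later digit group), and the helper names `is_hex_literal` and `is_exponent` are hypothetical, introduced only for illustration.

```python
import re

# Patterns as described in the comment above. The trailing [lL]? is the
# Python 2 long suffix, which lib2to3 still has to tokenize.
HEXNUMBER = r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?'
EXPONENT = r'[eE][-+]?\d+(?:_\d+)*'


def is_hex_literal(text):
    """True if the whole string is a hexadecimal literal under HEXNUMBER."""
    return re.fullmatch(HEXNUMBER, text) is not None


def is_exponent(text):
    """True if the whole string is a float exponent under EXPONENT."""
    return re.fullmatch(EXPONENT, text) is not None


# One underscore after the prefix or between digit groups is allowed.
accepted = ['0xFF', '0x_FF', '0xdead_beef', '0XFFl']
# No digits, doubled underscores, and trailing underscores are not.
rejected = ['0x', '0x__FF', '0xFF_', '0x_']

for literal in accepted + rejected:
    print(f'{literal!r}: {is_hex_literal(literal)}')
```

    Each underscore must be followed by at least one digit, which is why a doubled or trailing underscore fails to match.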

    @nevsan
    Mannequin Author

    nevsan mannequin commented Mar 22, 2017

    The existing regular expressions weren't actually strict enough as is, so I made them even more correct. In particular, we must have at least one digit following 0[xXbBoO] and must be before any underscores.

    I have a small set of test cases to examine correctness of these regular expressions: https://gist.github.com/nevsan/7fc78dc61d309842406d67d6839b9861

    @birkenfeld
    Member

    In particular, we must have at least one digit following 0[xXbBoO] and must be before any underscores.

    This is not true (but your test file does the right thing).

    @nevsan
    Mannequin Author

    nevsan mannequin commented Mar 22, 2017

    Thanks, it seems I misspoke. Glad I tested it!

    @serhiy-storchaka
    Member

    I suggest using my regular expression for hexadecimals. Your regular expression suffers from catastrophic backtracking. Compare two examples:

    re.match(r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?'+r'\b', '0x'+'0'*100+'z')
    re.match(r'0[xX](?:_?[\da-fA-F]+)+[lL]?'+r'\b', '0x'+'0'*100+'z')
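
    The difference can be measured with a rough timing sketch. It assumes the slower pattern has a nested-quantifier shape (a repeated group whose body is itself repeated); the names `LINEAR`, `NESTED`, and `time_match` are hypothetical. The input is shortened to 18 zeros because the slow case roughly doubles with every extra digit, so 100 zeros would not finish in practice.

```python
import re
import time

# Pattern suggested in the comment above: each digit can be consumed one way only.
LINEAR = re.compile(r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?' + r'\b')
# Assumed shape of the slower pattern: the same run of digits can be split
# between the inner '+' and the outer '+' in exponentially many ways.
NESTED = re.compile(r'0[xX](?:_?[\da-fA-F]+)+[lL]?' + r'\b')


def time_match(pattern, text):
    """Return (match result, elapsed seconds) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# The trailing 'z' makes the final \b fail, forcing the engine to backtrack.
subject = '0x' + '0' * 18 + 'z'

linear_result, linear_seconds = time_match(LINEAR, subject)
nested_result, nested_seconds = time_match(NESTED, subject)

print(f'linear: {linear_seconds:.6f}s  nested: {nested_seconds:.6f}s')
```

    Both calls return `None`; only the time taken to reach that answer differs. Exact timings depend on the machine and interpreter.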

    @nevsan
    Mannequin Author

    nevsan mannequin commented Mar 23, 2017

    Good point. Updated.

    @Mariatta
    Member

    New changeset 97a40b7 by Mariatta (Nevada Sanchez) in branch '3.6':
    bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-752)
    97a40b7

    @Mariatta
    Member

    New changeset 84c2d75 by Mariatta in branch '3.6':
    Revert "bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-752)" (GH-1109)
    84c2d75

    @Mariatta
    Member

    Nevada Sanchez, please create your PR against the master branch.
    Once that is merged, we will backport it to the 3.6 branch.

    Thanks :)

    @Mariatta
    Member

    New changeset a6e395d by Mariatta (Nevada Sanchez) in branch 'master':
    bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-1119)
    a6e395d

    @Mariatta
    Member

    New changeset 2cdf087 by Mariatta in branch '3.6':
    [3.6] bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-1119) (GH-1122)
    2cdf087

    @Mariatta
    Member

    New changeset 9476299 by Mariatta in branch 'master':
    bpo-29869: Add Nevada Sanchez to Misc/ACKS (GH-1125)
    9476299

    @Mariatta
    Member

    PR has been merged and backported to 3.6.
    I also added Nevada Sanchez to Misc/ACKS.
    Thanks all :)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022