classification
Title: Underscores in numeric literals not supported in lib2to3.
Type: behavior Stage: resolved
Components: 2to3 (2.x to 3.x conversion tool) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Mariatta, georg.brandl, nevsan, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-03-22 00:12 by nevsan, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 752 merged nevsan, 2017-03-22 00:13
PR 1109 merged Mariatta, 2017-04-13 10:25
PR 1119 merged nevsan, 2017-04-13 17:19
PR 1122 merged Mariatta, 2017-04-13 23:17
PR 1125 merged Mariatta, 2017-04-14 01:25
Messages (14)
msg289951 - (view) Author: Nevada Sanchez (nevsan) * Date: 2017-03-22 00:12
The following should work in Python 3.6, but lib2to3 does not accept the underscore in the numeric literal:

```
from lib2to3.pgen2 import driver
from lib2to3 import pytree
from lib2to3 import pygram

_GRAMMAR_FOR_PY3 = pygram.python_grammar_no_print_statement.copy()
parser_driver = driver.Driver(_GRAMMAR_FOR_PY3, convert=pytree.convert)
tree = parser_driver.parse_string('100_1\n', debug=False)
```
msg289982 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-22 11:56
Python uses stricter rules for underscores in numeric literals. Underscores are accepted only between digits and between the base prefix and the first digit. For example, the regular expression for hexadecimal literals is r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?'.

Underscores are also accepted in the exponent of float literals:

Exponent = r'[eE][-+]?\d+(?:_\d+)*'
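
To make the rule concrete, here is a minimal sketch that applies the two patterns quoted in this message to a few sample strings. The names `HEX_NUMBER`, `EXPONENT`, and `accepts` are illustrative only and are not taken from lib2to3.

```python
import re

# Patterns copied from the message above; the variable names are illustrative.
HEX_NUMBER = re.compile(r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?')
EXPONENT = re.compile(r'[eE][-+]?\d+(?:_\d+)*')


def accepts(pattern, text):
    """Return True if the whole of `text` matches `pattern`."""
    return pattern.fullmatch(text) is not None


# A single underscore may follow the prefix or sit between digits;
# doubled or trailing underscores are rejected.
for text in ['0xff', '0x_ff', '0xdead_beef', '0x__ff', '0xff_', '0x_']:
    print(f'{text!r:>14} hex: {accepts(HEX_NUMBER, text)}')

# The exponent follows the same "between digits only" rule.
for text in ['e10', 'E-1_000', 'e_1', 'e1__0', 'e1_']:
    print(f'{text!r:>14} exponent: {accepts(EXPONENT, text)}')
```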
msg289988 - (view) Author: Nevada Sanchez (nevsan) * Date: 2017-03-22 14:56
The existing regular expressions weren't actually strict enough as is, so I made them even more correct. In particular, we must have at least one digit following `0[xXbBoO]` and must be before any underscores.

I have a small set of test cases to examine correctness of these regular expressions: https://gist.github.com/nevsan/7fc78dc61d309842406d67d6839b9861
msg290001 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2017-03-22 17:15
> In particular, we must have at least one digit following `0[xXbBoO]` and must be before any underscores.

This is not true (but your test file does the right thing).
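
One way to check which forms Python itself accepts is to ask the interpreter's own parser. The following is a rough sketch, assuming Python 3.6 or later; `is_numeric_literal` is a hypothetical helper, and because `ast.literal_eval` also accepts signed and complex expressions it is only an approximation of "is a numeric literal".

```python
import ast


def is_numeric_literal(text):
    """Return True if Python parses `text` as a numeric value.

    Approximation: ast.literal_eval also accepts forms such as '-1',
    which are expressions rather than bare literals.
    """
    try:
        value = ast.literal_eval(text)
    except (SyntaxError, ValueError):
        return False
    # bool is a subclass of int, so exclude True/False explicitly.
    return isinstance(value, (int, float, complex)) and not isinstance(value, bool)


samples = ['100_1', '0x_1f', '0b_1', '0o_7', '1e1_0',
           '0x_', '0_x1f', '0x1__f', '1_e10']
for text in samples:
    print(f'{text!r:>10}: {is_numeric_literal(text)}')
```

An underscore directly after the base prefix (as in `0x_1f`) is accepted, which is why the statement quoted above does not hold.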
msg290007 - (view) Author: Nevada Sanchez (nevsan) * Date: 2017-03-22 18:41
Thanks, it seems I misspoke. Glad I tested it!
msg290008 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-22 18:57
I suggest using my regular expression for hexadecimals. Your regular expression suffers from catastrophic backtracking. Compare two examples:

re.match(r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?'+r'\b', '0x'+'0'*100+'z')
re.match(r'0[xX](?:_?[\da-fA-F]+)+[lL]?'+r'\b', '0x'+'0'*100+'z')
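
The difference can be observed with a small timing sketch. It uses much shorter inputs than the 100 zeros above, because the nested form is expected to take time that grows roughly exponentially with the number of digits before it reports a failed match. The names `SAFE`, `NESTED`, and `timed_match` are illustrative, and actual timings depend on the machine and Python version.

```python
import re
import time

# The two patterns from the examples above, each with the trailing \b.
SAFE = re.compile(r'0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?\b')
NESTED = re.compile(r'0[xX](?:_?[\da-fA-F]+)+[lL]?\b')


def timed_match(pattern, text):
    """Return (match result, elapsed seconds) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# The trailing 'z' makes \b fail, forcing the engine to backtrack.
# Sizes are kept small on purpose so the nested pattern still finishes.
for n in (10, 14, 18):
    text = '0x' + '0' * n + 'z'
    _, safe_seconds = timed_match(SAFE, text)
    _, nested_seconds = timed_match(NESTED, text)
    print(f'n={n:2d}  safe={safe_seconds:.6f}s  nested={nested_seconds:.6f}s')
```

In the nested form, a run of digits can be split between repetitions of the group in many ways, and each split is retried when the final `\b` fails. In the other form, a new repetition can only start at an underscore, so there is only one way to divide the digits.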
msg290023 - (view) Author: Nevada Sanchez (nevsan) * Date: 2017-03-23 04:17
Good point. Updated.
msg291598 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-13 10:21
New changeset 97a40b7a5b2979fb17e1751c139fd4ba1ebd5276 by Mariatta (Nevada Sanchez) in branch '3.6':
bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-752)
https://github.com/python/cpython/commit/97a40b7a5b2979fb17e1751c139fd4ba1ebd5276
msg291600 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-13 11:03
New changeset 84c2d75489a84174d8993aea292828662e35a50f by Mariatta in branch '3.6':
Revert "bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-752)" (GH-1109)
https://github.com/python/cpython/commit/84c2d75489a84174d8993aea292828662e35a50f
msg291618 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-13 15:31
Nevada Sanchez, please create your PR against the master branch.
Once that is merged, we will backport it to the 3.6 branch.

Thanks :)
msg291623 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-13 17:32
New changeset a6e395dffadf8c5124903c01ad69fefa36b1a935 by Mariatta (Nevada Sanchez) in branch 'master':
bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-1119)
https://github.com/python/cpython/commit/a6e395dffadf8c5124903c01ad69fefa36b1a935
msg291634 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-13 23:54
New changeset 2cdf087d1fd48f7d0f95b5a0b31b9a624fa84751 by Mariatta in branch '3.6':
[3.6] bpo-29869: Allow underscores in numeric literals in lib2to3. (GH-1119) (GH-1122)
https://github.com/python/cpython/commit/2cdf087d1fd48f7d0f95b5a0b31b9a624fa84751
msg291635 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-14 01:30
New changeset 947629916a5ecb1f6f6792e9b9234e084c5bf274 by Mariatta in branch 'master':
bpo-29869: Add Nevada Sanchez to Misc/ACKS (GH-1125)
https://github.com/python/cpython/commit/947629916a5ecb1f6f6792e9b9234e084c5bf274
msg291636 - (view) Author: Mariatta (Mariatta) * (Python committer) Date: 2017-04-14 01:31
PR has been merged and backported to 3.6.
I also added Nevada Sanchez to Misc/ACKS.
Thanks all :)
History
Date User Action Args
2022-04-11 14:58:44 admin set github: 74055
2017-04-14 01:31:24 Mariatta set status: open -> closed; stage: patch review -> resolved
2017-04-14 01:31:11 Mariatta set resolution: fixed; messages: + msg291636
2017-04-14 01:30:44 Mariatta set messages: + msg291635
2017-04-14 01:25:41 Mariatta set pull_requests: + pull_request1260
2017-04-13 23:54:51 Mariatta set messages: + msg291634
2017-04-13 23:17:39 Mariatta set pull_requests: + pull_request1259
2017-04-13 17:32:56 Mariatta set messages: + msg291623
2017-04-13 17:19:11 nevsan set pull_requests: + pull_request1257
2017-04-13 15:31:17 Mariatta set messages: + msg291618
2017-04-13 11:03:18 Mariatta set messages: + msg291600
2017-04-13 10:25:02 Mariatta set pull_requests: + pull_request1250
2017-04-13 10:21:07 Mariatta set nosy: + Mariatta; messages: + msg291598
2017-03-23 04:17:38 nevsan set messages: + msg290023
2017-03-22 18:57:03 serhiy.storchaka set messages: + msg290008
2017-03-22 18:41:49 nevsan set messages: + msg290007
2017-03-22 17:15:17 georg.brandl set messages: + msg290001
2017-03-22 14:56:58 nevsan set messages: + msg289988
2017-03-22 11:56:56 serhiy.storchaka set nosy: + georg.brandl, serhiy.storchaka; messages: + msg289982
2017-03-22 02:36:42 Mariatta set stage: patch review; versions: + Python 3.7
2017-03-22 00:13:36 nevsan set pull_requests: + pull_request668
2017-03-22 00:12:25 nevsan create