classification
Title: Display the bytecode when compiling a regular expression in debug mode
Type: enhancement Stage: resolved
Components: Library (Lib), Regular Expressions Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Jonathan Goble, ezio.melotti, mrabarnett, serhiy.storchaka, terry.reedy
Priority: normal Keywords:

Created on 2017-05-07 15:27 by serhiy.storchaka, last changed 2017-05-14 06:54 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL      Status  Linked
PR 1491  merged  serhiy.storchaka, 2017-05-07 15:32
Messages (5)
msg293198 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-07 15:27
The proposed patch makes compiling a regular expression in debug mode (with the re.DEBUG flag) display the bytecode in human-readable form, in addition to the syntax tree. For example:

>>> re.compile('test_[a-z_]+', re.DEBUG)
LITERAL 116
LITERAL 101
LITERAL 115
LITERAL 116
LITERAL 95
MAX_REPEAT 1 MAXREPEAT
  IN
    RANGE (97, 122)
    LITERAL 95

 0. INFO 16 0b1 6 MAXREPEAT (to 17)
      prefix_skip 5
      prefix [0x74, 0x65, 0x73, 0x74, 0x5f] ('test_')
      overlap [0, 0, 0, 1, 0]
17: LITERAL 0x74 ('t')
19. LITERAL 0x65 ('e')
21. LITERAL 0x73 ('s')
23. LITERAL 0x74 ('t')
25. LITERAL 0x5f ('_')
27. REPEAT_ONE 12 1 MAXREPEAT (to 40)
31.   IN 7 (to 39)
33.     RANGE 0x61 0x7a ('a'-'z')
36.     LITERAL 0x5f ('_')
38.     FAILURE
39:   SUCCESS
40: SUCCESS
re.compile('test_[a-z_]+', re.DEBUG)

This feature is mainly for our own needs: it can help with optimizing regular expression compilation.
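
The `overlap` row in the INFO block above looks like a KMP-style prefix table computed over the literal prefix 'test_'. The following is a minimal sketch under that assumption; `overlap_table` is a hypothetical helper written for illustration, not the function used in the re module.

```python
def overlap_table(prefix):
    """Return a KMP-style prefix table for `prefix` (illustrative helper).

    table[i] is the length of the longest proper prefix of prefix[:i + 1]
    that is also a suffix of it.
    """
    table = [0] * len(prefix)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(prefix)):
        # Fall back to shorter candidate prefixes until one can be extended.
        while k and prefix[i] != prefix[k]:
            k = table[k - 1]
        if prefix[i] == prefix[k]:
            k += 1
        table[i] = k
    return table


# The literal prefix from the example, as code points and as a table.
prefix_codes = [ord(ch) for ch in "test_"]
prefix_table = overlap_table("test_")
print([hex(c) for c in prefix_codes])
print(prefix_table)
```

The only nonzero entry comes from the second 't', which repeats the first character of the prefix.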
msg293211 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-07 19:36
The new output is the blank line and numbered lines, produced by the new dis function.
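
Since the debug output is printed rather than returned, one way to look at the two parts separately is to capture standard output and split at the first blank line. This is only a sketch: `debug_dump` is a hypothetical helper, and it assumes a CPython version that includes this patch (on older versions the bytecode part would be empty).

```python
import contextlib
import io
import re


def debug_dump(pattern):
    """Compile `pattern` with re.DEBUG and return (compiled, tree, bytecode).

    Hypothetical helper: the debug output is captured from stdout and split
    at the first blank line, which separates the syntax tree from the
    numbered bytecode listing.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        compiled = re.compile(pattern, re.DEBUG)
    tree, _, bytecode = buf.getvalue().partition("\n\n")
    return compiled, tree, bytecode


compiled, tree, bytecode = debug_dump("test_[a-z_]+")
print(tree)      # syntax tree (the older output)
print(bytecode)  # numbered lines (the output added by the patch)
```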

The addition is specific to CPython's re module. Thus the doc for re.DEBUG remains "Display debug information about compiled expression." I think that the NEWS entry should also mention that this is a CPython-specific enhancement and not a language change. See review note.
msg293214 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-07 20:17
Which review note do you mean, Terry?
msg293224 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-08 04:04
Whoops, the one I thought I added previously.  I must not have clicked the [comment] button after writing it and before closing the tab.
msg293636 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-14 06:05
New changeset 4ab6abfca4d6e444cca04821b24701cde6993f4e by Serhiy Storchaka in branch 'master':
bpo-30299: Display a bytecode when compile a regex in debug mode. (#1491)
https://github.com/python/cpython/commit/4ab6abfca4d6e444cca04821b24701cde6993f4e
History
Date                 User              Action  Args
2017-05-14 06:54:04  serhiy.storchaka  set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2017-05-14 06:05:15  serhiy.storchaka  set     messages: + msg293636
2017-05-09 00:48:53  Jonathan Goble    set     nosy: + Jonathan Goble
2017-05-08 04:04:04  terry.reedy       set     messages: + msg293224
2017-05-07 20:17:07  serhiy.storchaka  set     messages: + msg293214
2017-05-07 19:36:27  terry.reedy       set     nosy: + terry.reedy; messages: + msg293211
2017-05-07 15:32:41  serhiy.storchaka  set     pull_requests: + pull_request1593
2017-05-07 15:27:38  serhiy.storchaka  create