Message315802
It seems to me that the regular expressions used in the lib2to3 version are more efficient, but also more complex.
$ ./python -m timeit -s 'import re; p = re.compile(r"0[bB](?:_?[01])+"); s = "0b"+"_0101"*16' 'p.match(s)'
100000 loops, best of 5: 2.45 usec per loop
$ ./python -m timeit -s 'import re; p = re.compile(r"0[bB]_?[01]+(?:_[01]+)*"); s = "0b"+"_0101"*16' 'p.match(s)'
200000 loops, best of 5: 1.08 usec per loop
$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX](?:_?[0-9a-fA-F])+[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 815 nsec per loop
$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 542 nsec per loop
Since the performance of lib2to3 is important, it is better to keep the current regexes.
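As a rough sanity check that the faster form changes only speed and not what is accepted, the two binary-literal patterns timed above can be compared exhaustively on short inputs. This is a minimal sketch: the names `simple`, `unrolled` and `agree`, and the length bound, are illustrative assumptions and not part of lib2to3.

```python
import itertools
import re

# The two binary-literal patterns from the timings above.
simple = re.compile(r"0[bB](?:_?[01])+")
unrolled = re.compile(r"0[bB]_?[01]+(?:_[01]+)*")


def agree(max_len=8):
    """Return True if both patterns fully match exactly the same strings
    of the form "0b" + tail, for every tail over "01_" up to max_len chars."""
    for n in range(max_len + 1):
        for tail in itertools.product("01_", repeat=n):
            s = "0b" + "".join(tail)
            if (simple.fullmatch(s) is None) != (unrolled.fullmatch(s) is None):
                return False
    return True


print(agree())
```

The check only covers full matches on a bounded set of inputs, so it supports the claim of equivalence rather than proving it.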
However, using \d in Python 3 is a bug: in str patterns it matches any Unicode decimal digit, not only ASCII 0-9, so it should be replaced with [0-9]. This also speeds up the regex:
$ ./python -m timeit -s 'import re; p = re.compile(r"0[xX]_?[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*[lL]?"); s = "0x_0123_4567_89ab_cdef"' 'p.match(s)'
500000 loops, best of 5: 471 nsec per loop
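The \d problem can be seen with a short script. This is a minimal sketch in which the names `hex_with_d` and `hex_ascii` and the sample string are illustrative assumptions; the two patterns are the hexadecimal ones timed above.

```python
import re

# Hex pattern using \d, and the same pattern restricted to ASCII digits.
hex_with_d = re.compile(r"0[xX]_?[\da-fA-F]+(?:_[\da-fA-F]+)*[lL]?")
hex_ascii = re.compile(r"0[xX]_?[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*[lL]?")

# "0x" followed by ARABIC-INDIC DIGIT ONE and TWO: not a valid Python literal.
s = "0x\u0661\u0662"

print(hex_with_d.fullmatch(s) is not None)  # True: \d accepts Unicode digits
print(hex_ascii.fullmatch(s) is not None)   # False: only ASCII 0-9 accepted
```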

Date | User | Action | Args
2018-04-26 14:50:37 | serhiy.storchaka | set | recipients: + serhiy.storchaka, lukasz.langa
2018-04-26 14:50:37 | serhiy.storchaka | set | messageid: <1524754237.39.0.682650639539.issue33338@psf.upfronthosting.co.za>
2018-04-26 14:50:37 | serhiy.storchaka | link | issue33338 messages
2018-04-26 14:50:37 | serhiy.storchaka | create |