classification
Title: Convert collections (cmp_op, hasconst, hasname and others) in opcode module to more optimal type
Type: performance Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: godaygo, larry, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2018-04-21 18:52 by godaygo, last changed 2019-07-13 18:12 by rhettinger.

Pull Requests
URL Status Linked Edit
PR 14742 closed BTaskaya, 2019-07-13 13:13
Messages (6)
msg315576 - (view) Author: Kirill Balunov (godaygo) Date: 2018-04-21 18:52
The opcode module contains several collections:

`cmp_op`
`hasconst`
`hasname`
`hasjrel`
...

which are only used for `in` checks. However, most of them are stored as `list`s, and `cmp_op` is a tuple. Neither type is optimal for `__contains__` checks. Would it be worth converting them to `frozenset` once they are filled?
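A minimal sketch of the proposed conversion, assuming the collections are only used for membership tests. The names `HASJREL_SET` and `is_relative_jump` are hypothetical and not part of the `opcode` module:

```python
import opcode

# Hypothetical frozen copy of a collection exposed by the opcode module.
HASJREL_SET = frozenset(opcode.hasjrel)


def is_relative_jump(op):
    """Return True if the numeric opcode `op` is listed in opcode.hasjrel."""
    return op in HASJREL_SET


# The frozen copy answers membership queries exactly like the original.
for op in range(256):
    assert is_relative_jump(op) == (op in opcode.hasjrel)
```

Building the copy once at import time keeps the original module attribute untouched, which is the approach some third-party packages already take.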
msg315817 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-04-26 20:06
These lists are short, so I don't think this change brings a large benefit. But there is a small risk of breaking third-party code (for example, code that evaluates hasjrel+hasjabs).
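The compatibility risk can be illustrated with a short sketch. The opcode numbers below are hypothetical placeholders, not values taken from any particular Python version:

```python
# Hypothetical opcode numbers, chosen only for illustration.
hasjrel_list = [93, 110]
hasjabs_list = [111, 112]

# With lists, concatenation via + works.
all_jumps = hasjrel_list + hasjabs_list

hasjrel_set = frozenset(hasjrel_list)
hasjabs_set = frozenset(hasjabs_list)

# frozenset does not implement +, so the same expression raises TypeError.
try:
    hasjrel_set + hasjabs_set
except TypeError:
    concatenation_supported = False
else:
    concatenation_supported = True

# The set equivalent is the union operator.
all_jumps_set = hasjrel_set | hasjabs_set
```

Code that only uses `in` is unaffected; code that concatenates would have to switch to `|` or wrap the operands in `list()`.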
msg315818 - (view) Author: Kirill Balunov (godaygo) Date: 2018-04-26 20:51
The small risk of breakage is a fair point (maybe a FutureWarning via the new module-level __getattr__ from PEP 562?). I've checked several packages:

---
vstinner/bytecode:: uses:

@staticmethod
def _has_jump(opcode):
    return (opcode in _opcode.hasjrel
            or opcode in _opcode.hasjabs)

---
maynard:: defines them as sets and does not rely on the opcode module:

all_jumps = absolute_jumps | relative_jumps

---
numba:: converts them to frozensets:

JREL_OPS = frozenset(dis.hasjrel)
JABS_OPS = frozenset(dis.hasjabs)
JUMP_OPS = JREL_OPS | JABS_OPS

---
codetransformer:: uses:

absjmp = opcode in hasjabs
reljmp = opcode in hasjrel

---
anotherassembler.py:: uses

elif opcode in hasjrel or opcode in hasjabs:

---
byteplay:: converts them to set:

hasjrel = set(Opcode(x) for x in opcode.hasjrel)
hasjabs = set(Opcode(x) for x in opcode.hasjabs)
hasjump = hasjrel.union(hasjabs)

---
byterun:: uses:

elif byteCode in dis.hasjrel:
    arg = f.f_lasti + intArg
elif byteCode in dis.hasjabs:
    arg = intArg

Admittedly, the examples above do not prove anything, but I did not find any use of hasjrel+hasjabs.

Even though the collections are small, I measure an average speed-up of 5-6x for membership tests with sets.
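A rough way to compare the two container types is sketched below. The collection contents and the helper `count_hits` are hypothetical stand-ins, and the timings depend on the machine and interpreter, so this is not a reproduction of the figure quoted above:

```python
import timeit

# Hypothetical stand-in for a short opcode collection (10 entries).
ops_list = list(range(90, 120, 3))
ops_set = frozenset(ops_list)

# Probe every possible single-byte opcode value, hits and misses alike.
probes = range(256)


def count_hits(collection):
    """Count how many probe values are members of `collection`."""
    return sum(1 for op in probes if op in collection)


list_time = timeit.timeit(lambda: count_hits(ops_list), number=200)
set_time = timeit.timeit(lambda: count_hits(ops_set), number=200)
print(f"list: {list_time:.4f}s  frozenset: {set_time:.4f}s")
```

Most probes are misses, which is the worst case for a list because every element has to be compared before the test fails.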
msg315820 - (view) Author: Kirill Balunov (godaygo) Date: 2018-04-26 21:12
I apologize for the FutureWarning and __getattr__ remark. I no longer understand what I meant or how it would help in this situation :)
msg347824 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2019-07-13 14:11
Maynard is unsupported; it only understands the old bytecode format that predates the 16-bit "wordcode" introduced in 3.6.

https://docs.python.org/3.6/whatsnew/3.6.html#optimizations
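For illustration, the fixed-width format can be observed directly on a code object. This is a sketch that assumes CPython 3.6 or later; the function `sample` is a hypothetical example:

```python
def sample(a, b):
    return a + b


raw = sample.__code__.co_code

# Assumption: CPython 3.6+, where every instruction occupies two bytes
# (one opcode byte followed by one argument byte).
units = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

for opcode_byte, arg_byte in units:
    print(opcode_byte, arg_byte)
```

A tool written for the older variable-length format would misread this layout, because it expects instructions without arguments to take a single byte.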
msg347850 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-07-13 18:11
+1 for switching to sets or frozensets.
History
Date User Action Args
2019-07-13 18:12:00 rhettinger set versions: + Python 3.9, - Python 3.7, Python 3.8
2019-07-13 18:11:54 rhettinger set nosy: + rhettinger
messages: + msg347850
2019-07-13 14:11:38 larry set nosy: larry, serhiy.storchaka, godaygo
messages: + msg347824
2019-07-13 13:13:46 BTaskaya set keywords: + patch
stage: patch review
pull_requests: + pull_request14538
2018-04-26 21:12:52 godaygo set messages: + msg315820
2018-04-26 20:51:55 godaygo set messages: + msg315818
2018-04-26 20:06:20 serhiy.storchaka set messages: + msg315817
2018-04-26 19:41:51 godaygo set nosy: + larry, serhiy.storchaka
2018-04-21 18:52:42 godaygo create