> So you think that even a dedicated "LEN" opcode would not be any faster? (This is getting in Paul Sokolovsky territory -- IIRC he has a variant of Python that doesn't allow overriding builtins.)

Yeah, a dedicated LEN opcode could only be faster if it were impossible to shadow builtins (or if there were a "len" operator in Python). Otherwise, this hypothetical LEN opcode would still have to check whether "len" had been shadowed, and that check would be slower than the optimized LOAD_GLOBAL we have now.
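
To make the shadowing problem concrete, here is a rough sketch (the function name "size" and the replacement lambdas are made up for illustration, not taken from any real code). It shows that a call to len() compiles to an ordinary LOAD_GLOBAL, and that the name can be rebound either in the module globals or in the builtins module after the function has been compiled:

```python
import builtins
import dis

def size(seq):
    # "len" is just a name here; it is resolved at call time,
    # first in the function's globals, then in builtins.
    return len(seq)

# The compiler emits a generic global lookup for "len".
opnames = [ins.opname for ins in dis.get_instructions(size)]
uses_load_global = "LOAD_GLOBAL" in opnames

plain = size([1, 2, 3])  # uses the real builtin

# Shadow "len" in the module globals that size() sees.
globals()["len"] = lambda seq: -1
shadowed_global = size([1, 2, 3])
del globals()["len"]

# Shadow "len" in the builtins module itself, then restore it.
original_len = builtins.len
builtins.len = lambda seq: 42
try:
    shadowed_builtin = size([1, 2, 3])
finally:
    builtins.len = original_len

restored = size([1, 2, 3])  # back to the real builtin
```

Any LEN opcode would have to give the same answers in all four calls, which is why it cannot skip the lookup (or an equivalent "has this been shadowed?" check).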
