New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache constant "slice" instances #55316
Comments
Follow-up to ticket 10227. The following facts seem to indicate that it would be worth caching constant instances of the slice type, such as in [:] or [:-1]. with cached slice instance: $ ./python -m timeit -s 'l = list(range(100)); s=slice(None)' 'l[s]'
1000000 loops, best of 3: 0.464 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None)' 'l[s]'
10000000 loops, best of 3: 0.149 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None,1)' 'l[s]'
10000000 loops, best of 3: 0.135 usec per loop uncached normal usage: $ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
1000000 loops, best of 3: 0.499 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
10000000 loops, best of 3: 0.171 usec per loop Timings based on Python 3.2 rc2. A quick grep against the py3k stdlib finds 2096 lines in 393 files that use constant slices. |
Erm, bpo-10227. |
Similar idea has been rejected in bpo-2268 because the win was too small to justify the number of changes that had to be made. |
Hmm, ok, but AFAICT, your patch was rejected rather because of the way it approached the problem, not so much because of the issue itself. Plus, the fact that Python 3 requires slices in more places than Python 2 (which had the lower level getslice protocol) makes this a bigger issue now than it was three years ago. |
I would be rather for the patch myself. The bytecode currently generated
True. |
I'm trying to sprint on this this week. I already had a PoC put together, but it has some significant issues caused by one major quirk with slices:
In various places, it is (reasonably) assumed that constants can be hashed (so as to coalesce identical constants), and since slice objects can't be, this causes problems. In my proof of concept, I cheated, and added a field to the slice object, hashable, that was only set to true when the slice was a result of:
Obviously this isn't practical for an actual patch, and I'm trying to figure out if there is a way around it. Right now, it seems like ast_opt.c's approach may not be necessary (even if I don't set hashable, it still works, and seems to coalesce, which seems odd; pretty sure when I first started the slices tried to get stored as dict keys at some point). I'm still having issues figuring out to handle marshaling; my code currently works, but where the first load of a module (from .py) performs constant coalescence, my attempts to make slices use w_ref/R_REF appropriately have been unsuccessful. I'd greatly appreciate any help/tips from anyone who knows enough about:
If you're at the sprints, I'm sitting in the CPython room directly in front of the entrance. |
Hi josh.r, this may be dump but is there a reason not to make slices hashable? Couldn't they hash like a tuple (start, stop, step)? I looked around in the code, bpo and the mailing lists but could not find a reason why not, maybe it was just not useful at the time but could be implemented now? |
To support slice constants we need to change the marshal format and introduce new version (current version is 4 and it was stable for several releases). If my memory do not betray me we already discussed this on other issue and the conclusion was that the benefit of this is too small for such large change. But if we will need to add new marshal version for other reasons, supporting slices will be a nice addition. We just do not have enough reasons for this now. |
Remi: The reason they're not hashable is presumably because it would make them valid dictionary keys, further confusing the difference between sequences and mappings. x[::-1] would, for a sequence, return a reversed sequence, but for a mapping, would end up using the slice as a key (not actually slicing) to obtain a single value. Confusing at best. Serhiy: It was discussed on the other issue, but the rationale there was, to quote Raymond:
Of those, only the first bullet is (essentially) unchanged. Addressing the following points in order:
Perhaps the issues with "too many other things need to be changed" remain, but I'd like to at least have a PR in place so that, should the marshalling format change be accepted, we'd be able to take advantage of it. Side-note: The marshal code for marshalling slices that I chose was ':'; seemed rather appropriate for slices, and conveniently, it was not already in use. :-) |
Batuhan Taskaya opened a duplicate issue bpo-42454. Since it includes PR-23496, with some review, and at least as much discussion, I am closing this issue, contrary to normal policy. Worthwhile? Batuhan found about 100000 constant slices in 3200 PyPI packages, and 3-50% speedup. Rémi asked "Couldn't [slices] hash like a tuple (start, stop, step)?" Josh: the PR defines TYPE_SLICE as ':'. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: