Message339709
>I am -1 for this optimization because it affects only one particular case (neither other kinds of comprehensions, nor generator expressions, nor list comprehensions with
conditions) and even in this case it is small. It is possible to add a lot of other optimizations for other cases which will sped up them to 50% or 100%, but we do not do
this, because every such optimization has a cost. It increases the amount of code which should be maintained and covered by tests, it adds small overhead in common cases to
speed up an uncommon case, and increasing the code base can negatively affect surrounding code (just because the CPU cache and registers are used inappropriate and the
compiler optimizes less important paths).
Understood, I had hoped this change would have a broader impact. The additional opcode is not ideal either.
> In addition, while this change speed up list comprehensions for long list, it slows down them for short lists. Short lists are more common.
I've been profiling this today, basically, this implementation receives the `list_iter`, `range_iter`, etc.
There is no standard object model for an iterator's length, _PyObject_HasLen would return false because it neither implements tp_as_sequence nor, tp_as_mapping (rightly so).
What this has uncovered (so hopefully there's some value from this whole experience!) is that __length_hint__ for iterators is _really_ inefficient. Take a list_iterator for example:
PyObject_LengthHint will call, _PyObject_HasLen, which returns false, which then goes to call
_PyObject_LookupSpecial, then
_PyObject_CallNoArg, which calls
listiter_len, which calls
PyList_GET_SIZE which returns a Py_ssize_t, which is then converted to a PyLong via
PyLong_FromSsize_t, which is then returned back to PyObject_LengthHint, which then
PyLong_AsSsize_t is run to convert the PyLong back into a Py_ssize_t
The Py_ssize_t is then finally returned to the caller!
My conclusion was that the list comprehension should be initialized to the length of the target, before GET_ITER is run. This would remove the overhead for range objects, because you could simply call _PyObject_HasLen, which would return true for dict, list, tuple and set, but false for range objects (which is what you want).
The issue is that GET_ITER is called outside the code object for the comprehension, so you'd have to pass an additional argument to the comprehension generator.
This is way outside of my expertise, but the only way I can see to find a sizeable benefit, with minimal code and no edge cases which are slower.
Thanks for your time |
|
Date |
User |
Action |
Args |
2019-04-09 07:04:14 | anthony shaw | set | recipients:
+ anthony shaw, ronaldoussoren, ncoghlan, methane, serhiy.storchaka, Aaron Hall, pablogsal |
2019-04-09 07:04:14 | anthony shaw | set | messageid: <1554793454.86.0.460464278237.issue36551@roundup.psfhosted.org> |
2019-04-09 07:04:14 | anthony shaw | link | issue36551 messages |
2019-04-09 07:04:14 | anthony shaw | create | |
|