classification
Title: Hide iteration variable in list comprehensions
Type: Stage:
Components: Interpreter Core Versions: Python 3.0
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: georg.brandl Nosy List: georg.brandl, gvanrossum, ncoghlan
Priority: normal Keywords: patch

Created on 2007-02-15 11:29 by ncoghlan, last changed 2008-01-06 22:29 by admin. This issue is now closed.

Files
File name Uploaded Description
list_comp_private_iter_vars.diff ncoghlan, 2007-02-15 11:29 Hide list comp iteration variables
new-set-comps.diff georg.brandl, 2007-02-26 10:30 + set comps
new-comps-updated.diff ncoghlan, 2007-04-14 08:40 Compatible with py3k SVN as of Apr 14
Messages (12)
msg51882 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-02-15 11:29
This patch hides the iteration variable in list comprehensions.
It adds new tests (modelled on the generator expression tests) and also removes some del statements from the standard library (where code previously cleaned up its own namespace).
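A minimal sketch of the behaviour described above, assuming a Python 3 interpreter. The names `x` and `squares` are illustrative only and are not taken from the patch or its tests.

```python
# Module-level name that happens to match the iteration variable below.
x = "outer"

# In Python 3 the comprehension's `x` is private to the comprehension.
squares = [x * x for x in range(5)]

# The outer binding is untouched, so no `del x` cleanup is needed afterwards.
print(x)
print(squares)
```

Under Python 2 the same list comprehension would leave `x` bound to the last item of the range, which is the situation the removed del statements were working around.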
The changes to symtable.[ch] are more significant than strictly necessary - I found it necessary to spend some time cleaning up the code in order to understand what was needed for the list comprehension changes. Given that the 2.x and 3.0 compilers have already diverged fairly significantly, I don't believe this will make the process of keeping them in sync any more difficult than it is already.

Assigning to Georg for initial review (as his set comprehensions patch provided a great deal of inspiration for this one).
msg51883 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-02-15 11:53
Speed measurements show a significant speedup over trunk and Python 2.4 for module/class level code:

(Python 2.4)$ python -m timeit -s "seq=range(1000)" "[[x for x in seq] for y in seq]"
10 loops, best of 3: 239 msec per loop
(Python 2.x trunk)$ ./python -m timeit -s "seq=range(1000)" "[[x for x in seq] for y in seq]"
10 loops, best of 3: 193 msec per loop
(Python 3000)$ ./python -m timeit -s "seq=range(1000)" "[[x for x in seq] for y in seq]"
10 loops, best of 3: 176 msec per loop

This is almost certainly due to the variables and the list object becoming function locals.

There is a slowdown inside a function (but we are still faster than Python 2.4):

(Python 2.4)$ python -m timeit -s "seq=range(1000)" -s "def f(): return [[x for x in seq] for y in seq]" "f()"
10 loops, best of 3: 259 msec per loop
(Python 2.x trunk)$ ./python -m timeit -s "seq=range(1000)" -s "def f(): return [[x for x in seq] for y in seq]" "f()"
10 loops, best of 3: 176 msec per loop
(Python 3000)$ ./python -m timeit -s "seq=range(1000)" -s "def f(): return [[x for x in seq] for y in seq]" "f()"
10 loops, best of 3: 185 msec per loop
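The two measurements above can be repeated from a script with the standard `timeit` module. This is only a sketch: the sequence length is reduced from 1000 to 100 to keep the run short (an assumption of this example, not a setting from the thread), so the absolute numbers will not match the ones quoted.

```python
import timeit

# Shorter than the range(1000) used in the thread, to keep the run quick.
seq = range(100)

def f():
    # Same nested comprehension, wrapped in a function as in the second test.
    return [[x for x in seq] for y in seq]

# Best of 3 runs of 10 loops each, mirroring the "10 loops, best of 3" output.
inline_best = min(timeit.repeat("[[x for x in seq] for y in seq]",
                                globals={"seq": seq}, number=10, repeat=3))
in_func_best = min(timeit.repeat("f()", globals={"f": f},
                                 number=10, repeat=3))

print("inline:      %.3f msec per loop" % (inline_best / 10 * 1000))
print("in function: %.3f msec per loop" % (in_func_best / 10 * 1000))
```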
msg51884 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-02-15 12:13
For reference, the original set comprehensions patch & the py3k list discussion:
  http://www.python.org/sf/1548388
  http://mail.python.org/pipermail/python-3000/2006-December/005188.html

Note that the current patch ended up looking a *lot* like the original one. The main difference specific to list comprehensions is that the temporary list is built in the inner scope and then returned, rather than being passed in as an argument. Additionally, the code has been unified only for the symtable stage: the AST and compilation stages still use separate code for listcomps and genexps.
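As a rough illustration of "built in the inner scope and then returned", a list comprehension such as `[x * 2 for x in seq]` behaves conceptually like the hand-written nested function below. The function names are hypothetical, and this is a source-level analogy, not the bytecode the patch generates.

```python
def doubled(seq):
    # Hypothetical stand-in for the implicit function the compiler creates.
    def _listcomp(iterator):
        result = []              # temporary list lives in the inner scope
        for x in iterator:       # `x` is local to the inner function
            result.append(x * 2)
        return result            # the finished list is returned to the caller

    # The outermost iterable is evaluated in the enclosing scope and passed in.
    return _listcomp(iter(seq))

print(doubled([1, 2, 3]))
```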

It turns out that there are some really curly scoping problems that using a nested function deals with automatically (see the new test_listcomps.py in the patch for examples). Having a more efficient mechanism specifically for 'transient' scopes like this is an interesting idea, but it's far from easy (mainly because the variables in the scope still need to be properly visible in scopes it *contains*).
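Two small cases of the kind of scoping a nested function handles for free, written as a sketch for Python 3. The function names are made up for this example and are not from test_listcomps.py.

```python
def nested_table():
    # The inner comprehension is a contained scope that must still see `y`,
    # the iteration variable of the outer comprehension.
    return [[x * y for x in range(3)] for y in range(3)]

def late_bound_getters():
    # Each lambda closes over the comprehension's `i`, not a copy of it,
    # so all of them observe the value `i` has once the loop has finished.
    getters = [lambda: i for i in range(3)]
    return [g() for g in getters]

print(nested_table())
print(late_bound_getters())
```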

Adding set comprehensions on top of this patch should be pretty straightforward.
msg51885 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-02-26 10:30
Okay, I looked at the patch, and apart from a refleak in one of the compiler methods I couldn't find anything wrong.
I then manually applied the rest of my set comprehension patch. The result is attached.

Grammar, AST and compilation for comprehensions are now really unified.

It passes the tests for listcomps, genexps and setcomps.
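For illustration, the three forms mentioned here share the same surface syntax and, in Python 3, the same scoping of the iteration variable. The variable names below are assumptions of this sketch.

```python
# Outer name deliberately reused as the iteration variable in each form.
item = "untouched"
words = ["a", "bb", "cc", "ddd"]

as_list = [len(item) for item in words]     # list comprehension
as_set = {len(item) for item in words}      # set comprehension
as_total = sum(len(item) for item in words) # generator expression

# None of the three forms rebinds the outer `item`.
print(item, as_list, sorted(as_set), as_total)
```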

I couldn't check properly for refleaks since e.g. test_genexps leaks in the branch head, as well as some other tests. I'm currently searching for the offending revision(s)...
File Added: new-set-comps.diff
msg51886 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-03-07 22:56
Can't say that slowdown bothers me much, if it's typical.

However, I think you need to do a svn up in your workspace and regenerate the patch; I get FAILED messages from patch on all the generated files and a few others: graminit.[ch], symtable.[ch], Python-ast.c, symbol.py.
msg51887 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-03-12 12:17
The patch for 'nonlocal' was applied since Georg & I started working on this. It's going to take me a bit of fiddling to update the patch so that the parser/compiler stages play well with each other (I really should have just believed the patch utility when it reported a conflict trying to patch symtable.c).
msg51888 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-04-14 08:40
I've uploaded a new version of the patch that is compatible with the py3k SVN branch as of April 14.

The patch still includes the various cleanups I made to the symbol table processing while working out how to make this change (use sets where appropriate, avoid using the same variable name to refer to completely different things, make a couple of syntax error messages more explicit).

The slowdown should be a fixed cost of a single function call, so the shorter the sequence being iterated over, the greater the percentage slowdown will be. The speed of generator expressions should actually be (very) marginally increased, as they now skip the SETUP_LOOP/POP_BLOCK steps in the same fashion as list comprehensions always have.
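One way to check what a given interpreter emits for a comprehension is the standard `dis` module. This is only a sketch: opcode names differ between CPython releases, so the ones named in this message may not appear on a current interpreter, and the helper name `opnames` is made up for this example.

```python
import dis

def opnames(code_obj):
    """Collect opcode names, descending into nested code objects."""
    names = []
    for instr in dis.get_instructions(code_obj):
        names.append(instr.opname)
        # A comprehension may be compiled as a separate code object that
        # appears as a constant; walk into it as well when present.
        if hasattr(instr.argval, "co_code"):
            names.extend(opnames(instr.argval))
    return names

# Compile without executing, so `seq` does not need to exist.
code = compile("[x for x in seq]", "<demo>", "eval")
names = opnames(code)

print(sorted(set(names)))
print("SETUP_LOOP present:", "SETUP_LOOP" in names)
```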
File Added: new-comps-updated.diff
msg51889 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-04-14 14:56
Assuming all tests pass in a debug build and you don't see any leaks with "regrtest.py -R 4:3:", just check it in!  I've been waiting long enough for this.  Speedups are for post 3.0a1.
msg51890 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-04-14 17:01
I applied this and ran regrtest -R. Found no refleaks, and the only tests that failed were compiler and transformer, as expected.
msg51891 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-04-14 17:42
OK, I think Nick can check it in himself, right Nick?
msg51892 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2007-04-15 12:06
test_compiler and test_transformer are failing due to the fact that the compiler package still needs some TLC to catch up with the Grammar changes.

test_logging, test_tcl and test_structmembers all fail when run with -R 4:3:. However, these failures don't appear to be due to this change. More importantly, no leaks are reported in the new tests that are part of this update (test_listcomps, test_setcomps).

Committed as rev 54835.
msg51893 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2007-06-02 08:57
The compiler package has been thrown out of Py3k, so there's no reason to leave this open anymore.
History
Date User Action Args
2008-01-06 22:29:46 admin set keywords: - py3k; versions: + Python 3.0
2007-02-15 11:29:02 ncoghlan create