I was a bit surprised to run into this issue while porting some nose tests from Windows to Linux:
#!/usr/bin/env python
with open('/etc/services') as fd:
    lines = fd.readlines()
    lines.append('')
SERVICES = [line.split()[0] for line in lines
            if (line and not line.startswith('#'))]
$ python list_comprehension.py
Traceback (most recent call last):
  File "list_comprehension.py", line 5, in <module>
    if (line and not line.startswith('#'))]
IndexError: list index out of range
$ python3.2 list_comprehension.py
Traceback (most recent call last):
  File "list_comprehension.py", line 4, in <module>
    SERVICES = [line.split()[0] for line in lines
  File "list_comprehension.py", line 5, in <listcomp>
    if (line and not line.startswith('#'))]
IndexError: list index out of range
$ python -V
Python 2.7.5
$ python3.2 -V
Python 3.2.5
This happens, of course, because .split() returns an empty list for a blank line, so the [0] index fails.
The docs don't note (at least not in the list comprehension section [*]) that if clauses are apparently evaluated after the value is generated for the current item in the loop. This seems backwards: generating a value could be very expensive, whereas checking whether a precondition has been met should be cheaper.
One of two things could or should be done: 1. spell out the evaluation order clearly in the docs, or 2. change the list comprehension handling code to evaluate the conditions before calculating a result. Otherwise, discouraging the use of [map+]filter (pylint, at least, does so) seems incredibly unwise, since it gives me the functionality I want in a single line instead of an unrolled loop.
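For what it's worth, a small check of the evaluation order may help narrow this down. The sketch below uses made-up helper names (`check`, `value`, `parse_services`) and a hand-written sample in place of /etc/services, so treat it as an illustration and not as a statement about the actual file. It suggests that the if clause runs before the value expression, and that the IndexError more likely comes from blank lines being read as "\n", which is truthy but splits into an empty list.

```python
# Hypothetical helpers that record the order in which they are called.
calls = []

def check(line):
    # Same test as the if clause in the original comprehension.
    calls.append('if')
    return bool(line) and not line.startswith('#')

def value(line):
    # Same expression as the value part of the original comprehension.
    calls.append('value')
    return line.split()[0]

# Assumed sample input: a comment, a normal entry, and the appended ''.
sample = ['# comment\n', 'ssh 22/tcp\n', '']
result = [value(line) for line in sample if check(line)]

# A blank line as returned by readlines() is "\n", not "":
blank = '\n'
blank_is_truthy = bool(blank)   # passes the `line and ...` test
blank_fields = blank.split()    # no fields, so [0] would raise IndexError

def parse_services(lines):
    # Possible workaround: strip before testing, so whitespace-only
    # lines are filtered out before split() is ever called.
    return [line.split()[0] for line in lines
            if line.strip() and not line.startswith('#')]
```

With this sample, `value` is only called for the one line that passes `check`, and `parse_services(['# comment\n', 'ssh 22/tcp\n', '\n', ''])` skips the blank line instead of raising.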
[*] http://docs.python.org/2/tutorial/datastructures.html#list-comprehensions