classification
Title: all() returns wrong result when the parameters are non-encapsulated list-comprehension
Type:
Stage:
Components: Windows
Versions: Python 2.6
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Jonathan.Livni, georg.brandl, mark.dickinson, rhettinger
Priority: normal
Keywords:

Created on 2011-02-16 08:31 by Jonathan.Livni, last changed 2011-02-17 16:07 by mark.dickinson. This issue is now closed.

Files
File name: Eclipse.JPG
Uploaded: Jonathan.Livni, 2011-02-16 08:31
Description: Screenshot from pydev debugger
Messages (12)
msg128628 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-16 08:31
all( (x<=y) for x,y in zip(L, L[1:]) )
all([(x<=y) for x,y in zip(L, L[1:])])

Both lines of code above check if L is a non-decreasing list. Both should return the same results. But under some conditions, they don't. I've encountered this with a list of Decimal numbers.
This is 100% reproducible on my Win7 64-bit setup with vanilla 32-bit Python 2.6.6. Alas, I cannot share the specific code that generates this difference.
See attached screenshot from Eclipse Pydev debugger.
msg128629 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-02-16 08:40
It's not easy to reproduce this without the full list of decimals.

Do you have a nonstandard decimal Context set?

What is the result if you put the LC into a function, i.e.

def f(L):
    return [(x<=y) for x,y in zip(L, L[1:])]

print all(f(L))
msg128630 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-16 08:47
The exact list of decimals doesn't help - I tried taking the list and reproducing the bug with the following short script, but the problem did not reproduce:

from decimal import Decimal
L = [Decimal('6.700'), Decimal('6.800'), Decimal('7.140'), Decimal('7.460'), Decimal('7.735'), Decimal('8.160'), Decimal('8.280'), Decimal('8.355'), Decimal('8.710'), Decimal('9.640'), Decimal('10.155'), Decimal('10.460'), Decimal('10.810'), Decimal('11.875'), Decimal('12.310'), Decimal('12.315'), Decimal('13.250'), Decimal('13.205'), Decimal('13.750'), Decimal('14.245'), Decimal('14.805'), Decimal('15.385'), Decimal('15.955'), Decimal('16.365'), Decimal('16.960'), Decimal('17.500'), Decimal('19.445')]
print all(x<=y for x, y in zip(L, L[1:]))

The script above rightfully printed False.
The decimal list above was taken from the pydev debugger session where I found the bug.

In the original script I do not mess around with Decimal at all. I just cast to and from float and use simple arithmetic with it.
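
To see why the short script prints False, a minimal sketch along the following lines can locate the first out-of-order pair. The helper name `first_descent` is hypothetical, and only a short excerpt of the list above is used:

```python
from decimal import Decimal

def first_descent(values):
    """Return (index, x, y) for the first adjacent pair with x > y, else None."""
    for i, (x, y) in enumerate(zip(values, values[1:])):
        if x > y:
            return i, x, y
    return None

# Excerpt of the list above, around the suspicious values.
sample = [Decimal('12.315'), Decimal('13.250'), Decimal('13.205'), Decimal('13.750')]
print(first_descent(sample))  # (1, Decimal('13.250'), Decimal('13.205'))
```

In the excerpt, 13.250 is followed by 13.205, so the list is not non-decreasing and False is the expected answer.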
msg128631 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-16 08:51
Another note - the original problematic code looks like this:

def non_decreasing(L):
    return all(x<=y for x, y in zip(L, L[1:]))

Changing it to:

def non_decreasing(L):
    def f(L):
        return [x<=y for x, y in zip(L, L[1:])]
    return all(f(L))    

also worked around the bug
msg128642 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-02-16 10:05
This is an interesting puzzle.  In both cases, the zip() function is called and runs to completion before either the list comprehension or genexp is started or called.  The code for all() is somewhat simple -- it iterates over the input and tests whether the value is true.  That is also the same in both.

One essential difference between the two, then, is that the x, y variables get exposed in the list comprehension but not in the genexp.  The only way I can see to get the two to evaluate differently is to mutate the exposed variables before the comparison:

>>> from decimal import *
>>> L = list(map(Decimal, '6.700 6.800 7.140 7.460 7.735'.split()))
>>> def f(z):
	global y
	y -= 100
	return z

>>> all([(f(x)<=y) for x, y in zip(L, L[1:])])
False
>>> all((f(x)<=y) for x, y in zip(L, L[1:]))
True

I don't see how that mutation could happen in your functions unless decimal has been subclassed to override its __le__ method.

Another way to get a midstream mutation is for L to change in mid-computation in multi-threaded code.  Is your example single threaded?  Is the debugger affecting the run in some way?

The disassembly shows 1) when zip is called, 2) whether x,y are exposed, and 3) whether a list is being iterated or the genexp:

>>> from dis import dis
>>> dis(compile('all((x<=y) for x, y in zip(a, b))', '', 'eval'))
  1           0 LOAD_NAME                0 (all)
              3 LOAD_CONST               0 (<code object <genexpr> at 0x16b3f50, file "", line 1>)
              6 MAKE_FUNCTION            0
              9 LOAD_NAME                1 (zip)
             12 LOAD_NAME                2 (a)
             15 LOAD_NAME                3 (b)
             18 CALL_FUNCTION            2
             21 GET_ITER            
             22 CALL_FUNCTION            1
             25 CALL_FUNCTION            1
             28 RETURN_VALUE        
>>> dis(compile('all([(x<=y) for x, y in zip(a, b)])', '', 'eval'))
  1           0 LOAD_NAME                0 (all)
              3 BUILD_LIST               0
              6 DUP_TOP             
              7 STORE_NAME               1 (_[1])
             10 LOAD_NAME                2 (zip)
             13 LOAD_NAME                3 (a)
             16 LOAD_NAME                4 (b)
             19 CALL_FUNCTION            2
             22 GET_ITER            
        >>   23 FOR_ITER                25 (to 51)
             26 UNPACK_SEQUENCE          2
             29 STORE_NAME               5 (x)
             32 STORE_NAME               6 (y)
             35 LOAD_NAME                1 (_[1])
             38 LOAD_NAME                5 (x)
             41 LOAD_NAME                6 (y)
             44 COMPARE_OP               1 (<=)
             47 LIST_APPEND         
             48 JUMP_ABSOLUTE           23
        >>   51 DELETE_NAME              1 (_[1])
             54 CALL_FUNCTION            1
             57 RETURN_VALUE  

Nothing else interesting pops out.

One question out of curiosity.  In the JPG file that is attached, the return type is listed as bool_ instead of bool.  Is that normal for the Eclipse debugger's value display?
msg128667 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-16 13:51
The script I used is single-file, single-threaded code, but it uses Django's ORM to get the data from a MySQL database.

I've reduced the code path to this:

import sys,os
sys.path.append(os.path.dirname(os.getcwdu()))
os.environ['DJANGO_SETTINGS_MODULE']='my_app.settings'
from django.core.management import setup_environ
from my_app import settings
setup_environ(settings)

from my_app.convert.models import SomeModel
from operator import itemgetter
from decimal import Decimal
from collections import defaultdict  # needed for defaultdict(list) below

def non_decreasing(L):
    return all(x<=y for x, y in zip(L, L[1:]))    

raw_data =  SomeModel.objects.filter(the_date=the_date,col1__gt=Decimal('0.2'),col2__gt=Decimal('0.2'),col3__gt=0,col4__gt=0,col5__gte=2).order_by('col6','col7','col8').values_list('col6','col7','col8','col1','col3','col2','col4')
data=defaultdict(list)

for d in raw_data:
	data[d[0],d[1]].append(d[2:])

for (exp,t),d in data.iteritems():
	col8s = map(itemgetter(0),d)
	mids = [(x[3]+x[4])/Decimal('2.0') for x in d]
	if not non_decreasing(mids):
		raise Exception
msg128687 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-02-16 18:43
Are you positive that your 'all' is the builtin Python 'all'?  NumPy's 'all' function would behave the way you describe:

>>> all(x < 3 for x in range(5))
False
>>> from numpy import all
>>> all(x < 3 for x in range(5))
True

What does all.__module__ give?
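
A minimal sketch of that check, written for Python 3 (where the builtins module is named `builtins`; on Python 2.6 it is `__builtin__`). The helper `is_builtin_all` is a hypothetical name. The second half illustrates, without importing NumPy, one plausible reason a replacement `all` could return True here: a generator object is itself truthy, so any function that tests the object rather than iterating it never sees the False elements. That NumPy works this way internally is an assumption, not something verified here.

```python
import builtins

def is_builtin_all(func):
    """Return True if func is the interpreter's own all()."""
    return func is builtins.all

# True unless the name 'all' has been rebound in this namespace.
print(is_builtin_all(all))

gen = (x < 3 for x in range(5))
print(bool(gen))          # True: truthiness of the generator object itself
print(builtins.all(gen))  # False: the builtin iterates and reaches 3 < 3
```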
msg128688 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-02-16 18:49
And voila:

>>> from numpy import bool_
>>> bool_
<type 'numpy.bool_'>

Case closed, I guess :)
msg128690 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-02-16 19:43
Numpy tried to frame Python's innocent all() function.  The crime was almost perfect, but the forensic evidence showed that the real culprit had left behind a telltale underscore after the bool.

Another lurid bug report brought to justice :-)
msg128695 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-16 20:40
from pylab import *

There lies the rub?
msg128715 - (view) Author: Jonathan Livni (Jonathan.Livni) Date: 2011-02-17 09:14
Let my foolishness be a reminder to all not to use "from [module] import *".

After saying that - I believe overloading a built in Python function in a popular package\module is a mistake!

I still don't know if pylab's all() is erroneous or if it's correct functionality. I'll open a ticket there.
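
For illustration, the shadowing can be reproduced without pylab. The sketch below (Python 3 syntax) registers a hypothetical module, `fake_pylab`, whose `all` merely tests the truthiness of its argument, then star-imports it into a fresh namespace. Neither the module nor its behaviour is taken from pylab; it only stands in for any package that exports a name matching a builtin.

```python
import builtins
import sys
import types

# Hypothetical stand-in for a package that exports its own all().
fake_pylab = types.ModuleType("fake_pylab")

def _fake_all(obj):
    # Tests the object itself instead of iterating over it.
    return bool(obj)

fake_pylab.all = _fake_all
sys.modules["fake_pylab"] = fake_pylab

# Star-import into a fresh namespace, as a script would at module level.
namespace = {}
exec("from fake_pylab import *", namespace)

shadowed = namespace["all"]
print(shadowed is builtins.all)               # False: the builtin is now hidden
print(shadowed(x < 3 for x in range(5)))      # True, unlike the builtin
print(builtins.all(x < 3 for x in range(5)))  # False
```

Importing the module by name and calling `fake_pylab.all(...)` explicitly would leave the builtin `all` untouched.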
msg128730 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-02-17 16:07
> After saying that - I believe overloading a built in Python function in
> a popular package\module is a mistake!

I believe NumPy had 'any' and 'all' *before* Python did. :-)
History
Date User Action Args
2011-02-17 16:07:54 mark.dickinson set nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128730
2011-02-17 09:14:53 Jonathan.Livni set nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128715
2011-02-16 20:40:12 Jonathan.Livni set nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128695
2011-02-16 19:43:38 rhettinger set nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128690
2011-02-16 18:49:59 georg.brandl set status: open -> closed; messages: + msg128688; resolution: not a bug; nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni
2011-02-16 18:43:44 mark.dickinson set nosy: + mark.dickinson; messages: + msg128687
2011-02-16 13:51:27 Jonathan.Livni set nosy: georg.brandl, rhettinger, Jonathan.Livni; messages: + msg128667
2011-02-16 10:05:23 rhettinger set nosy: + rhettinger; messages: + msg128642
2011-02-16 08:51:55 Jonathan.Livni set nosy: georg.brandl, Jonathan.Livni; messages: + msg128631
2011-02-16 08:47:54 Jonathan.Livni set nosy: georg.brandl, Jonathan.Livni; messages: + msg128630
2011-02-16 08:40:00 georg.brandl set nosy: + georg.brandl; messages: + msg128629
2011-02-16 08:31:38 Jonathan.Livni create