msg128628 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-16 08:31 |
all( (x<=y) for x,y in zip(L, L[1:]) )
all([(x<=y) for x,y in zip(L, L[1:])])
Both lines of code above check if L is a non-decreasing list. Both should return the same results. But under some conditions, they don't. I've encountered this with a list of Decimal numbers.
This is 100% reproducible on my Win7 64-bit, vanilla Python 2.6.6 32-bit setup; alas, I cannot share the specific code that generates this difference.
See attached screenshot from Eclipse Pydev debugger.
|
msg128629 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2011-02-16 08:40 |
It's not easy to reproduce this without the full list of decimals.
Do you have a nonstandard decimal Context set?
What is the result if you put the LC into a function, i.e.
def f(L):
    return [(x<=y) for x,y in zip(L, L[1:])]
print all(f(L))
|
msg128630 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-16 08:47 |
The exact list of decimals doesn't help. I took the list and tried to reproduce the bug with the following short script, but the problem did not reproduce:
from decimal import Decimal
L = [Decimal('6.700'), Decimal('6.800'), Decimal('7.140'), Decimal('7.460'), Decimal('7.735'), Decimal('8.160'), Decimal('8.280'), Decimal('8.355'), Decimal('8.710'), Decimal('9.640'), Decimal('10.155'), Decimal('10.460'), Decimal('10.810'), Decimal('11.875'), Decimal('12.310'), Decimal('12.315'), Decimal('13.250'), Decimal('13.205'), Decimal('13.750'), Decimal('14.245'), Decimal('14.805'), Decimal('15.385'), Decimal('15.955'), Decimal('16.365'), Decimal('16.960'), Decimal('17.500'), Decimal('19.445')]
print all(x<=y for x, y in zip(L, L[1:]))
The script above rightfully printed False.
The decimal list above was taken from the pydev debugger session where I found the bug.
In the original script I do not mess around with Decimal at all. I just cast to and from float and use simple arithmetic with it.
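As a minimal sketch (written for Python 3, whereas the thread uses Python 2.6), the two forms from the original report can be run side by side on a short excerpt of the list above. The excerpt and the variable names `gen_result` and `list_result` are chosen here for illustration only. With the builtin `all`, both forms are expected to agree:

```python
from decimal import Decimal

# Short excerpt of the list from the report; 13.250 -> 13.205 is the decreasing pair.
L = [Decimal('12.315'), Decimal('13.250'), Decimal('13.205'), Decimal('13.750')]

# Generator-expression form and list-comprehension form of the same check.
gen_result = all(x <= y for x, y in zip(L, L[1:]))
list_result = all([x <= y for x, y in zip(L, L[1:])])

print(gen_result, list_result)
```

If the two values ever differ, the difference has to come from somewhere other than these two expressions, for example from the name `all` not referring to the builtin.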
|
msg128631 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-16 08:51 |
Another note - the original problematic code looks like this:
def non_decreasing(L):
    return all(x<=y for x, y in zip(L, L[1:]))
Changing it to:
def non_decreasing(L):
    def f(L):
        return [x<=y for x, y in zip(L, L[1:])]
    return all(f(L))
also worked around the bug.
|
msg128642 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2011-02-16 10:05 |
This is an interesting puzzle. In both cases, the zip() function is called and runs to completion before either the list comprehension or genexp is started or called. The code for all() is somewhat simple -- it iterates over the input and tests whether the value is true. That is also the same in both.
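For reference, the behaviour of the builtin all() described here can be sketched in pure Python, following the equivalent given in the library documentation. The name `py_all` and the sample `pairs` are illustrative assumptions, not part of the report:

```python
def py_all(iterable):
    """Pure-Python equivalent of the builtin all()."""
    for element in iterable:
        if not element:
            return False  # stop at the first false element
    return True  # also the result for an empty iterable


pairs = [(1, 2), (2, 3), (3, 1)]

# The same loop runs whether the argument is a generator or a list.
from_genexp = py_all(x <= y for x, y in pairs)
from_listcomp = py_all([x <= y for x, y in pairs])
print(from_genexp, from_listcomp)
```

Because the loop only iterates and tests truth values, it has no way to tell a generator expression from a list that yields the same elements.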
One essential difference between the two, then, is that the x, y variables get exposed in the list comprehension but not in the genexp. The only way I can see to get the two to evaluate differently is to mutate the exposed variables before the comparison:
>>> from decimal import *
>>> L = list(map(Decimal, '6.700 6.800 7.140 7.460 7.735'.split()))
>>> def f(z):
...     global y
...     y -= 100
...     return z
...
>>> all([(f(x)<=y) for x, y in zip(L, L[1:])])
False
>>> all((f(x)<=y) for x, y in zip(L, L[1:]))
True
I don't see how that mutation could happen in your functions unless Decimal has been subclassed to override its __le__ method.
Another way to get a midstream mutation is for L to change in mid-computation in multi-threaded code. Is your example single threaded? Is the debugger affecting the run in some way?
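The scoping difference above is specific to Python 2, where list-comprehension variables leak into the enclosing scope. A rough way to check what a given interpreter does is to run each form in a fresh namespace and look for leaked names. This is a sketch for Python 3; the namespace names are made up for illustration:

```python
# Run each form in its own fresh namespace so leaked names are easy to spot.
listcomp_ns = {}
exec("result = all([x <= y for x, y in zip([1, 2], [2, 3])])", listcomp_ns)

genexp_ns = {}
exec("result = all(x <= y for x, y in zip([1, 2], [2, 3]))", genexp_ns)

# True means the loop variables escaped into the enclosing namespace.
listcomp_leaks = "x" in listcomp_ns or "y" in listcomp_ns
genexp_leaks = "x" in genexp_ns or "y" in genexp_ns
print(listcomp_leaks, genexp_leaks)
```

On Python 3 neither form should leak `x` or `y`, so the mutation trick shown above no longer distinguishes them; on Python 2 the list comprehension is the one expected to leak.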
The disassembly shows 1) when zip is called, 2) whether x,y are exposed, and 3) whether a list is being iterated or the genexp:
>>> from dis import dis
>>> dis(compile('all((x<=y) for x, y in zip(a, b))', '', 'eval'))
  1           0 LOAD_NAME                0 (all)
              3 LOAD_CONST               0 (<code object <genexpr> at 0x16b3f50, file "", line 1>)
              6 MAKE_FUNCTION            0
              9 LOAD_NAME                1 (zip)
             12 LOAD_NAME                2 (a)
             15 LOAD_NAME                3 (b)
             18 CALL_FUNCTION            2
             21 GET_ITER
             22 CALL_FUNCTION            1
             25 CALL_FUNCTION            1
             28 RETURN_VALUE
>>> dis(compile('all([(x<=y) for x, y in zip(a, b)])', '', 'eval'))
  1           0 LOAD_NAME                0 (all)
              3 BUILD_LIST               0
              6 DUP_TOP
              7 STORE_NAME               1 (_[1])
             10 LOAD_NAME                2 (zip)
             13 LOAD_NAME                3 (a)
             16 LOAD_NAME                4 (b)
             19 CALL_FUNCTION            2
             22 GET_ITER
        >>   23 FOR_ITER                25 (to 51)
             26 UNPACK_SEQUENCE          2
             29 STORE_NAME               5 (x)
             32 STORE_NAME               6 (y)
             35 LOAD_NAME                1 (_[1])
             38 LOAD_NAME                5 (x)
             41 LOAD_NAME                6 (y)
             44 COMPARE_OP               1 (<=)
             47 LIST_APPEND
             48 JUMP_ABSOLUTE           23
        >>   51 DELETE_NAME              1 (_[1])
             54 CALL_FUNCTION            1
             57 RETURN_VALUE
Nothing else interesting pops out.
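The first point, that the zip call happens before the genexp body starts running, can also be checked without reading bytecode. The sketch below (Python 3) uses a hypothetical wrapper, `tracked_zip`, that records when it is called; it is only an illustration and not part of the original report:

```python
calls = []


def tracked_zip(a, b):
    """Hypothetical wrapper that records the moment zip is called."""
    calls.append("zip")
    return zip(a, b)


# Creating the generator expression already evaluates the outermost iterable.
gen = (x <= y for x, y in tracked_zip([1, 2, 3], [2, 3, 4]))
calls_before_iteration = list(calls)  # snapshot taken before any item is consumed

result = all(gen)
print(calls_before_iteration, result)
```

The snapshot is taken before `all` consumes anything, so a non-empty snapshot means the outermost iterable was evaluated at the point where the generator expression was created.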
One question out of curiosity. In the JPG file that is attached, the return type is listed as bool_ instead of bool. Is that normal for the Eclipse debugger's value display?
|
msg128667 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-16 13:51 |
The script I used is single-file, single-threaded code, but it uses Django's ORM to get the data from a MySQL database.
I've reduced the code path to this:
import sys,os
sys.path.append(os.path.dirname(os.getcwdu()))
os.environ['DJANGO_SETTINGS_MODULE']='my_app.settings'
from django.core.management import setup_environ
from my_app import settings
setup_environ(settings)
from my_app.convert.models import SomeModel
from operator import itemgetter
from decimal import Decimal
from collections import defaultdict  # needed for defaultdict(list) below
def non_decreasing(L):
    return all(x<=y for x, y in zip(L, L[1:]))
raw_data = SomeModel.objects.filter(the_date=the_date,col1__gt=Decimal('0.2'),col2__gt=Decimal('0.2'),col3__gt=0,col4__gt=0,col5__gte=2).order_by('col6','col7','col8').values_list('col6','col7','col8','col1','col3','col2','col4')
data=defaultdict(list)
for d in raw_data:
    data[d[0],d[1]].append(d[2:])
for (exp,t),d in data.iteritems():
    col8s = map(itemgetter(0),d)
    mids = [(x[3]+x[4])/Decimal('2.0') for x in d]
    if not non_decreasing(mids):
        raise Exception
|
msg128687 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2011-02-16 18:43 |
Are you positive that your 'all' is the builtin Python 'all'? NumPy's 'all' function would behave the way you describe:
>>> all(x < 3 for x in range(5))
False
>>> from numpy import all
>>> all(x < 3 for x in range(5))
True
What does all.__module__ give?
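A small diagnostic along these lines can be sketched as follows (Python 3, where the builtin reports `builtins` as its module; Python 2 reports `__builtin__`). The helper `describe_all` and the stand-in `shadow_all` are hypothetical names introduced here. `shadow_all` is not NumPy's implementation; it only mimics a function that tests the truth value of its argument without iterating it:

```python
import builtins


def describe_all(func):
    """Return (module name, whether func is the real builtin all)."""
    return getattr(func, "__module__", None), func is builtins.all


def shadow_all(a):
    """Hypothetical replacement for all(); NOT NumPy's code.

    It never iterates its argument, so any generator object counts as true.
    """
    return bool(a)


print(describe_all(all))         # the name 'all' as currently bound
print(describe_all(shadow_all))  # a function that merely looks like all()

builtin_answer = all(x < 3 for x in range(5))
shadow_answer = shadow_all(x < 3 for x in range(5))
print(builtin_answer, shadow_answer)
```

A generator object is always truthy, so any replacement that does not iterate its argument reports True for a generator expression while still behaving sensibly for lists, which matches the symptom in the report.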
|
msg128688 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2011-02-16 18:49 |
And voila:
>>> from numpy import bool_
>>> bool_
<type 'numpy.bool_'>
Case closed, I guess :)
|
msg128690 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2011-02-16 19:43 |
Numpy tried to frame Python's innocent all() function. The crime was almost perfect, but the forensic evidence showed that the real culprit had left behind a telltale underscore after the bool.
Another lurid bug report brought to justice :-)
|
msg128695 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-16 20:40 |
from pylab import *
There lies the rub?
|
msg128715 - (view) |
Author: Jonathan Livni (Jonathan.Livni) |
Date: 2011-02-17 09:14 |
Let my foolishness be a reminder to all not to use "from [module] import *"
After saying that - I believe overloading a built in Python function in a popular package\module is a mistake!
I still don't know if pylab's all() is erroneous or if it's correct functionality. I'll open a ticket there.
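To illustrate how a star import can silently rebind a builtin name, here is a self-contained sketch for Python 3. It builds a throwaway module called `fake_pylab` (a made-up name that stands in for any package exporting its own `all`) instead of importing the real pylab:

```python
import builtins
import sys
import types

# Build a throwaway module (hypothetical name) that exports its own all().
fake = types.ModuleType("fake_pylab")
exec("def all(a):\n    return bool(a)\n", fake.__dict__)
sys.modules["fake_pylab"] = fake

# A star import copies every public name, including 'all', into the namespace.
star_ns = {}
exec("from fake_pylab import *", star_ns)
star_shadows_builtin = star_ns["all"] is not builtins.all

# A plain import keeps the module's names behind the module object.
explicit_ns = {}
exec("import fake_pylab", explicit_ns)
explicit_keeps_builtin = "all" not in explicit_ns

print(star_shadows_builtin, explicit_keeps_builtin)
```

With the plain import, the replacement is only reachable as `fake_pylab.all`, so an unqualified `all(...)` still resolves to the builtin.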
|
msg128730 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2011-02-17 16:07 |
> After saying that - I believe overloading a built in Python function in
> a popular package\module is a mistake!
I believe NumPy had 'any' and 'all' *before* Python did. :-)
|
Date | User | Action | Args
2022-04-11 14:57:12 | admin | set | github: 55430
2011-02-17 16:07:54 | mark.dickinson | set | nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128730
2011-02-17 09:14:53 | Jonathan.Livni | set | nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128715
2011-02-16 20:40:12 | Jonathan.Livni | set | nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128695
2011-02-16 19:43:38 | rhettinger | set | nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni; messages: + msg128690
2011-02-16 18:49:59 | georg.brandl | set | status: open -> closed; messages: + msg128688; resolution: not a bug; nosy: georg.brandl, rhettinger, mark.dickinson, Jonathan.Livni
2011-02-16 18:43:44 | mark.dickinson | set | nosy: + mark.dickinson; messages: + msg128687
2011-02-16 13:51:27 | Jonathan.Livni | set | nosy: georg.brandl, rhettinger, Jonathan.Livni; messages: + msg128667
2011-02-16 10:05:23 | rhettinger | set | nosy: + rhettinger; messages: + msg128642
2011-02-16 08:51:55 | Jonathan.Livni | set | nosy: georg.brandl, Jonathan.Livni; messages: + msg128631
2011-02-16 08:47:54 | Jonathan.Livni | set | nosy: georg.brandl, Jonathan.Livni; messages: + msg128630
2011-02-16 08:40:00 | georg.brandl | set | nosy: + georg.brandl; messages: + msg128629
2011-02-16 08:31:38 | Jonathan.Livni | create |