This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Make range __eq__ work
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.0
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, belopolsky, benjamin.peterson, georg.brandl, gvanrossum, pitrou, rhettinger
Priority: normal Keywords: patch

Created on 2008-04-09 21:28 by benjamin.peterson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
range_eq.patch benjamin.peterson, 2008-04-09 21:28
range_eq2.patch benjamin.peterson, 2008-04-10 01:56
range_eq3.patch benjamin.peterson, 2008-04-10 20:40
range_eq_after_review.patch benjamin.peterson, 2008-04-15 21:21
range_eq4.patch benjamin.peterson, 2008-04-16 02:21
range_eq5.patch benjamin.peterson, 2008-04-16 11:58
range_eq6.patch benjamin.peterson, 2008-04-17 01:24
range_eq7.patch benjamin.peterson, 2008-04-17 22:07
range_eq8.patch benjamin.peterson, 2008-04-25 22:36
range_eq8_normalize.patch benjamin.peterson, 2008-04-25 23:24
range_eq8_normalize2.patch benjamin.peterson, 2008-04-25 23:41
range_eq8_normalize3.patch benjamin.peterson, 2008-04-26 02:55
range_eq8_normalize4.patch benjamin.peterson, 2008-04-26 15:39
range_eq8_normalize5.patch benjamin.peterson, 2008-04-26 17:06
Messages (44)
msg65266 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-09 21:28
This patch makes this work:
>>> range(4) == range(4)
True

This is in similar fashion to dict.keys
>>> {1:2}.keys() == {1:2}.keys()
True

Tests included.
msg65268 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 22:04
Your patch does not check the return values of PyObject_RichCompare
calls for NULL.  This is probably ok given the current restrictions on
the type of start/step/stop, but adding the null checks will make the
code more robust.
msg65269 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-04-09 22:07
I have no opinion whether the patch should go in, but in any case:

* You should use PyObject_RichCompareBool.
* You should error-check the returns of those.
msg65270 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-04-09 22:08
Ah yes, and variable declarations must be at the start of a block, else
MSVC (and possible other compilers) will complain.
msg65273 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 22:12
Actually, the patch contains an exploitable bug:

$ cat x.py
class X(int):
    def __eq__(self, other):
        raise ValueError

x = range(X(1),X(10),X(1))
x == x

$ ./python x.py
Segmentation fault (core dumped)
msg65278 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-09 22:21
On Wed, Apr 9, 2008 at 6:08 PM, Georg Brandl <report@bugs.python.org> wrote:

>  Ah yes, and variable declarations must be at the start of a block

You can avoid the need for extra local variables by declaring
range_richcompare arguments as rangeobject* and casting the function
to richcmpfunc in type initialization.  See for example
weakrefobject.c implementation.
msg65283 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-10 01:56
Thanks for the help.
msg65288 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-04-10 07:16
And, a __hash__ method should be added.
msg65291 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-10 13:06
On Thu, Apr 10, 2008 at 3:16 AM, Amaury Forgeot d'Arc
<report@bugs.python.org> wrote:
>
>  And, a __hash__ method should be added.
>

See issue2268 for a slice.__hash__ implementation which should work
for range as well.
msg65318 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-10 20:40
Including __hash__ in this patch. Alexander, thanks; that could worked
perfectly.
msg65477 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-14 19:18
Once the review for this is completed I have no objection to it going in.
msg65520 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-15 15:47
The patch does not add unit tests for hash(range(..)).  I would actually 
test that hash(range(a,b,c)) == hash((a,b,c)) for various values of a, 
b,  and c.

A nit-pick: while I personally like the coding style with line breaks 
before binary ops, python C style appears to be the opposite.
msg65524 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-15 18:01
Code review:

Objects/rangeobject.c:

line 259-260:
AFAIK register is completely useless in this day and age; drop it.

line 296 and further:
Please move the || and && operators to the end of the previous line.

line 313 etc:
This all fits on one line.

Line 319-320:
Ditto.

Line 361:
Please make the comment line up.

Lib/test/test_builtin.py:
I'd like to see another test indicating which of the following is true:

range(0, 11, 2) == range(0, 10, 2)

or

range(0, 11, 2) != range(0, 10, 2)

(since they produce the same sequence of values).

Also, add a test for __hash__() of a range().
msg65530 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-15 21:21
Thanks for the review! How does this look?
msg65531 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-15 21:36
Close.  Just a few nits:

line 298:
These two fit on one line. Possibly the whole thing fits. If it doesn't,
the '{' should be on a line by itself.

(Are you aware of PEP 7, C code guidelines? I realize it's incomplete,
but you should still study it -- if you find things it's missing,
propose a patch to update it...!)

line 356:
Still not aligned.

Are you sure you're viewing this with a fixed-width font and tabs set to
8 spaces?
msg65532 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-04-15 21:41
The patch uses Py_SIZE(v) on a rangeobject?
I guess you borrowed too much from the tuple implementation ;-)
msg65541 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-16 02:21
Attaching a new patch. I'm not really sure what to do about the hash. I
just removed the Py_SIZE parts. I hope that's OK.
msg65543 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-16 02:50
On Tue, Apr 15, 2008 at 10:21 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
>  I'm not really sure what to do about the hash. I
>  just removed the Py_SIZE parts. I hope that's OK.

If you want to maintain hash(range(a,b,c)) == hash((a,b,c)) invariant,
you need to replace len with 3, not 0.  While I think your latest
implementation is ok, reproducing tuple hash will make correctness
more obvious.
msg65546 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-16 11:58
Thanks Alexander. Trying again.
msg65555 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-16 17:48
With range_eq5.patch applied on an x86-64 Linux box:

$ ./python 
Python 3.0a4+ (py3k:62359M, Apr 16 2008, 13:36:31) 
[GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hash(range(1,2,3))
-4094131973719604815
>>> hash(((1,2,3)))
2528502973977326415


Also, please fix indentation at Objects/rangeobject.c:271.
msg65556 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-16 18:09
range_eq5.patch fails to reproduce tuple hash because in the tuple hash
implementation the len variable varies between iterations over items. I
would just use literals for the three different values of mult rather
than copying mult += (long)(82520L + 3 + 3).
msg65565 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-04-16 19:40
Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ?
It'd be more logical to have hash(range(a,b,c)) ==
hash(tuple(range(a,b,c))) ... which is not the same thing at all :)
msg65566 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-16 20:16
On Wed, Apr 16, 2008 at 3:40 PM, Antoine Pitrou <report@bugs.python.org> wrote:
..
>  Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ?

No particular reason other than that this is easy to test and gives
some assurance that hash values are reasonable.

>  It'd be more logical to have hash(range(a,b,c)) ==
>  hash(tuple(range(a,b,c))) ... which is not the same thing at all :)

No, this is not correct because range(..) == range(..) is not the same
as tuple(range(..)) == tuple(range(..)) in the proposed
implementation.  See Guido's example above and Benjamin's test case in
the eq5 patch.
msg65569 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-17 01:24
When I was trying to get the hash to resemble tuple's, it occurred to
me: Why not just hash a tuple?
msg65571 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-17 03:11
On Wed, Apr 16, 2008 at 9:24 PM, Benjamin Peterson wrote:
..
> Why not just hash a tuple?

There are a few reasons, but neither is good enough to have another
round of code review :-)

1. It is strange to have the hash function allocate new objects.  If
that was a type frequently used as a dict key, I would be concerned
about a possibility that dictionary lookup may trigger gc.

2. While reproducing hash(tuple) is a good starting point, there may
be a reason to choose different values for the magic constants.

3. If you don't want to mess with hash(tuple) complexity, a simple xor
of start/stop/step hashes (maybe with a check to prevent accidental -1
return) should be good enough.
msg65577 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-04-17 06:43
No need to get hung-up on the hash function.  I can fix that up after a 
checkin and use something simple like: hashvalue = (start*prime1+seqlen)
*prime2+step.  That doesn't involve object creation and it produces the 
same hash value for range(5,10,2) and range(5,9,2) which are equivalent.
msg65579 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-04-17 07:21
> It produces the same hash value for range(5,10,2) and range(5,9,2) 
> which are equivalent.
If "equivalent" means "__eq__", they are not.

This does not invalidate your formula, of course: different objects may
have the same hash.
It is also simple to understand: just mix the numbers together.
msg65580 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-04-17 07:33
That was a typo.  The pair was range(5,10,2)-->[5, 7, 9] and range
(5,11,2)-->[5, 7, 9].  The formula works because the it uses seqlen 
instead of a stop value.
msg65594 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-17 22:07
Attaching a new patch with a few improvements. I tried to implemented
Raymond's hash function (I think this is how the math should be done.).
The rich compare function now short-circuits when a value isn't equal.
Also, I made ranges with the same set of integers equal even if the stop
is different.
msg65815 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-25 21:31
Comments?
msg65816 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-04-25 21:57
about range_eq7.patch:

- Did you change your mind about range equality?
  range(0,10,2) == range(0,9,2)
seems True now; it was not with range_eq6.patch

- The hash function will fail with big values (and wrongly returns a
value even when an exception is set). I suggest to call PyObject_Hash
instead of PyNumber_AsSsize_t.

- Now that you short-circuit the comparison, it is enough to have only
one boolean variable (is_equal), which may replace all uses of
start_same, stop_same and step_same.
msg65817 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-25 21:59
On Fri, Apr 25, 2008 at 5:31 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
>  Comments?

In the range_hash function, len, start, step locals should be declared
Py_ssize_t, not long. Also, you can use range_length() instead of
PyObject_Size() and you need to clear error if you get len == -1.

See issue2690. With your patch,

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t

but

True

You can avoid this problem by using range_length_obj instead of
PyObject_Size in range_richcompare.
msg65818 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-25 22:04
It looks like e-mail processor eats '>>>' examples.  My examples were

>>> range(2**100) == range(2**100+1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t

and

>>> range(2**100) == range(2**100)
True
msg65819 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-25 22:12
On Fri, Apr 25, 2008 at 5:57 PM, Amaury Forgeot d'Arc
<report@bugs.python.org> wrote:
..
>  - Did you change your mind about range equality?
>   range(0,10,2) == range(0,9,2)
>  seems True now; it was not with range_eq6.patch
>

This makes me think: what would you say to an idea to normalize ranges
in constructor so that range(5,10,2) returns range(5,11,2).
msg65821 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-25 22:36
Thanks for the help.

Yes, after thinking for a while, I decided that range equality should
represent the set of integers and not the values in the constructor.
Normalization would be a good idea, but I think that's another issue
I'll tackle after this.

Now you get an error for hashing a huge range.
msg65822 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-25 22:38
On Fri, Apr 25, 2008 at 3:36 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
>
>  Benjamin Peterson <musiccomposition@gmail.com> added the comment:
>
>  Thanks for the help.
>
>  Yes, after thinking for a while, I decided that range equality should
>  represent the set of integers and not the values in the constructor.
>  Normalization would be a good idea, but I think that's another issue
>  I'll tackle after this.

The two go hand-in-hand; you shouldn't have two range() objects that
compare equal and yet have different stop attribute values.

>  Now you get an error for hashing a huge range.
>
>  Added file: http://bugs.python.org/file10110/range_eq8.patch
>
>
>
>  __________________________________
>  Tracker <report@bugs.python.org>
>  <http://bugs.python.org/issue2603>
>  __________________________________
>
msg65823 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-25 22:41
On Fri, Apr 25, 2008 at 5:38 PM, Guido van Rossum
<report@bugs.python.org> wrote:
>
>  Guido van Rossum <guido@python.org> added the comment:
>  The two go hand-in-hand; you shouldn't have two range() objects that
>  compare equal and yet have different stop attribute values.

If it makes any difference, the attributes aren't even available through Python.
msg65824 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-04-25 23:10
On Fri, Apr 25, 2008 at 3:41 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
>  If it makes any difference, the attributes aren't even available through Python.

But they are deducible via the str() or repr(). And IMO they *should*
be available.

I think I'd be okay with normalization on creation, so that range(0,
5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the
other way around?
msg65826 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-25 23:24
Here's a normalizing patch. It breaks the repr tests because the numbers
change.
msg65827 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-25 23:35
On Fri, Apr 25, 2008 at 7:10 PM, Guido van Rossum
<report@bugs.python.org> wrote:
..
>  I think I'd be okay with normalization on creation, so that range(0,
>  5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the
>  other way around?

I find it natural to have start + len*step  = stop invariant rather
than start +(len-1)*step + 1 = stop.  I may be influenced by C++ (STL)
tradition of giving preference to "i != stop" over "i < stop"
condition so that algorithms support iterators that are not ordered.
I also believe some algorithmic simplifications will be possible with
 start + len*step  = stop invariant.
msg65831 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-26 00:22
With respect to range_eq8_normalize2.patch, it is unusual to have a 
function that consumes a reference to its argument.  I would combine 
normalize_stop with PyNumber_Index and make it similar to validate_step 
with respect to reference counting.

Note that if you choose stop = start + len*step normaization, you will 
not need to create 'one' in normalize_stop.

With your patch I see

>>> range(0,6,2)
range(0, 6, 2)
>>> range(0,5,2)
range(0, 5, 2)

I would expect one of these ranges normalized.
msg65832 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-04-26 00:37
... and your patch produces wrong results:

>>> list(range(5,0,-2)) # expected [5, 3, 1]
[5, 3]

See my patch in issue2690 for a way to compute length correctly in 
range_new.
msg65833 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-04-26 02:55
I merged your range calculating code. After reading that bug report, I
think we need to start a discussion on python-dev about range size
constraints before moving forward any more. (We have people implementing
different things under different assumptions left and right.)
msg68261 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2008-06-16 03:25
Methinks this one is also better rejected.  Please reopen if you feel
strongly.
History
Date User Action Args
2022-04-11 14:56:33adminsetgithub: 46855
2008-06-16 03:25:16gvanrossumsetstatus: open -> closed
resolution: rejected
messages: + msg68261
2008-04-26 17:06:34benjamin.petersonsetfiles: + range_eq8_normalize5.patch
2008-04-26 15:39:23benjamin.petersonsetfiles: + range_eq8_normalize4.patch
2008-04-26 02:55:51benjamin.petersonsetfiles: + range_eq8_normalize3.patch
messages: + msg65833
2008-04-26 00:37:12belopolskysetmessages: + msg65832
2008-04-26 00:22:42belopolskysetmessages: + msg65831
2008-04-25 23:42:00benjamin.petersonsetfiles: + range_eq8_normalize2.patch
2008-04-25 23:35:12belopolskysetmessages: + msg65827
2008-04-25 23:24:33benjamin.petersonsetfiles: + range_eq8_normalize.patch
messages: + msg65826
2008-04-25 23:10:37gvanrossumsetmessages: + msg65824
2008-04-25 22:41:01benjamin.petersonsetmessages: + msg65823
2008-04-25 22:38:40gvanrossumsetmessages: + msg65822
2008-04-25 22:36:49benjamin.petersonsetfiles: + range_eq8.patch
messages: + msg65821
2008-04-25 22:12:23belopolskysetmessages: + msg65819
2008-04-25 22:04:19belopolskysetmessages: + msg65818
2008-04-25 21:59:39belopolskysetmessages: + msg65817
2008-04-25 21:57:47amaury.forgeotdarcsetmessages: + msg65816
2008-04-25 21:31:03benjamin.petersonsetmessages: + msg65815
2008-04-17 22:07:16benjamin.petersonsetfiles: + range_eq7.patch
messages: + msg65594
2008-04-17 07:33:56rhettingersetmessages: + msg65580
2008-04-17 07:21:17amaury.forgeotdarcsetmessages: + msg65579
2008-04-17 06:43:38rhettingersetnosy: + rhettinger
messages: + msg65577
2008-04-17 03:11:40belopolskysetmessages: + msg65571
2008-04-17 01:24:29benjamin.petersonsetfiles: + range_eq6.patch
messages: + msg65569
2008-04-16 20:16:17belopolskysetmessages: + msg65566
2008-04-16 19:40:10pitrousetnosy: + pitrou
messages: + msg65565
2008-04-16 18:09:37belopolskysetmessages: + msg65556
2008-04-16 17:48:19belopolskysetmessages: + msg65555
2008-04-16 11:58:17benjamin.petersonsetfiles: + range_eq5.patch
messages: + msg65546
2008-04-16 02:50:48belopolskysetmessages: + msg65543
2008-04-16 02:21:29benjamin.petersonsetfiles: + range_eq4.patch
messages: + msg65541
2008-04-15 21:41:51amaury.forgeotdarcsetmessages: + msg65532
2008-04-15 21:36:24gvanrossumsetmessages: + msg65531
2008-04-15 21:21:16benjamin.petersonsetfiles: + range_eq_after_review.patch
messages: + msg65530
2008-04-15 18:01:25gvanrossumsetmessages: + msg65524
2008-04-15 15:47:16belopolskysetmessages: + msg65520
2008-04-14 19:18:59gvanrossumsetnosy: + gvanrossum
messages: + msg65477
2008-04-10 20:40:46benjamin.petersonsetfiles: + range_eq3.patch
messages: + msg65318
2008-04-10 13:06:09belopolskysetmessages: + msg65291
2008-04-10 07:16:09amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg65288
2008-04-10 01:56:34benjamin.petersonsetfiles: + range_eq2.patch
messages: + msg65283
2008-04-09 22:21:28belopolskysetmessages: + msg65278
2008-04-09 22:12:23belopolskysetmessages: + msg65273
2008-04-09 22:08:12georg.brandlsetmessages: + msg65270
2008-04-09 22:07:30georg.brandlsetnosy: + georg.brandl
messages: + msg65269
2008-04-09 22:04:27belopolskysetnosy: + belopolsky
messages: + msg65268
2008-04-09 21:28:07benjamin.petersoncreate