Issue 2603: Make range __eq__ work

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/46855

classification

Title:	Make range __eq__ work
Type:	enhancement	Stage:
Components:	Interpreter Core	Versions:	Python 3.0

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	amaury.forgeotdarc, belopolsky, benjamin.peterson, georg.brandl, gvanrossum, pitrou, rhettinger
Priority:	normal	Keywords:	patch

Created on 2008-04-09 21:28 by benjamin.peterson, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
range_eq.patch	benjamin.peterson, 2008-04-09 21:28
range_eq2.patch	benjamin.peterson, 2008-04-10 01:56
range_eq3.patch	benjamin.peterson, 2008-04-10 20:40
range_eq_after_review.patch	benjamin.peterson, 2008-04-15 21:21
range_eq4.patch	benjamin.peterson, 2008-04-16 02:21
range_eq5.patch	benjamin.peterson, 2008-04-16 11:58
range_eq6.patch	benjamin.peterson, 2008-04-17 01:24
range_eq7.patch	benjamin.peterson, 2008-04-17 22:07
range_eq8.patch	benjamin.peterson, 2008-04-25 22:36
range_eq8_normalize.patch	benjamin.peterson, 2008-04-25 23:24
range_eq8_normalize2.patch	benjamin.peterson, 2008-04-25 23:41
range_eq8_normalize3.patch	benjamin.peterson, 2008-04-26 02:55
range_eq8_normalize4.patch	benjamin.peterson, 2008-04-26 15:39
range_eq8_normalize5.patch	benjamin.peterson, 2008-04-26 17:06

Messages (44)
msg65266 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-09 21:28
This patch makes this work: >>> range(4) == range(4) True This is in similar fashion to dict.keys >>> {1:2}.keys() == {1:2}.keys() True Tests included.
msg65268 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-09 22:04
Your patch does not check the return values of PyObject_RichCompare calls for NULL. This is probably ok given the current restrictions on the type of start/step/stop, but adding the null checks will make the code more robust.
msg65269 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-04-09 22:07
I have no opinion whether the patch should go in, but in any case: * You should use PyObject_RichCompareBool. * You should error-check the returns of those.
msg65270 - (view)	Author: Georg Brandl (georg.brandl) *	Date: 2008-04-09 22:08
Ah yes, and variable declarations must be at the start of a block, else MSVC (and possible other compilers) will complain.
msg65273 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-09 22:12
Actually, the patch contains an exploitable bug: $ cat x.py class X(int): def __eq__(self, other): raise ValueError x = range(X(1),X(10),X(1)) x == x $ ./python x.py Segmentation fault (core dumped)
msg65278 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-09 22:21
On Wed, Apr 9, 2008 at 6:08 PM, Georg Brandl <report@bugs.python.org> wrote: > Ah yes, and variable declarations must be at the start of a block You can avoid the need for extra local variables by declaring range_richcompare arguments as rangeobject* and casting the function to richcmpfunc in type initialization. See for example weakrefobject.c implementation.
msg65283 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-10 01:56
Thanks for the help.
msg65288 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-04-10 07:16
And, a __hash__ method should be added.
msg65291 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-10 13:06
On Thu, Apr 10, 2008 at 3:16 AM, Amaury Forgeot d'Arc <report@bugs.python.org> wrote: > > And, a __hash__ method should be added. > See issue2268 for a slice.__hash__ implementation which should work for range as well.
msg65318 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-10 20:40
Including __hash__ in this patch. Alexander, thanks; that could worked perfectly.
msg65477 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-04-14 19:18
Once the review for this is completed I have no objection to it going in.
msg65520 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-15 15:47
The patch does not add unit tests for hash(range(..)). I would actually test that hash(range(a,b,c)) == hash((a,b,c)) for various values of a, b, and c. A nit-pick: while I personally like the coding style with line breaks before binary ops, python C style appears to be the opposite.
msg65524 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-04-15 18:01
Code review: Objects/rangeobject.c: line 259-260: AFAIK register is completely useless in this day and age; drop it. line 296 and further: Please move the \|\| and && operators to the end of the previous line. line 313 etc: This all fits on one line. Line 319-320: Ditto. Line 361: Please make the comment line up. Lib/test/test_builtin.py: I'd like to see another test indicating which of the following is true: range(0, 11, 2) == range(0, 10, 2) or range(0, 11, 2) != range(0, 10, 2) (since they produce the same sequence of values). Also, add a test for __hash__() of a range().
msg65530 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-15 21:21
Thanks for the review! How does this look?
msg65531 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-04-15 21:36
Close. Just a few nits: line 298: These two fit on one line. Possibly the whole thing fits. If it doesn't, the '{' should be on a line by itself. (Are you aware of PEP 7, C code guidelines? I realize it's incomplete, but you should still study it -- if you find things it's missing, propose a patch to update it...!) line 356: Still not aligned. Are you sure you're viewing this with a fixed-width font and tabs set to 8 spaces?
msg65532 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-04-15 21:41
The patch uses Py_SIZE(v) on a rangeobject? I guess you borrowed too much from the tuple implementation ;-)
msg65541 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-16 02:21
Attaching a new patch. I'm not really sure what to do about the hash. I just removed the Py_SIZE parts. I hope that's OK.
msg65543 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-16 02:50
On Tue, Apr 15, 2008 at 10:21 PM, Benjamin Peterson <report@bugs.python.org> wrote: > I'm not really sure what to do about the hash. I > just removed the Py_SIZE parts. I hope that's OK. If you want to maintain hash(range(a,b,c)) == hash((a,b,c)) invariant, you need to replace len with 3, not 0. While I think your latest implementation is ok, reproducing tuple hash will make correctness more obvious.
msg65546 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-16 11:58
Thanks Alexander. Trying again.
msg65555 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-16 17:48
With range_eq5.patch applied on an x86-64 Linux box: $ ./python Python 3.0a4+ (py3k:62359M, Apr 16 2008, 13:36:31) [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> hash(range(1,2,3)) -4094131973719604815 >>> hash(((1,2,3))) 2528502973977326415 Also, please fix indentation at Objects/rangeobject.c:271.
msg65556 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-16 18:09
range_eq5.patch fails to reproduce tuple hash because in the tuple hash implementation the len variable varies between iterations over items. I would just use literals for the three different values of mult rather than copying mult += (long)(82520L + 3 + 3).
msg65565 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2008-04-16 19:40
Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ? It'd be more logical to have hash(range(a,b,c)) == hash(tuple(range(a,b,c))) ... which is not the same thing at all :)
msg65566 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-16 20:16
On Wed, Apr 16, 2008 at 3:40 PM, Antoine Pitrou <report@bugs.python.org> wrote: .. > Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ? No particular reason other than that this is easy to test and gives some assurance that hash values are reasonable. > It'd be more logical to have hash(range(a,b,c)) == > hash(tuple(range(a,b,c))) ... which is not the same thing at all :) No, this is not correct because range(..) == range(..) is not the same as tuple(range(..)) == tuple(range(..)) in the proposed implementation. See Guido's example above and Benjamin's test case in the eq5 patch.
msg65569 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-17 01:24
When I was trying to get the hash to resemble tuple's, it occurred to me: Why not just hash a tuple?
msg65571 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-17 03:11
On Wed, Apr 16, 2008 at 9:24 PM, Benjamin Peterson wrote: .. > Why not just hash a tuple? There are a few reasons, but neither is good enough to have another round of code review :-) 1. It is strange to have the hash function allocate new objects. If that was a type frequently used as a dict key, I would be concerned about a possibility that dictionary lookup may trigger gc. 2. While reproducing hash(tuple) is a good starting point, there may be a reason to choose different values for the magic constants. 3. If you don't want to mess with hash(tuple) complexity, a simple xor of start/stop/step hashes (maybe with a check to prevent accidental -1 return) should be good enough.
msg65577 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-04-17 06:43
No need to get hung-up on the hash function. I can fix that up after a checkin and use something simple like: hashvalue = (startprime1+seqlen) prime2+step. That doesn't involve object creation and it produces the same hash value for range(5,10,2) and range(5,9,2) which are equivalent.
msg65579 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-04-17 07:21
> It produces the same hash value for range(5,10,2) and range(5,9,2) > which are equivalent. If "equivalent" means "__eq__", they are not. This does not invalidate your formula, of course: different objects may have the same hash. It is also simple to understand: just mix the numbers together.
msg65580 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2008-04-17 07:33
That was a typo. The pair was range(5,10,2)-->[5, 7, 9] and range (5,11,2)-->[5, 7, 9]. The formula works because the it uses seqlen instead of a stop value.
msg65594 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-17 22:07
Attaching a new patch with a few improvements. I tried to implemented Raymond's hash function (I think this is how the math should be done.). The rich compare function now short-circuits when a value isn't equal. Also, I made ranges with the same set of integers equal even if the stop is different.
msg65815 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-25 21:31
Comments?
msg65816 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-04-25 21:57
about range_eq7.patch: - Did you change your mind about range equality? range(0,10,2) == range(0,9,2) seems True now; it was not with range_eq6.patch - The hash function will fail with big values (and wrongly returns a value even when an exception is set). I suggest to call PyObject_Hash instead of PyNumber_AsSsize_t. - Now that you short-circuit the comparison, it is enough to have only one boolean variable (is_equal), which may replace all uses of start_same, stop_same and step_same.
msg65817 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-25 21:59
On Fri, Apr 25, 2008 at 5:31 PM, Benjamin Peterson <report@bugs.python.org> wrote: > Comments? In the range_hash function, len, start, step locals should be declared Py_ssize_t, not long. Also, you can use range_length() instead of PyObject_Size() and you need to clear error if you get len == -1. See issue2690. With your patch, Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t but True You can avoid this problem by using range_length_obj instead of PyObject_Size in range_richcompare.
msg65818 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-25 22:04
It looks like e-mail processor eats '>>>' examples. My examples were >>> range(2100) == range(2100+1) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: Python int too large to convert to C ssize_t and >>> range(2100) == range(2100) True
msg65819 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-25 22:12
On Fri, Apr 25, 2008 at 5:57 PM, Amaury Forgeot d'Arc <report@bugs.python.org> wrote: .. > - Did you change your mind about range equality? > range(0,10,2) == range(0,9,2) > seems True now; it was not with range_eq6.patch > This makes me think: what would you say to an idea to normalize ranges in constructor so that range(5,10,2) returns range(5,11,2).
msg65821 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-25 22:36
Thanks for the help. Yes, after thinking for a while, I decided that range equality should represent the set of integers and not the values in the constructor. Normalization would be a good idea, but I think that's another issue I'll tackle after this. Now you get an error for hashing a huge range.
msg65822 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-04-25 22:38
On Fri, Apr 25, 2008 at 3:36 PM, Benjamin Peterson <report@bugs.python.org> wrote: > > Benjamin Peterson <musiccomposition@gmail.com> added the comment: > > Thanks for the help. > > Yes, after thinking for a while, I decided that range equality should > represent the set of integers and not the values in the constructor. > Normalization would be a good idea, but I think that's another issue > I'll tackle after this. The two go hand-in-hand; you shouldn't have two range() objects that compare equal and yet have different stop attribute values. > Now you get an error for hashing a huge range. > > Added file: http://bugs.python.org/file10110/range_eq8.patch > > > > __________________________________ > Tracker <report@bugs.python.org> > <http://bugs.python.org/issue2603> > __________________________________ >
msg65823 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-25 22:41
On Fri, Apr 25, 2008 at 5:38 PM, Guido van Rossum <report@bugs.python.org> wrote: > > Guido van Rossum <guido@python.org> added the comment: > The two go hand-in-hand; you shouldn't have two range() objects that > compare equal and yet have different stop attribute values. If it makes any difference, the attributes aren't even available through Python.
msg65824 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-04-25 23:10
On Fri, Apr 25, 2008 at 3:41 PM, Benjamin Peterson <report@bugs.python.org> wrote: > If it makes any difference, the attributes aren't even available through Python. But they are deducible via the str() or repr(). And IMO they should be available. I think I'd be okay with normalization on creation, so that range(0, 5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the other way around?
msg65826 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-25 23:24
Here's a normalizing patch. It breaks the repr tests because the numbers change.
msg65827 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-25 23:35
On Fri, Apr 25, 2008 at 7:10 PM, Guido van Rossum <report@bugs.python.org> wrote: .. > I think I'd be okay with normalization on creation, so that range(0, > 5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the > other way around? I find it natural to have start + lenstep = stop invariant rather than start +(len-1)step + 1 = stop. I may be influenced by C++ (STL) tradition of giving preference to "i != stop" over "i < stop" condition so that algorithms support iterators that are not ordered. I also believe some algorithmic simplifications will be possible with start + len*step = stop invariant.
msg65831 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-26 00:22
With respect to range_eq8_normalize2.patch, it is unusual to have a function that consumes a reference to its argument. I would combine normalize_stop with PyNumber_Index and make it similar to validate_step with respect to reference counting. Note that if you choose stop = start + len*step normaization, you will not need to create 'one' in normalize_stop. With your patch I see >>> range(0,6,2) range(0, 6, 2) >>> range(0,5,2) range(0, 5, 2) I would expect one of these ranges normalized.
msg65832 - (view)	Author: Alexander Belopolsky (belopolsky) *	Date: 2008-04-26 00:37
... and your patch produces wrong results: >>> list(range(5,0,-2)) # expected [5, 3, 1] [5, 3] See my patch in issue2690 for a way to compute length correctly in range_new.
msg65833 - (view)	Author: Benjamin Peterson (benjamin.peterson) *	Date: 2008-04-26 02:55
I merged your range calculating code. After reading that bug report, I think we need to start a discussion on python-dev about range size constraints before moving forward any more. (We have people implementing different things under different assumptions left and right.)
msg68261 - (view)	Author: Guido van Rossum (gvanrossum) *	Date: 2008-06-16 03:25
Methinks this one is also better rejected. Please reopen if you feel strongly.

History
Date	User	Action	Args
2022-04-11 14:56:33	admin	set	github: 46855
2008-06-16 03:25:16	gvanrossum	set	status: open -> closed resolution: rejected messages: + msg68261
2008-04-26 17:06:34	benjamin.peterson	set	files: + range_eq8_normalize5.patch
2008-04-26 15:39:23	benjamin.peterson	set	files: + range_eq8_normalize4.patch
2008-04-26 02:55:51	benjamin.peterson	set	files: + range_eq8_normalize3.patch messages: + msg65833
2008-04-26 00:37:12	belopolsky	set	messages: + msg65832
2008-04-26 00:22:42	belopolsky	set	messages: + msg65831
2008-04-25 23:42:00	benjamin.peterson	set	files: + range_eq8_normalize2.patch
2008-04-25 23:35:12	belopolsky	set	messages: + msg65827
2008-04-25 23:24:33	benjamin.peterson	set	files: + range_eq8_normalize.patch messages: + msg65826
2008-04-25 23:10:37	gvanrossum	set	messages: + msg65824
2008-04-25 22:41:01	benjamin.peterson	set	messages: + msg65823
2008-04-25 22:38:40	gvanrossum	set	messages: + msg65822
2008-04-25 22:36:49	benjamin.peterson	set	files: + range_eq8.patch messages: + msg65821
2008-04-25 22:12:23	belopolsky	set	messages: + msg65819
2008-04-25 22:04:19	belopolsky	set	messages: + msg65818
2008-04-25 21:59:39	belopolsky	set	messages: + msg65817
2008-04-25 21:57:47	amaury.forgeotdarc	set	messages: + msg65816
2008-04-25 21:31:03	benjamin.peterson	set	messages: + msg65815
2008-04-17 22:07:16	benjamin.peterson	set	files: + range_eq7.patch messages: + msg65594
2008-04-17 07:33:56	rhettinger	set	messages: + msg65580
2008-04-17 07:21:17	amaury.forgeotdarc	set	messages: + msg65579
2008-04-17 06:43:38	rhettinger	set	nosy: + rhettinger messages: + msg65577
2008-04-17 03:11:40	belopolsky	set	messages: + msg65571
2008-04-17 01:24:29	benjamin.peterson	set	files: + range_eq6.patch messages: + msg65569
2008-04-16 20:16:17	belopolsky	set	messages: + msg65566
2008-04-16 19:40:10	pitrou	set	nosy: + pitrou messages: + msg65565
2008-04-16 18:09:37	belopolsky	set	messages: + msg65556
2008-04-16 17:48:19	belopolsky	set	messages: + msg65555
2008-04-16 11:58:17	benjamin.peterson	set	files: + range_eq5.patch messages: + msg65546
2008-04-16 02:50:48	belopolsky	set	messages: + msg65543
2008-04-16 02:21:29	benjamin.peterson	set	files: + range_eq4.patch messages: + msg65541
2008-04-15 21:41:51	amaury.forgeotdarc	set	messages: + msg65532
2008-04-15 21:36:24	gvanrossum	set	messages: + msg65531
2008-04-15 21:21:16	benjamin.peterson	set	files: + range_eq_after_review.patch messages: + msg65530
2008-04-15 18:01:25	gvanrossum	set	messages: + msg65524
2008-04-15 15:47:16	belopolsky	set	messages: + msg65520
2008-04-14 19:18:59	gvanrossum	set	nosy: + gvanrossum messages: + msg65477
2008-04-10 20:40:46	benjamin.peterson	set	files: + range_eq3.patch messages: + msg65318
2008-04-10 13:06:09	belopolsky	set	messages: + msg65291
2008-04-10 07:16:09	amaury.forgeotdarc	set	nosy: + amaury.forgeotdarc messages: + msg65288
2008-04-10 01:56:34	benjamin.peterson	set	files: + range_eq2.patch messages: + msg65283
2008-04-09 22:21:28	belopolsky	set	messages: + msg65278
2008-04-09 22:12:23	belopolsky	set	messages: + msg65273
2008-04-09 22:08:12	georg.brandl	set	messages: + msg65270
2008-04-09 22:07:30	georg.brandl	set	nosy: + georg.brandl messages: + msg65269
2008-04-09 22:04:27	belopolsky	set	nosy: + belopolsky messages: + msg65268
2008-04-09 21:28:07	benjamin.peterson	create