msg65266 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-09 21:28 |
This patch makes this work:
>>> range(4) == range(4)
True
This is in similar fashion to dict.keys
>>> {1:2}.keys() == {1:2}.keys()
True
Tests included.
|
msg65268 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-09 22:04 |
Your patch does not check the return values of PyObject_RichCompare
calls for NULL. This is probably ok given the current restrictions on
the type of start/step/stop, but adding the null checks will make the
code more robust.
|
msg65269 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2008-04-09 22:07 |
I have no opinion whether the patch should go in, but in any case:
* You should use PyObject_RichCompareBool.
* You should error-check the returns of those.
|
msg65270 - (view) |
Author: Georg Brandl (georg.brandl) * |
Date: 2008-04-09 22:08 |
Ah yes, and variable declarations must be at the start of a block, else
MSVC (and possible other compilers) will complain.
|
msg65273 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-09 22:12 |
Actually, the patch contains an exploitable bug:
$ cat x.py
class X(int):
def __eq__(self, other):
raise ValueError
x = range(X(1),X(10),X(1))
x == x
$ ./python x.py
Segmentation fault (core dumped)
|
msg65278 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-09 22:21 |
On Wed, Apr 9, 2008 at 6:08 PM, Georg Brandl <report@bugs.python.org> wrote:
> Ah yes, and variable declarations must be at the start of a block
You can avoid the need for extra local variables by declaring
range_richcompare arguments as rangeobject* and casting the function
to richcmpfunc in type initialization. See for example
weakrefobject.c implementation.
|
msg65283 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-10 01:56 |
Thanks for the help.
|
msg65288 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-04-10 07:16 |
And, a __hash__ method should be added.
|
msg65291 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-10 13:06 |
On Thu, Apr 10, 2008 at 3:16 AM, Amaury Forgeot d'Arc
<report@bugs.python.org> wrote:
>
> And, a __hash__ method should be added.
>
See issue2268 for a slice.__hash__ implementation which should work
for range as well.
|
msg65318 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-10 20:40 |
Including __hash__ in this patch. Alexander, thanks; that could worked
perfectly.
|
msg65477 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-14 19:18 |
Once the review for this is completed I have no objection to it going in.
|
msg65520 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-15 15:47 |
The patch does not add unit tests for hash(range(..)). I would actually
test that hash(range(a,b,c)) == hash((a,b,c)) for various values of a,
b, and c.
A nit-pick: while I personally like the coding style with line breaks
before binary ops, python C style appears to be the opposite.
|
msg65524 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-15 18:01 |
Code review:
Objects/rangeobject.c:
line 259-260:
AFAIK register is completely useless in this day and age; drop it.
line 296 and further:
Please move the || and && operators to the end of the previous line.
line 313 etc:
This all fits on one line.
Line 319-320:
Ditto.
Line 361:
Please make the comment line up.
Lib/test/test_builtin.py:
I'd like to see another test indicating which of the following is true:
range(0, 11, 2) == range(0, 10, 2)
or
range(0, 11, 2) != range(0, 10, 2)
(since they produce the same sequence of values).
Also, add a test for __hash__() of a range().
|
msg65530 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-15 21:21 |
Thanks for the review! How does this look?
|
msg65531 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-15 21:36 |
Close. Just a few nits:
line 298:
These two fit on one line. Possibly the whole thing fits. If it doesn't,
the '{' should be on a line by itself.
(Are you aware of PEP 7, C code guidelines? I realize it's incomplete,
but you should still study it -- if you find things it's missing,
propose a patch to update it...!)
line 356:
Still not aligned.
Are you sure you're viewing this with a fixed-width font and tabs set to
8 spaces?
|
msg65532 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-04-15 21:41 |
The patch uses Py_SIZE(v) on a rangeobject?
I guess you borrowed too much from the tuple implementation ;-)
|
msg65541 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-16 02:21 |
Attaching a new patch. I'm not really sure what to do about the hash. I
just removed the Py_SIZE parts. I hope that's OK.
|
msg65543 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-16 02:50 |
On Tue, Apr 15, 2008 at 10:21 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
> I'm not really sure what to do about the hash. I
> just removed the Py_SIZE parts. I hope that's OK.
If you want to maintain hash(range(a,b,c)) == hash((a,b,c)) invariant,
you need to replace len with 3, not 0. While I think your latest
implementation is ok, reproducing tuple hash will make correctness
more obvious.
|
msg65546 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-16 11:58 |
Thanks Alexander. Trying again.
|
msg65555 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-16 17:48 |
With range_eq5.patch applied on an x86-64 Linux box:
$ ./python
Python 3.0a4+ (py3k:62359M, Apr 16 2008, 13:36:31)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hash(range(1,2,3))
-4094131973719604815
>>> hash(((1,2,3)))
2528502973977326415
Also, please fix indentation at Objects/rangeobject.c:271.
|
msg65556 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-16 18:09 |
range_eq5.patch fails to reproduce tuple hash because in the tuple hash
implementation the len variable varies between iterations over items. I
would just use literals for the three different values of mult rather
than copying mult += (long)(82520L + 3 + 3).
|
msg65565 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2008-04-16 19:40 |
Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ?
It'd be more logical to have hash(range(a,b,c)) ==
hash(tuple(range(a,b,c))) ... which is not the same thing at all :)
|
msg65566 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-16 20:16 |
On Wed, Apr 16, 2008 at 3:40 PM, Antoine Pitrou <report@bugs.python.org> wrote:
..
> Why would you want to have hash(range(a,b,c)) == hash((a,b,c)) ?
No particular reason other than that this is easy to test and gives
some assurance that hash values are reasonable.
> It'd be more logical to have hash(range(a,b,c)) ==
> hash(tuple(range(a,b,c))) ... which is not the same thing at all :)
No, this is not correct because range(..) == range(..) is not the same
as tuple(range(..)) == tuple(range(..)) in the proposed
implementation. See Guido's example above and Benjamin's test case in
the eq5 patch.
|
msg65569 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-17 01:24 |
When I was trying to get the hash to resemble tuple's, it occurred to
me: Why not just hash a tuple?
|
msg65571 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-17 03:11 |
On Wed, Apr 16, 2008 at 9:24 PM, Benjamin Peterson wrote:
..
> Why not just hash a tuple?
There are a few reasons, but neither is good enough to have another
round of code review :-)
1. It is strange to have the hash function allocate new objects. If
that was a type frequently used as a dict key, I would be concerned
about a possibility that dictionary lookup may trigger gc.
2. While reproducing hash(tuple) is a good starting point, there may
be a reason to choose different values for the magic constants.
3. If you don't want to mess with hash(tuple) complexity, a simple xor
of start/stop/step hashes (maybe with a check to prevent accidental -1
return) should be good enough.
|
msg65577 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-04-17 06:43 |
No need to get hung-up on the hash function. I can fix that up after a
checkin and use something simple like: hashvalue = (start*prime1+seqlen)
*prime2+step. That doesn't involve object creation and it produces the
same hash value for range(5,10,2) and range(5,9,2) which are equivalent.
|
msg65579 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-04-17 07:21 |
> It produces the same hash value for range(5,10,2) and range(5,9,2)
> which are equivalent.
If "equivalent" means "__eq__", they are not.
This does not invalidate your formula, of course: different objects may
have the same hash.
It is also simple to understand: just mix the numbers together.
|
msg65580 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2008-04-17 07:33 |
That was a typo. The pair was range(5,10,2)-->[5, 7, 9] and range
(5,11,2)-->[5, 7, 9]. The formula works because the it uses seqlen
instead of a stop value.
|
msg65594 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-17 22:07 |
Attaching a new patch with a few improvements. I tried to implemented
Raymond's hash function (I think this is how the math should be done.).
The rich compare function now short-circuits when a value isn't equal.
Also, I made ranges with the same set of integers equal even if the stop
is different.
|
msg65815 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-25 21:31 |
Comments?
|
msg65816 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * |
Date: 2008-04-25 21:57 |
about range_eq7.patch:
- Did you change your mind about range equality?
range(0,10,2) == range(0,9,2)
seems True now; it was not with range_eq6.patch
- The hash function will fail with big values (and wrongly returns a
value even when an exception is set). I suggest to call PyObject_Hash
instead of PyNumber_AsSsize_t.
- Now that you short-circuit the comparison, it is enough to have only
one boolean variable (is_equal), which may replace all uses of
start_same, stop_same and step_same.
|
msg65817 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-25 21:59 |
On Fri, Apr 25, 2008 at 5:31 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
> Comments?
In the range_hash function, len, start, step locals should be declared
Py_ssize_t, not long. Also, you can use range_length() instead of
PyObject_Size() and you need to clear error if you get len == -1.
See issue2690. With your patch,
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
but
True
You can avoid this problem by using range_length_obj instead of
PyObject_Size in range_richcompare.
|
msg65818 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-25 22:04 |
It looks like e-mail processor eats '>>>' examples. My examples were
>>> range(2**100) == range(2**100+1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
and
>>> range(2**100) == range(2**100)
True
|
msg65819 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-25 22:12 |
On Fri, Apr 25, 2008 at 5:57 PM, Amaury Forgeot d'Arc
<report@bugs.python.org> wrote:
..
> - Did you change your mind about range equality?
> range(0,10,2) == range(0,9,2)
> seems True now; it was not with range_eq6.patch
>
This makes me think: what would you say to an idea to normalize ranges
in constructor so that range(5,10,2) returns range(5,11,2).
|
msg65821 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-25 22:36 |
Thanks for the help.
Yes, after thinking for a while, I decided that range equality should
represent the set of integers and not the values in the constructor.
Normalization would be a good idea, but I think that's another issue
I'll tackle after this.
Now you get an error for hashing a huge range.
|
msg65822 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-25 22:38 |
On Fri, Apr 25, 2008 at 3:36 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
>
> Benjamin Peterson <musiccomposition@gmail.com> added the comment:
>
> Thanks for the help.
>
> Yes, after thinking for a while, I decided that range equality should
> represent the set of integers and not the values in the constructor.
> Normalization would be a good idea, but I think that's another issue
> I'll tackle after this.
The two go hand-in-hand; you shouldn't have two range() objects that
compare equal and yet have different stop attribute values.
> Now you get an error for hashing a huge range.
>
> Added file: http://bugs.python.org/file10110/range_eq8.patch
>
>
>
> __________________________________
> Tracker <report@bugs.python.org>
> <http://bugs.python.org/issue2603>
> __________________________________
>
|
msg65823 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-25 22:41 |
On Fri, Apr 25, 2008 at 5:38 PM, Guido van Rossum
<report@bugs.python.org> wrote:
>
> Guido van Rossum <guido@python.org> added the comment:
> The two go hand-in-hand; you shouldn't have two range() objects that
> compare equal and yet have different stop attribute values.
If it makes any difference, the attributes aren't even available through Python.
|
msg65824 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-04-25 23:10 |
On Fri, Apr 25, 2008 at 3:41 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
> If it makes any difference, the attributes aren't even available through Python.
But they are deducible via the str() or repr(). And IMO they *should*
be available.
I think I'd be okay with normalization on creation, so that range(0,
5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the
other way around?
|
msg65826 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-25 23:24 |
Here's a normalizing patch. It breaks the repr tests because the numbers
change.
|
msg65827 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-25 23:35 |
On Fri, Apr 25, 2008 at 7:10 PM, Guido van Rossum
<report@bugs.python.org> wrote:
..
> I think I'd be okay with normalization on creation, so that range(0,
> 5, 2) returns range(0, 6, 2). Hm, but isn't that odd? Why not the
> other way around?
I find it natural to have start + len*step = stop invariant rather
than start +(len-1)*step + 1 = stop. I may be influenced by C++ (STL)
tradition of giving preference to "i != stop" over "i < stop"
condition so that algorithms support iterators that are not ordered.
I also believe some algorithmic simplifications will be possible with
start + len*step = stop invariant.
|
msg65831 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-26 00:22 |
With respect to range_eq8_normalize2.patch, it is unusual to have a
function that consumes a reference to its argument. I would combine
normalize_stop with PyNumber_Index and make it similar to validate_step
with respect to reference counting.
Note that if you choose stop = start + len*step normaization, you will
not need to create 'one' in normalize_stop.
With your patch I see
>>> range(0,6,2)
range(0, 6, 2)
>>> range(0,5,2)
range(0, 5, 2)
I would expect one of these ranges normalized.
|
msg65832 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2008-04-26 00:37 |
... and your patch produces wrong results:
>>> list(range(5,0,-2)) # expected [5, 3, 1]
[5, 3]
See my patch in issue2690 for a way to compute length correctly in
range_new.
|
msg65833 - (view) |
Author: Benjamin Peterson (benjamin.peterson) * |
Date: 2008-04-26 02:55 |
I merged your range calculating code. After reading that bug report, I
think we need to start a discussion on python-dev about range size
constraints before moving forward any more. (We have people implementing
different things under different assumptions left and right.)
|
msg68261 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2008-06-16 03:25 |
Methinks this one is also better rejected. Please reopen if you feel
strongly.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:33 | admin | set | github: 46855 |
2008-06-16 03:25:16 | gvanrossum | set | status: open -> closed resolution: rejected messages:
+ msg68261 |
2008-04-26 17:06:34 | benjamin.peterson | set | files:
+ range_eq8_normalize5.patch |
2008-04-26 15:39:23 | benjamin.peterson | set | files:
+ range_eq8_normalize4.patch |
2008-04-26 02:55:51 | benjamin.peterson | set | files:
+ range_eq8_normalize3.patch messages:
+ msg65833 |
2008-04-26 00:37:12 | belopolsky | set | messages:
+ msg65832 |
2008-04-26 00:22:42 | belopolsky | set | messages:
+ msg65831 |
2008-04-25 23:42:00 | benjamin.peterson | set | files:
+ range_eq8_normalize2.patch |
2008-04-25 23:35:12 | belopolsky | set | messages:
+ msg65827 |
2008-04-25 23:24:33 | benjamin.peterson | set | files:
+ range_eq8_normalize.patch messages:
+ msg65826 |
2008-04-25 23:10:37 | gvanrossum | set | messages:
+ msg65824 |
2008-04-25 22:41:01 | benjamin.peterson | set | messages:
+ msg65823 |
2008-04-25 22:38:40 | gvanrossum | set | messages:
+ msg65822 |
2008-04-25 22:36:49 | benjamin.peterson | set | files:
+ range_eq8.patch messages:
+ msg65821 |
2008-04-25 22:12:23 | belopolsky | set | messages:
+ msg65819 |
2008-04-25 22:04:19 | belopolsky | set | messages:
+ msg65818 |
2008-04-25 21:59:39 | belopolsky | set | messages:
+ msg65817 |
2008-04-25 21:57:47 | amaury.forgeotdarc | set | messages:
+ msg65816 |
2008-04-25 21:31:03 | benjamin.peterson | set | messages:
+ msg65815 |
2008-04-17 22:07:16 | benjamin.peterson | set | files:
+ range_eq7.patch messages:
+ msg65594 |
2008-04-17 07:33:56 | rhettinger | set | messages:
+ msg65580 |
2008-04-17 07:21:17 | amaury.forgeotdarc | set | messages:
+ msg65579 |
2008-04-17 06:43:38 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg65577 |
2008-04-17 03:11:40 | belopolsky | set | messages:
+ msg65571 |
2008-04-17 01:24:29 | benjamin.peterson | set | files:
+ range_eq6.patch messages:
+ msg65569 |
2008-04-16 20:16:17 | belopolsky | set | messages:
+ msg65566 |
2008-04-16 19:40:10 | pitrou | set | nosy:
+ pitrou messages:
+ msg65565 |
2008-04-16 18:09:37 | belopolsky | set | messages:
+ msg65556 |
2008-04-16 17:48:19 | belopolsky | set | messages:
+ msg65555 |
2008-04-16 11:58:17 | benjamin.peterson | set | files:
+ range_eq5.patch messages:
+ msg65546 |
2008-04-16 02:50:48 | belopolsky | set | messages:
+ msg65543 |
2008-04-16 02:21:29 | benjamin.peterson | set | files:
+ range_eq4.patch messages:
+ msg65541 |
2008-04-15 21:41:51 | amaury.forgeotdarc | set | messages:
+ msg65532 |
2008-04-15 21:36:24 | gvanrossum | set | messages:
+ msg65531 |
2008-04-15 21:21:16 | benjamin.peterson | set | files:
+ range_eq_after_review.patch messages:
+ msg65530 |
2008-04-15 18:01:25 | gvanrossum | set | messages:
+ msg65524 |
2008-04-15 15:47:16 | belopolsky | set | messages:
+ msg65520 |
2008-04-14 19:18:59 | gvanrossum | set | nosy:
+ gvanrossum messages:
+ msg65477 |
2008-04-10 20:40:46 | benjamin.peterson | set | files:
+ range_eq3.patch messages:
+ msg65318 |
2008-04-10 13:06:09 | belopolsky | set | messages:
+ msg65291 |
2008-04-10 07:16:09 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages:
+ msg65288 |
2008-04-10 01:56:34 | benjamin.peterson | set | files:
+ range_eq2.patch messages:
+ msg65283 |
2008-04-09 22:21:28 | belopolsky | set | messages:
+ msg65278 |
2008-04-09 22:12:23 | belopolsky | set | messages:
+ msg65273 |
2008-04-09 22:08:12 | georg.brandl | set | messages:
+ msg65270 |
2008-04-09 22:07:30 | georg.brandl | set | nosy:
+ georg.brandl messages:
+ msg65269 |
2008-04-09 22:04:27 | belopolsky | set | nosy:
+ belopolsky messages:
+ msg65268 |
2008-04-09 21:28:07 | benjamin.peterson | create | |