This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: PEP 357 --- adding nb_index
Type: Stage:
Components: Interpreter Core Versions: Python 2.5
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: gvanrossum Nosy List: gvanrossum, loewis, nnorwitz, teoliphant
Priority: normal Keywords: patch

Created on 2006-02-22 03:35 by teoliphant, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
patch_against_42553.txt teoliphant, 2006-02-22 17:51
patch_against_42808.txt gvanrossum, 2006-03-03 05:12 Guido's version - cleanup, TPFLAGS
patch_against_42810.txt gvanrossum, 2006-03-03 17:45 Added flag test to ISINDEX() in ceval.c
patch_against_42871.txt teoliphant, 2006-03-06 18:27 Simplify code. Fix overflow for __index__ with longintegers.
patch_against_42877.txt teoliphant, 2006-03-07 04:01 Add test_index, fix (2**100).__index__(), and style improvements
Messages (19)
msg49572 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-02-22 03:35
See PEP 357 for details.
msg49573 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-02-22 07:33
Logged In: YES 
user_id=33168

Travis, I briefly looked over the patches and have a few
comments.  It's easier if there's one big patch that
contains all the code, tests, and doc.  I noticed a second
patch that contains the tests, but I didn't see any doc updates.

I noticed some formatting which is not consistent with the
style of the surrounding code (I think).  Stuff like
n=var->field (no spaces) and

 if (XXX)   dosomething();
 else return 5;

Most places in the python code check for == NULL rather than
 the implicit check for 0.  Most places in code don't do
assignment in if (cond), though this last one is more variable.

I'll try to look this over in more detail soon.  But it
probably won't be until PyCon.  If you're there, maybe we
can discuss in person.  Also, if you have further PEP
updates, feel free to send them to me and I can check them in.
msg49574 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-02-22 08:03
Logged In: YES 
user_id=33168

I see one problem in Modules/operator.c.  This code:

+	if (i==-1 && PyErr_Occurred()) return NULL;
+	else return PyInt_FromLong((long)i);

Should be:

+	if (i == -1 && PyErr_Occurred())
+               return NULL;
+	return PyInt_FromSsizeT(i);

There was a discussion on python-checkins recently about
this.  The short answer is that if Py_ssize_t's are cast to
a long, that will lose precision on Win64 (size_ts are
64-bits, but longs are still 32).  Since i is Py_ssize_t,
the cast would lose the high order values (above 2G).

In Objects/stringobject.c:
-			i += PyString_GET_SIZE(self);
-		return string_item(self,i);
+			i += PyList_GET_SIZE(self);
+		return string_item(self, i);

I don't see how that (PyList) can be correct, self is a
string object.

I would change this code:
+	lenfunc func;
+	PyNumberMethods *nb;
+	if ((nb=item->ob_type->tp_as_number) &&	\
+	    (func=nb->nb_index)) {				
+		Py_ssize_t i = func(item);

to:
+	PyNumberMethods *nb = item->ob_type->tp_as_number;
+	if (nb != NULL && nb->nb_index != NULL) {		+		Py_ssize_t i
= nb->nb_index(item);

Let the compiler do the work.  We can optimize later if this
is a problem.  I think the code is clearer this way and is
more similar to the rest of the code.  I don't care as much
about the use of func here as the change to nb = ...  If you
want to keep the local variable, I would prefer a better
name like nb_index instead of func.

There's no need for a \ after &&.

Most of these comments apply to Objects/listobject.c,
Objects/tupleobject.c, Objects/unicodeobject.c, PyNumber_Index.

In Objects/classobject.c, you don't check that indexstr is
non-NULL after assignment.  In instance_index(), remove the
cast (int) of the return result outcome.

In slot_nb_index() remove . from end of exception string. 
Note: you don't need the Py_DECREF() and return -1 in the
last if as it would be the same if it fell through.  Use
whichever version you think is clearer.  Either way is fine
with me.

In Include/abstract.h, do all the comments still say a C
integer is returned even though a Py_ssize_t is returned? 
If not, can you update your comment too.
msg49575 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-02-22 17:52
Logged In: YES 
user_id=5296

I followed Neal's comments and updated a new patch.  The new
patch contains the tests and the documentation updates as well. 
msg49576 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-03 01:16
Logged In: YES 
user_id=6380

I promise I'll review this further tomorrow!

Feedback so far:

- compilation error with svn HEAD on abstract.c:650

- all the uses of the nb_index slot must check for some
Py_TPFLAGS_... bit before using that slot (see object.h)

- the comment in ceval.c about the clipping of values <
-PY_SSIZE_T_MAX is incorrect; they are boosted to
-PY_SSIZE_T_MAX, not to zero
msg49577 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-03 04:32
Logged In: YES 
user_id=6380

Travis -- I'm working on a new patch. Please don't work on
one at the same time for the new few hours!  (Until midnight
PST.)
msg49578 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-03 05:12
Logged In: YES 
user_id=6380

OK, Travis, here's a new patch.  It corrects the issues I
mentioned before, and does some whitespace cleanup (like
what Neal mentioned).  The biggest change is definitely the
check for Py_TPFLAGS_HAVE_INDEX whenever the nb_index slot
is referenced.

You don't have to do anything; but if you do have more
changes, please throw away your old changes, sync to the
head, and apply my patch, before doing any more work.
msg49579 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-03 17:08
Logged In: YES 
user_id=5296

I think there is still a need for a HASINDEX check in
the ISINDEX definition in ceval.c
msg49580 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-03 17:45
Logged In: YES 
user_id=6380

Totally right.

I have one more concern:

>>> import sys
>>> (sys.maxint+1).__index__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int
>>> 

whiile I agree that the C API ought to silently truncate
values to fit within the boundaries established by ssize_t,
I think that when calling __index__() from Python, it should
be allowed to return an unbounded value.

Anyway, I haven't had time to whip up code to implement
that, but here's a new patch that adds the test to ISINDEX()
-- thanks for catching that!
msg49581 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-03-03 17:45
Logged In: YES 
user_id=33168

The doc needs \versionadded tags.  I can add those after
checkin.

Should sequences in Modules/ be upgraded too?  mmap, array,
various db modules, bufferobject, rangeobject.

Travis already mentioned the TP_FLAGS_HAVE_INDEX check in
ceval.c.
msg49582 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-03-06 01:50
Logged In: YES 
user_id=21627

In Objects/classobject.c, instance_index should be cast to
lenfunc, not inquiry. Otherwise, it looks fine.
msg49583 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-06 10:27
Logged In: YES 
user_id=5296

Here's a new patch that does not raise overflow error for
(sys.maxint+1).__index__

This patch also cleans up the apply_slice code so to exhibit
the same behavior because it uses the same code.  This has
one minor difference in that -PY_SSIZE_T_MAX-1 (isn't this
correct?) is used instead of -PY_SSIZE_T_MAX when clipping
negative numbers. 

Also, versionadded tags were created and applied, and the
arraymodule was updated to use __index__ for slicing.
msg49584 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-06 10:27
Logged In: YES 
user_id=5296

Here's a new patch that does not raise overflow error for
(sys.maxint+1).__index__

This patch also cleans up the apply_slice code so to exhibit
the same behavior because it uses the same code.  This has
one minor difference in that -PY_SSIZE_T_MAX-1 (isn't this
correct?) is used instead of -PY_SSIZE_T_MAX when clipping
negative numbers. 

Also, versionadded tags were created and applied, and the
arraymodule was updated to use __index__ for slicing.
msg49585 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-06 18:27
Logged In: YES 
user_id=5296

This patch does not contain the bug in mmap that the last
one did. All regression tests for Linux pass.
msg49586 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-06 19:36
Logged In: YES 
user_id=6380

I'm sorry, but this still doesn't do what I think __index__
ought to do for a long: it ought to return self unchanged,
even if it's out of range for ssize_t.  IOW I want
(2**100).__index__() to return 2**100, not 2**31-1.  The
clipping should be done only on the conversion to ssize_t in
nb_index.

I reviewed the code that was new since my last patch.  Apart
from the above objection, it all looks good except:

- You seem to have lost the diffs for test_index.py; if you
do "svn add Lib/test/test_index.py" it will show up.

- Style nit: you don't have to use \ to split C lines
*unless* it's a multi-line CPP macro definition.

- Another style nit: try to put spaces around the ==
operator (and other comparisons).  i==-1 looks like
line-noise to me; i == -1 looks much better.

- _PyLong_AsSize_t() and long_index check for
PyErr_Occurred() after calling _long_as_ssize_t(); they
should really cal PyErr_Clear() before calling that, since
otherwise they can be fooled by pre-existing errors. 
(Clearing pre-existing errors is *also* bad, but they are
really a symptom of something bad already going on; the
general rule is to pre-clear errors *if* you're going to
check for PyErr_Occurred() instead of checking for error
return values (NULL or -1).
msg49587 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-07 04:01
Logged In: YES 
user_id=5296

I changed the wrap_index function so that in Pyhton the
__index__ method for long returns the object itself should
it be too large for a Py_ssizet.  I'm sorry I didn't
understand what you wanted before. 

* added Lib/test/test_index.py
* removed '\' used to split C-lines (where I could find them)
* put spaces around '==' sign
* check for an error at the start of _long_as_ssize_t so
that both _PyLong_AsSize_t() and long_index return an
error if one was pre-existing.  That way the errors are
carefully controlled in _long_assize_t and different things
are done (ignoring overflow or raising an error) depending
on the what calls _long_as_ssize_t.
msg49588 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-07 04:57
Logged In: YES 
user_id=6380

You know, I can't figure out why the code in wrap_index()
has the desired effect, but it does...  Somehow it looks
suspiciously, as if it could sometimes return a non-integral
self, but I can't come up with a test case where that
happens, so I think it's okay.

I think this is good to go now; I'll sleep on it for a night
and then check it in.

Thanks for all your efforts, Travis!
msg49589 - (view) Author: Travis Oliphant (teoliphant) * (Python committer) Date: 2006-03-07 15:48
Logged In: YES 
user_id=5296

Guido is right. 

The wrap_index function should probably contain a check for
Int or Long before checking for overflow.  Otherwise, it
would be possible to define a type in C that when it returns
PY_SSIZE_T_MAX for the nb_index method it's __index__() call
from Python will return itself (i.e. a situation exists
where __index__() does not return an int or a long). 

I'm thinking:

if ((PyInt_Check(self) || PyLong_Check(self)) &&
    ((res == PY_SSIZE_T_MAX) || (res == -PY_SSIZE_T_MAX-1)) {
...
}

The reason this works is because the nb_index method returns
PY_SSIZE_T_MAX or -PY_SSIZE_T_MAX-1 on overflow (without
setting an error).  We just check for those cases and return
self if they show up because it might be an overflow
condition.  Even if it isn't an overflow, it will still be
fine because self is either PY_SSIZE_T_MAX or
-PY_SSIZE_T_MAX-1 so returning self will be the same integer
as PyInt_FromSsize_t will return (except it could be a long
integer instead of an integer). 


msg49590 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2006-03-07 18:54
Logged In: YES 
user_id=6380

Checked in as 42899 and 42900 (Misc/NEWS).
History
Date User Action Args
2022-04-11 14:56:15adminsetgithub: 42935
2006-02-22 03:35:47teoliphantcreate