classification
Title: PyNumber_Index() is not int-subclass friendly (or operator.index() docos lie)
Type: behavior Stage: patch review
Components: Interpreter Core Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ethan.furman Nosy List: Arfrever, alex, barry, docs@python, eric.snow, haypo, mark.dickinson, mjacob, ncoghlan, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2013-03-29 22:25 by barry, last changed 2015-07-21 07:29 by ethan.furman.

Files
File name Uploaded Description Edit
issue17576.patch mark.dickinson, 2013-08-04 08:56 review
issue17576_v2.patch mark.dickinson, 2013-08-04 10:07 review
issue17576_v3.patch serhiy.storchaka, 2013-12-10 11:47 review
issue17576_v4.patch serhiy.storchaka, 2013-12-11 20:02 review
Messages (38)
msg185522 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-03-29 22:25
operator.index() is just a thin wrapper around PyNumber_Index().  The documentation for operator.index() claims that it is equivalent to calling obj.__index__() but for subclasses of int, this is not true.  In fact, PyNumber_Index() first does (e.g. in Python 3.3) a PyLong_Check() and if that succeeds, the original object is returned *without* doing the moral equivalent in C of calling obj.__index__().  An example:

class myint(int):
    def __index__(self):
        return int(self) + 1

>>> x = myint(7)
>>> x.__index__()
8
>>> from operator import index
>>> index(x)
7

The C API documents PyNumber_Index() as: "Returns the o converted to a Python int on success or NULL with a TypeError exception raised on failure."

Because this has been the behavior of PyNumber_Index() since at least 2.7 (I didn't check farther back), this probably cannot be classified as a bug deserving to be fixed in the code for older Pythons.  It might be worth fixing for Python 3.4, i.e. by moving the index check before the type check.  In the meantime, this is probably a documentation bug.

The C API implies, but should be clearer that if o is an int subtype (int and long in Python 2), it is returned unchanged.  The operator.index() documentation should be amended to describe this behavior for int/long subclasses.

A different alternative would be to leave PyNumber_Index() unchanged, but with the doco fix, and to augment operator.index() to do the PyIndex_Check() first, before calling PyNumber_Index().  That's a little more redundant, but would provide the documented behavior without changing the C API.
msg185523 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-03-29 22:29
You also end up with this nice bit of inconsistency:

>>> x = myint(7)
>>> from operator import index
>>> range(10)[6:x]
range(6, 7)
>>> range(10)[6:x.__index__()]
range(6, 8)
>>> range(10)[6:index(x)]
range(6, 7)
>>> 

Granted, it's insane to have __index__() return a different value like this, but in my specific use case, it's the type of object returned from operator.index() that's the problem.  operator.index() returns the subclass instance while obj.__index__() returns the int.

(The use case is the IntEnum of PEP 435.)
msg185530 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-03-30 00:29
Would it be okay to do a check on __index__ after the PyLong_Check() succeeds?  Something like this:

    if (PyLong_Check(item) &&
        item->ob_type->tp_as_number->nb_index == PyLong_Type.tp_as_number->nb_index) {
        Py_INCREF(item);
        return item;
    }

This is something Nick and I were talking about at the sprints regarding fast paths in the abstract API (for mappings and sequences in our case).
msg185531 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2013-03-30 00:36
In my opinion that should use PyLong_CheckExact
msg185532 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2013-03-30 00:39
On Mar 30, 2013, at 12:29 AM, Eric Snow wrote:

>Would it be okay to do a check on __index__ after the PyLong_Check()
>succeeds?  Something like this:
>
>    if (PyLong_Check(item) &&
>        item->ob_type->tp_as_number->nb_index == PyLong_Type.tp_as_number->nb_index) {
>        Py_INCREF(item);
>        return item;
>    }
>
>This is something Nick and I were talking about at the sprints regarding fast
>paths in the abstract API (for mappings and sequences in our case).

I think that would work, yes.  With this extra check, overriding __index__()
in the subclass should fail this condition and fall back to the
PyIndex_Check() clause.
msg188791 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-05-09 22:09
Alex> In my opinion that should use PyLong_CheckExact

+1
msg188810 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-10 06:03
if (PyLong_CheckExact(item) || (PyLong_Check(item) &&
    item->ob_type->tp_as_number->nb_index == PyLong_Type.tp_as_number->nb_index))
msg194335 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-04 08:56
See the related python-dev discussion started by Mark Shannon here:

http://mail.python.org/pipermail/python-dev/2013-March/125022.html

and continuing well into April here:

http://mail.python.org/pipermail/python-dev/2013-April/125042.html

The consensus that emerged from that thread seems to be that calls to operator.index and to int() should always return something of exact type int.

The attached patch:

- Raises TypeError for implicit calls to nb_int that fail to return something of exact type int.  (Results of direct calls to __int__ are not checked.)

- Ensures that *all* conversions from a non-int to an int via nb_int make use of the nb_int slot, even for int subclasses.  Prior to this patch, some of the PyLong_As... functions would bypass __int__ for int subclasses.

- Adds a new private _PyLong_FromNbInt function to Objects/longobject.c, so that we have a single place for performing these conversions and making type checks, and refactors existing uses of the nb_int slot to go via this function.

- Makes corresponding changes for nb_index, which should address the original bug report.

I guess this may be too dangerous a change for Python 3.4.  In that case, I propose raising warnings instead of TypeErrors for Python 3.4 and turning those into TypeErrors in Python 3.5.

One other question:  should direct calls to __int__ and __index__ also have their return values type-checked?  That doesn't seem to happen at the moment for other magic methods (see below), so it would seem consistent to only do the type checking for interpreter-generated implicit calls to __int__ and __index__.  Nick: any opinion?

>>> class A:
...     def __len__(self): return "a string"
...     def __bool__(self): return "another string"
... 
>>> a = A()
>>> a.__len__()
'a string'
>>> a.__bool__()
'another string'
>>> len(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer
>>> bool(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bool__ should return bool, returned str
msg194337 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-04 10:07
New patch that replaces the TypeErrors with warnings and fixes a refleak in the original patch.
msg194347 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-08-04 11:29
The deprecation warning version looks good to me.

Something I'll mention explicitly (regarding the PyCon discussions that Eric mentioned above), is that we unfortunately couldn't do something like this for the various concrete APIs with overly permissive subclass checks. For those APIs, calling them directly was often the *right* thing for simple subtypes implemented in C to use to call up to the parent implementation.

This case is different, as it's the *abstract* APIs that currently have the overly permissive checks.
msg195713 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-20 19:52
Shouldn't it be PendingDeprecationWarning?
msg195747 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-21 05:47
> Shouldn't it be PendingDeprecationWarning?

Hmm.  Possibly.  I'm not sure what the policy is any more regarding DeprecationWarning versus PendingDeprecationWarning.  Nick?
msg195759 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-21 09:33
Yet some nitpicks.

Currently the code of _PyLong_FromNbInt() is inlined and the do_decref flag is used to prevent needless change refcounts of int objects (see also issue18797). In proposed patch common code is extracted into the _PyLong_FromNbInt() function and int objects increfed and decrefed. Doesn't it affect a performance? PyLong_As* functions used in arguments parsing in a lot of builtins and their .performance is important.

If the patch slowdowns PyLong_As* functions we perhaps should check PyLong_CheckExact() before calling _PyLong_FromNbInt() and use the do_decref flag.

In general the idea and the patch LGTM.
msg195764 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-21 10:22
On PyPy 1.8.0 operator.index(True) returns 1.
msg195772 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-21 11:32
And yet one nitpick. For int subclasses which doesn't overload the __int__ method the patch calls default int.__int__ which creates a copy of int object. This is redundant in PyLong_As* functions because they only extract C int value and drop Python int object. So we can use int subclass object itself as good as int object.
msg195824 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-08-21 21:32
On 21 Aug 2013 15:47, "Mark Dickinson" <report@bugs.python.org> wrote:
>
>
> Mark Dickinson added the comment:
>
> > Shouldn't it be PendingDeprecationWarning?
>
> Hmm.  Possibly.  I'm not sure what the policy is any more regarding
DeprecationWarning versus PendingDeprecationWarning.  Nick?

Not sure if this is written down anywhere, but I use
PendingDeprecationWarning for "definitely still around next release, may
not have a set date for removal" and DeprecationWarning for "may be removed
next release".
msg200171 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2013-10-17 22:37
Where do we stand with this issue?
msg200229 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-10-18 08:00
I still need to act on some of Serhiy's comments.  I do plan to get this in for 3.4.
msg205733 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-09 19:30
Ping.
msg205735 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-12-09 19:42
> Ping.

Bah.  Sorry; I haven't had time to deal with this.  Serhiy: are you interested in taking over?
msg205793 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-10 11:47
Here is updated patch. There is no more overhead in PyLong_As* functions. Simplified PyNumber_Index(). assertWarns() now used instead of support.check_warnings(). Added new tests.
msg205802 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-12-10 12:09
Took me a while to figure out that one of the code paths was being deleted as redundant because the type machinery will always fill in nb_int for int subclasses, but Serhiy's patch looks good to me.
msg205856 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-12-10 20:38
Thanks, Serhiy.
msg205862 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-10 21:53
What about PyLong_AsSsize_t(), PyLong_AsUnsignedLong(), and PyLong_AsSize_t()? They are ignore __int__().

PyLong_AsVoidPtr() calls PyLong_AsLong(), PyLong_AsUnsignedLong(), PyLong_AsLongLong(), or PyLong_AsUnsignedLongLong() (depending on pointer's size and it's sign) and therefore can call or not call __int__, can raise or not raise TypeError on non-int subclasses with defined __int__().
msg205870 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-12-10 23:11
The ssize_t functions deliberately ignore lossy int conversions (e.g. from
floats) - that's why they only work on types that implement __index__.
msg205889 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 07:34
> The ssize_t functions deliberately ignore lossy int conversions (e.g. from
> floats) - that's why they only work on types that implement __index__.

They ignore __index__.
msg205890 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-12-11 07:37
Yes, the PyLong_As... conversions are a mess.  In theory, they should all use __index__ if available, and ignore __int__.  In practice, it's a mess that's difficult to change without breaking things.  I think there are already some other issues open on this subject.
msg205896 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 08:59
PyLong_AsUnsignedLong() and PyLong_AsLong() can return different values for same object. Why __int__ and __index__ should be used for int subclasses at all? I'm not sure this is right solution.
msg205916 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-11 19:28
New changeset 618cca51a27e by Serhiy Storchaka in branch '3.3':
Issue #17576: Deprecation warning emitted now when __int__() or __index__()
http://hg.python.org/cpython/rev/618cca51a27e

New changeset 59fb79d0411e by Serhiy Storchaka in branch 'default':
Issue #17576: Deprecation warning emitted now when __int__() or __index__()
http://hg.python.org/cpython/rev/59fb79d0411e
msg205919 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 20:02
I first committed safe part of patch, which doesn't change behavior, only adds warnings and tests. Here is other part (very simple), which change behavior (in sum they are equal to issue17576_v3.patch with minor changes).

* PyNumber_Index() now calls __index__() for int subclasses.

>>> class A(int):
...     def __index__(self): return 1
... 
>>> import operator
>>> operator.index(A())

Returns 0 without patch and 1 with patch.

* PyNumber_Long() now calls __int__() on result of __trunc__() if it is int subclass instance.

>>> class A:
...     def __trunc__(self): return True
... 
>>> int(A())

Returns True without patch and 1 with patch.

* PyLong_As*() functions (but not all) call __int__() for int subclasses (I'm not sure this is right).

>>> class A(int):
...     def __int__(self): return 42
... 
>>> chr(A(43))

Returns '+' without patch and '*' with patch.
msg205921 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 20:20
Even with last patch int() can return non-int:

>>> class A(int):
...     def __int__(self): return True
...     def __repr__(self): return '<A>'
... 
>>> class B:
...     def __int__(self): return A()
... 
>>> class C:
...     def __trunc__(self): return A()
... 
>>> int(B())
<A>
>>> int(C())
True
msg206199 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-14 19:08
New changeset a3de2b3881c1 by Serhiy Storchaka in branch '3.3':
Issue #17576: Removed deprecation warnings added in changeset 618cca51a27e.
http://hg.python.org/cpython/rev/a3de2b3881c1
msg207293 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-04 17:29
I have doubts about this issue, so I have unassigned it from myself.
msg236493 - (view) Author: Manuel Jacob (mjacob) * Date: 2015-02-24 13:38
Maybe I'm missing something, but it seems to me that test_int_subclass_with_index() is testing for the exactly wrong behaviour.  Isn't the point of this issue that operator.index(a) should be equal to a.__index__()?  Why are the tests checking that they are different?
msg236502 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-02-24 15:05
The tests are checking that they are the same value (8) and the same type (int)?
msg236526 - (view) Author: Manuel Jacob (mjacob) * Date: 2015-02-24 17:44
The tests in the attached patches (for example issue17576_v3.patch) check that both are 8, but the tests which were actually committed are checking that "my_int.__index__() == 8" and "operator.index(my_int) == 7".
msg236528 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-02-24 17:59
Ah, it just checks current behavior. So we will know when this will be changed.
msg236880 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-02-28 12:00
OK, something appears to have gotten confused along the way here. Barry's original problem report was that operator.index() was returning a different answer than operator.__index__() for int subclasses. Absolutely nothing to do with the int builtin at all. While the fact int() *may* return int subclasses isn't especially good, it's also a longstanding behaviour.

The problem Barry reports, where a subclassing based proxy type isn't reverting to a normal integer when accessed via operator.index() despite defining __index__() to do exactly that should be possible to fix just by applying the stricter check specifically in PyNumber_Index.

Expanding the scope to cover __int__() and __trunc__() as well would be much, much hairier, as those are much older interfaces, and used in a wider variety of situations. We specifically invented __index__() to stay away from that mess while making it possible to explicitly indicate that a type supports a lossless conversion to int rather than a potentially lossy one.
History
Date User Action Args
2015-07-21 07:29:34ethan.furmansetnosy: - ethan.furman
2015-02-28 12:00:51ncoghlansetmessages: + msg236880
2015-02-24 17:59:40serhiy.storchakasetmessages: + msg236528
2015-02-24 17:44:37mjacobsetmessages: + msg236526
2015-02-24 15:05:19serhiy.storchakasetmessages: + msg236502
2015-02-24 13:38:33mjacobsetnosy: + mjacob
messages: + msg236493
2015-01-15 05:02:36ethan.furmansetassignee: ethan.furman
2014-01-04 17:29:34serhiy.storchakasetassignee: serhiy.storchaka -> (no value)
messages: + msg207293
2013-12-14 19:08:20python-devsetmessages: + msg206199
2013-12-11 20:20:57serhiy.storchakasetmessages: + msg205921
2013-12-11 20:02:12serhiy.storchakasetfiles: + issue17576_v4.patch

messages: + msg205919
2013-12-11 19:28:18python-devsetnosy: + python-dev
messages: + msg205916
2013-12-11 08:59:39serhiy.storchakasetmessages: + msg205896
2013-12-11 07:37:09mark.dickinsonsetmessages: + msg205890
2013-12-11 07:34:48serhiy.storchakasetmessages: + msg205889
2013-12-10 23:11:10ncoghlansetmessages: + msg205870
2013-12-10 21:53:32serhiy.storchakasetmessages: + msg205862
2013-12-10 20:38:04mark.dickinsonsetmessages: + msg205856
2013-12-10 12:09:03ncoghlansetassignee: mark.dickinson -> serhiy.storchaka
messages: + msg205802
2013-12-10 11:47:31serhiy.storchakasetfiles: + issue17576_v3.patch

messages: + msg205793
2013-12-09 19:42:24mark.dickinsonsetmessages: + msg205735
2013-12-09 19:30:09serhiy.storchakasetmessages: + msg205733
2013-10-18 20:38:38Arfreversetnosy: + Arfrever
2013-10-18 08:00:27mark.dickinsonsetmessages: + msg200229
2013-10-17 22:37:01ethan.furmansetmessages: + msg200171
2013-08-21 21:32:15ncoghlansetmessages: + msg195824
2013-08-21 11:32:36serhiy.storchakasetmessages: + msg195772
2013-08-21 10:22:18serhiy.storchakasetmessages: + msg195764
2013-08-21 09:33:16serhiy.storchakasetmessages: + msg195759
2013-08-21 09:03:12serhiy.storchakalinkissue18712 dependencies
2013-08-21 05:47:34mark.dickinsonsetmessages: + msg195747
2013-08-20 19:52:35serhiy.storchakasetmessages: + msg195713
2013-08-20 14:47:05serhiy.storchakasetstage: patch review
2013-08-04 11:29:25ncoghlansetmessages: + msg194347
2013-08-04 10:07:25mark.dickinsonsetfiles: + issue17576_v2.patch

messages: + msg194337
2013-08-04 08:56:17mark.dickinsonsetfiles: + issue17576.patch

assignee: docs@python -> mark.dickinson
components: + Interpreter Core, - Documentation

keywords: + patch
nosy: + ncoghlan
messages: + msg194335
2013-05-10 06:03:37serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg188810
2013-05-09 22:09:29hayposetnosy: + haypo
messages: + msg188791
2013-05-08 22:20:31terry.reedysetversions: - Python 3.1, Python 3.2
2013-04-05 16:18:09ethan.furmansetnosy: + ethan.furman
2013-03-30 08:25:44mark.dickinsonsetnosy: + mark.dickinson
2013-03-30 00:39:42barrysetmessages: + msg185532
2013-03-30 00:36:57alexsetnosy: + alex
messages: + msg185531
2013-03-30 00:29:36eric.snowsetnosy: + eric.snow
messages: + msg185530
2013-03-29 22:29:38barrysetmessages: + msg185523
2013-03-29 22:25:33barrysettitle: PyNumber_Index() is not int-subclass friendly -> PyNumber_Index() is not int-subclass friendly (or operator.index() docos lie)
2013-03-29 22:25:11barrycreate