classification
Title: Pure Python operator.index doesn't match the C version.
Type: behavior Stage: patch review
Components: Extension Modules Versions: Python 3.11
process
Status: open Resolution:
Dependencies: 17576 Superseder:
Assigned To: Nosy List: arigo, eric.snow, iritkatriel, mark.dickinson, serhiy.storchaka, zach.ware
Priority: normal Keywords: patch

Created on 2013-08-12 14:09 by mark.dickinson, last changed 2021-09-10 17:48 by iritkatriel.

Files
File name Uploaded Description Edit
operator_index.patch serhiy.storchaka, 2013-08-13 09:58 review
x.py arigo, 2013-08-17 07:55
operator_index_2.patch serhiy.storchaka, 2013-08-17 08:29 review
y.py arigo, 2013-08-17 10:33
operator_index_3.patch serhiy.storchaka, 2013-08-17 19:22 review
operator_index_4.patch serhiy.storchaka, 2013-08-18 07:31 review
Messages (21)
msg194966 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-12 14:09
Nitpick: the pure Python version of operator.index (new in Python 3.4, introduced in issue #16694) doesn't match the C version, in that it looks up __index__ on the object rather than the class.


iwasawa:cpython mdickinson$ ./python.exe
Python 3.4.0a1+ (default:9e61563edb67+, Aug 12 2013, 14:45:12) 
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import support
>>> py_operator = support.import_fresh_module('operator', blocked=['_operator'])
>>> c_operator = support.import_fresh_module('operator', fresh=['_operator'])
>>> class A(int): pass
... 
>>> a = A(42); a.__index__ = lambda: 1729
>>> 
>>> py_operator.index(a)
1729
>>> c_operator.index(a)
42
msg195000 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-12 19:43
We can use type(a).__index__(a). Should we also correct the documentation for operator.index() and operator.length_hint()?
msg195006 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-12 20:15
Yes, I think it would make sense to fix the docs as well, at least for Python 3.4.  Probably not worth it for the maintenance releases.
msg195064 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-13 09:58
Here is a patch. Now the code of operator.index() becomes even more complicated. Perhaps you want suggest other wording for documentation?

Some code in stdlib (_pyio.py, bz2.py, connection.py) uses a.__index__() instead of type(a).__index__(a) (with replacing AttributeError to TypeError). Is it worth to change?
msg195451 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-17 07:55
Just mentioning it here again, but "type(a).__index__(a)" is still not perfectly correct.  Attached is a case where it differs.

I think you get always the correct answer by evaluating "range(a).stop".  It's admittedly obscure...  For example:

    class A:
        def __index__(self):
            return -42**100

    a = A()
    print(range(a).stop)
msg195454 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-17 08:29
The difference doesn't look significant. In all cases the TypeError is raised.

But there was other differences between C and Python versions -- when __index__() returns non-integer. Updated patch fixes this.
msg195457 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-17 10:23
> Just mentioning it here again, but "type(a).__index__(a)" is still not perfectly correct.

Hmm.  "type(a).__dict__['__index__'](a)" ?  (With suitable error checks, as in Serhiy's patch.)
msg195458 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-17 10:33
> The difference doesn't look significant. In all cases the TypeError is raised.

Ok, so here is another case.  (I won't go to great lengths trying to convince you that there is a problem, because the discussion already occurred several times; google for example for "_PyType_Lookup pure Python")
msg195474 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-17 14:25
> Hmm.  "type(a).__dict__['__index__'](a)" ?

This variant fails on:

    class A(int):
        @staticmethod
        def __index__():
            return 42
msg195476 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2013-08-17 14:35
Serhiy:  Yep, or even on bool.  Thanks.

Armin: I don't think either of us thinks there isn't a problem here. :-)
The Google search you suggested didn't turn up a whole lot of useful information for me.  Was there a discussion of this on python-dev at some point?  (And if not, should there be?)

For this *particular* issue, it seems we can't exactly reproduce.  type(a).__index__(a) seems like the best practical approximation to the true behaviour.  I'm guessing that it's fairly rare to have useful definitions of __index__ on a metaclass.
msg195489 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-17 16:08
This may have been the most recent discussion of this idea (as far as I can tell):
http://mail.python.org/pipermail//python-ideas/2012-August/016036.html

Basically, it seems to be still unresolved in the trunk Python; sorry, I thought by now it would have been resolved e.g. by the addition of a method on types or a function in the operator module.  In the absence of either, you need either to simulate its behavior by doing this:

    for t in type(a).__mro__:
        if '__index__' in t.__dict__:
            return t.__dict__['__index__'](a)

Or you can piggyback on an unrelated call that simply causes the C-level PyNumber_Index() to be called:

    return range(a).stop
msg195490 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-17 16:12
Sorry, realized that my pure Python algorithm isn't equivalent to _PyType_Lookup() --- it fails the staticmethod example of Serhiy.  A closer one would be:

    for t in type(a).__mro__:
        if '__index__' in t.__dict__:
            return t.__dict__['__index__'].__get__(a)()

But it's still not a perfect match.  I think right now the only "perfect" answer is given by workarounds like "range(a).stop"...
msg195510 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-17 19:22
Here is updated patch which uses Armin's algorithm for lookup special methods and adds special case for int subclasses in index().

I have no idea how the documentation should look.
msg195513 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-08-17 19:49
Couldn't you make use of inspect.getattr_static()?

  getattr_static(obj.__class__, '__index__').__get__(obj)()

getattr_static() does some extra work to get do the right lookup.  I haven't verified that it matches _PyType_Lookup() exactly, but it should be pretty close at least.

Also, pickle (unfortunately) also does lookup on instances rather than classes for the special methods (issue #16251).
msg195545 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-18 06:49
For completeness, can you post one line saying why the much simpler solution "range(a).stop" is not accepted?
msg195546 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-18 07:31
Here is a variant with getattr_static().
msg195547 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-18 07:38
> For completeness, can you post one line saying why the much simpler solution "range(a).stop" is not accepted?

Because it is implementation detail. The range() function if it will be implemented in Python needs operator.index().

The main purpose of adding Python implementation of the operator module was to provide an implementation which can be used in alternative Python implementations (e.g. PyPy). The CPython itself doesn't use it. And now I doubt that such complicated implementation will be helpful.

Perhaps we need a builtin which exposes _PyType_Lookup() at Python level.
msg195548 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-18 07:48
Ah.  If that's the only reason, then that seems a bit like misguided effort...  For alternative implementations like PyPy and Jython, the "_operator" module is definitely one of the simplest ones to reimplement in RPython or Java.  Every function is straightforwardly translated to just one call to an internal function -- that we need to have already for the rest of the language.
msg195549 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2013-08-18 07:56
...but yes, it's very obvious that exposing _PyType_Lookup() to pure Python is the right thing to do here.  This is a central part of the way Python works internally, after all.

Moreover, sorry about my previous note: if we started today to write PyPy, then it would be enough to have the pure Python version of operator.index(), based on the newly exposed _PyType_Lookup().  With PyPy's JIT, there is no real performance loss.  Some of my confusion came from the fact that there *would* be serious performance loss if we had to work with the pure Python looping-over-__mro__-and-fishing-in-__dict__ version.
msg195665 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2013-08-19 20:43
A lot of this discussion has flown a rather unfortunate distance over my head, especially since I've barely had time to follow it.  But it looks to me like--given the number of other places that do the same thing as operator.index currently does--there needs to be a simple way to do the right thing somewhere accessible, which probably means a builtin.

On the other hand, it seems to me like 'a.__index__()' *should* be "the right thing" to do, but I get the feeling that making that so would be an astronomically huge change without much real benefit and lots of opportunities to break everything.  I suspect I'm also missing something fundamental in why it's not the right thing to do.
msg401603 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-09-10 17:48
Reproduced on 3.11.
History
Date User Action Args
2021-09-10 17:48:54iritkatrielsetnosy: + iritkatriel

messages: + msg401603
versions: + Python 3.11, - Python 3.4
2013-08-21 09:03:12serhiy.storchakasetdependencies: + PyNumber_Index() is not int-subclass friendly (or operator.index() docs lie)
2013-08-19 20:43:39zach.waresetmessages: + msg195665
2013-08-18 07:56:42arigosetmessages: + msg195549
2013-08-18 07:48:01arigosetmessages: + msg195548
2013-08-18 07:38:45serhiy.storchakasetmessages: + msg195547
2013-08-18 07:31:58serhiy.storchakasetfiles: + operator_index_4.patch
2013-08-18 07:31:25serhiy.storchakasetmessages: + msg195546
2013-08-18 06:49:07arigosetmessages: + msg195545
2013-08-17 19:49:09eric.snowsetnosy: + eric.snow
messages: + msg195513
2013-08-17 19:22:04serhiy.storchakasetfiles: + operator_index_3.patch

messages: + msg195510
2013-08-17 16:12:23arigosetmessages: + msg195490
2013-08-17 16:08:21arigosetmessages: + msg195489
2013-08-17 14:35:15mark.dickinsonsetmessages: + msg195476
2013-08-17 14:25:35serhiy.storchakasetmessages: + msg195474
2013-08-17 10:33:54arigosetfiles: + y.py

messages: + msg195458
2013-08-17 10:23:06mark.dickinsonsetmessages: + msg195457
2013-08-17 08:29:29serhiy.storchakasetfiles: + operator_index_2.patch

messages: + msg195454
2013-08-17 07:55:01arigosetfiles: + x.py
nosy: + arigo
messages: + msg195451

2013-08-13 09:58:05serhiy.storchakasetfiles: + operator_index.patch
keywords: + patch
messages: + msg195064

stage: patch review
2013-08-12 20:15:07mark.dickinsonsetmessages: + msg195006
2013-08-12 19:43:38serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg195000
2013-08-12 14:09:38mark.dickinsoncreate