Message 339462 - Python tracker

Author	josh.r
Recipients	christian.heimes, jdemeyer, josh.r
Date	2019-04-04.20:32:38
Message-id	<1554409958.7.0.958073645566.issue36525@roundup.psfhosted.org>
In-reply-to

Content

Actually, there is a use case, and it's one I've been intermittently trying to replicate ever since the functionality vanished in the Python 2 to 3 transition (or so I thought).

The use case is taking an existing built-in function (as in a CPython function defined in C, and therefore not naturally obeying the descriptor protocol) and converting it to the equivalent of an unbound method for assignment to a class.
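To illustrate the point about the descriptor protocol, here is a minimal sketch, assuming CPython on Python 3. The attribute names as_hex_builtin and as_hex_python and the helper py_hex are made up for illustration. A C-implemented built-in assigned to a class is not bound on attribute access, while a Python-level function is.

```python
class MyInt:
    def __index__(self):
        return 42


def py_hex(obj):
    # Python-level wrapper: plain functions define __get__, so they bind as methods
    return hex(obj)


MyInt.as_hex_builtin = hex    # built-in function: no __get__, so never bound
MyInt.as_hex_python = py_hex  # plain function: bound on instance access

obj = MyInt()
bound_result = obj.as_hex_python()  # obj is passed as the first argument

try:
    obj.as_hex_builtin()            # same as calling hex() with no arguments
    builtin_bound = True
except TypeError:
    builtin_bound = False
```

The Python-level wrapper works, but every call goes through an extra bytecode frame, which is the overhead the rest of this message is trying to avoid.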

In Python 2, a trick I'd use to make int-like classes work identically on Python 2 and Python 3 was to define it like so:

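import types
import sys

class MyInt:
    def __index__(self):
        return SOMEINTEGER  # placeholder for the wrapped integer value

if sys.version_info[0] == 2:
    # Python 2 only: the future_builtins versions of hex/oct delegate to __index__
    from future_builtins import hex, oct
    # The three-argument form creates an unbound method of MyInt (Python 2 API)
    MyInt.__hex__ = types.MethodType(hex, None, MyInt)
    MyInt.__oct__ = types.MethodType(oct, None, MyInt)
    del hex, oct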

Essentially it took the built-in functions from future_builtins that used __index__ and converted them to unbound methods of MyInt, so even people using my code with the non-future hex/oct would get the correct behavior (the old hex would call __hex__, which was actually the future hex, and would therefore delegate to __index__). And it did it at C speed; no additional bytecode was executed whether you used the class with old or future hex. That specific use case doesn't actually apply to hex/oct anymore (conveniently, it was only needed on the version that supported it), but it's a general-purpose tool.

In Python 3, types.MethodType lost the final argument (because now Python level code only had functions and bound methods, no in-between concept of unbound methods). But this removed some functionality I liked; in particular I'd liked the idea of using operator.attrgetter to define rich comparisons and hashing in terms of a key method (__key__ has been proposed before for this purpose IIRC) implemented in C (to reduce overhead from delegating all of those methods to yet another method).
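As a small sketch of what remains in Python 3 (the MyInt class here is a stand-in for illustration): the two-argument form of types.MethodType still binds a built-in to one specific instance, but the three-argument form that attached it to the class is rejected.

```python
import types


class MyInt:
    def __index__(self):
        return 42


obj = MyInt()

# Two-argument form: binds the callable to one specific instance.
bound_hex = types.MethodType(hex, obj)
per_instance = bound_hex()  # equivalent to hex(obj)

# The Python 2 three-argument (unbound) form is no longer accepted.
try:
    types.MethodType(hex, None, MyInt)
    three_arg_ok = True
except TypeError:
    three_arg_ok = False
```

Binding per instance is not a substitute for the old behavior, since it has to be repeated for every object rather than done once on the class.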

Now that you've pointed out this API to me, I could actually do that. Not prettily (since it involves using ctypes to gain access), but it's possible. If exposed at the Python layer (or in the case of my testing, manually loaded using ctypes), I can implement __key__ on a class with:

import operator

# PyInstanceMethod_New is assumed to be available here already
# (exposed at the Python layer, or loaded manually via ctypes)
class Foo:
    # Compare based on attributes x, y and z
    __key__ = PyInstanceMethod_New(operator.attrgetter('x', 'y', 'z'))

    def __eq__(self, other):
        if not isinstance(other, Foo):
            return NotImplemented
        return self.__key__() == other.__key__()

    def __hash__(self):
        return hash(self.__key__())

and if I have a Foo instance, instance.__key__() just works (and slightly faster than def __key__(self): return self.x, self.y, self.z would), and all the comparison and hashing machinery can be implemented in terms of it. I'd likely write a C-accelerated decorator, for use in contexts like functools.total_ordering, that depends solely on the existence of a __key__. Comparisons, hashing and the like could then all be implemented without ending up back in bytecode, which has some performance advantages for large sorts and dedupes, and lets otherwise Python-level classes benefit from GIL thread safety.
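The ctypes-based loading mentioned above might look roughly like the following. This is a sketch that assumes CPython, where ctypes.pythonapi exposes the C API symbol PyInstanceMethod_New. The __init__ method is added only so the example can be run.

```python
import ctypes
import operator

# Assumption: CPython only. ctypes.pythonapi gives access to the C API,
# and py_object tells ctypes to pass/return real Python object references.
PyInstanceMethod_New = ctypes.pythonapi.PyInstanceMethod_New
PyInstanceMethod_New.argtypes = (ctypes.py_object,)
PyInstanceMethod_New.restype = ctypes.py_object


class Foo:
    # The wrapper binds like a method, so attrgetter receives the instance
    __key__ = PyInstanceMethod_New(operator.attrgetter('x', 'y', 'z'))

    def __init__(self, x, y, z):  # illustrative constructor, not in the original
        self.x, self.y, self.z = x, y, z

    def __eq__(self, other):
        if not isinstance(other, Foo):
            return NotImplemented
        return self.__key__() == other.__key__()

    def __hash__(self):
        return hash(self.__key__())


a = Foo(1, 2, 3)
b = Foo(1, 2, 3)
key = a.__key__()  # attrgetter is called with a as its only argument
```

Because equality and hashing both go through __key__, equal instances such as a and b collapse to a single entry in a set or dict.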

The only way I've been able to mimic this before was wrapping in functools.partialmethod, but that's implemented in Python (so no CPython thread safety or reduced bytecode advantages) and supports a ton of other features that make it *much* slower for this use case. In local microbenchmarks, Foo using PyInstanceMethod_New took 227 ns to call foo.__key__(), using the "obvious" approach took 238 ns, and using partialmethod took 3.1 us, over 10x longer. An in-between approach of:

from operator import attrgetter

class Foo:
    _getkey = attrgetter('x', 'y', 'z')
    def __key__(self): return self._getkey(self)

is okay, taking 317 ns, but it adds additional layering and requires separately caching the attrgetter, etc.
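A comparison along these lines could be set up with timeit, as in the rough sketch below. It assumes CPython (for the ctypes access), and the class names, the Base helper and the iteration counts are all illustrative choices. Absolute numbers will vary by machine and Python version, so none are shown.

```python
import ctypes
import timeit
from functools import partialmethod
from operator import attrgetter

# Assumption: CPython only; loads the C API function through ctypes.
PyInstanceMethod_New = ctypes.pythonapi.PyInstanceMethod_New
PyInstanceMethod_New.argtypes = (ctypes.py_object,)
PyInstanceMethod_New.restype = ctypes.py_object


class Base:
    def __init__(self):
        self.x, self.y, self.z = 1, 2, 3


class ViaInstanceMethod(Base):
    __key__ = PyInstanceMethod_New(attrgetter('x', 'y', 'z'))


class ViaDef(Base):  # the "obvious" approach
    def __key__(self):
        return self.x, self.y, self.z


class ViaPartialMethod(Base):
    __key__ = partialmethod(attrgetter('x', 'y', 'z'))


class ViaWrapper(Base):  # the in-between approach
    _getkey = attrgetter('x', 'y', 'z')

    def __key__(self):
        return self._getkey(self)


CLASSES = (ViaInstanceMethod, ViaDef, ViaPartialMethod, ViaWrapper)
NUMBER = 20000  # calls per timing run (arbitrary)

timings = {}
for cls in CLASSES:
    best = min(timeit.repeat('obj.__key__()', globals={'obj': cls()},
                             number=NUMBER, repeat=3))
    timings[cls.__name__] = best / NUMBER * 1e9  # nanoseconds per call

for name, ns in timings.items():
    print('%-18s %8.1f ns per call' % (name, ns))
```

Taking the minimum over several repeats reduces noise from other activity on the machine.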

Yes, it's kind of a fiddly thing; all it really does is make it possible to reuse existing built-ins taking a single argument as methods on a class. But I'd like to have that option; we've got a ton of built-ins that do fiddly things, and it would be nice to reuse them easily.

Point is, rather than deprecating it, I'd kind of like it if we could make it accessible via the types module for this sort of use case (without requiring ctypes hackery), as types.InstanceMethodType or the like.
