
Author rhettinger
Recipients dschult, loewis, rhettinger
Date 2009-04-12.00:58:52
SpamBayes Score 2.8232976e-06
Marked as misclassified No
Message-id <1239497934.31.0.271829015803.issue5730@psf.upfronthosting.co.za>
In-reply-to
Content
I would support fixing the double call to PyObject_Hash().  For user
defined classes with their own __hash__ methods, this is by far the
slowest part of the operation.
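As a sketch, the duplicate hashing is easy to observe with a counting __hash__ (the CountingHash class below is mine, purely for illustration; an affected interpreter reports 2 hash calls for one setdefault, a fixed one reports 1):

```python
class CountingHash:
    """Toy key that records how many times __hash__ runs."""
    calls = 0

    def __hash__(self):
        CountingHash.calls += 1
        return 10

    def __eq__(self, other):
        return self is other

k = CountingHash()
{}.setdefault(k, 0)
print(CountingHash.calls)  # 2 with the double call, 1 once it is eliminated
```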

> from my perspective creating an internal SetItem adds another
> function handling the data structure just as setdefault would

Incorrect comparison.  Your in-lining manipulated the ep structure
directly (not a good thing).  In contrast, adding an alternative
_PyDict_SetItemWithHash uses the insertdict() function, fully isolating
itself from the table implementation.  The whole purpose of having
insertdict() and lookdict() is to isolate the data structure internals
from all externally visible functions.
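A pure-Python analogy of that layering (a toy open-addressing table, not CPython's C code; names like set_item_with_hash are illustrative): only the probe/insert helpers touch the slot table, and the with-hash entry point lets a caller such as setdefault compute the hash exactly once.

```python
class ToyDict:
    """Toy open-addressing table; no resizing, illustration only."""

    def __init__(self, size=8):
        self._slots = [None] * size  # each slot: (hash, key, value) or None

    def _probe(self, h, key):
        """lookdict() analogue: find the slot index for (h, key)."""
        n = len(self._slots)
        i = h % n
        while True:
            slot = self._slots[i]
            if slot is None or (slot[0] == h and slot[1] == key):
                return i
            i = (i + 1) % n  # linear probing, for simplicity

    def _insert(self, h, key, value):
        """insertdict() analogue: the only writer of the slot table."""
        self._slots[self._probe(h, key)] = (h, key, value)

    def set_item_with_hash(self, key, value, h):
        # _PyDict_SetItemWithHash analogue: caller supplies a precomputed
        # hash, so no second __hash__ call is needed here.
        self._insert(h, key, value)

    def setdefault(self, key, default=None):
        h = hash(key)  # __hash__ runs exactly once
        i = self._probe(h, key)
        slot = self._slots[i]
        if slot is not None:
            return slot[2]
        self._insert(h, key, default)
        return default
```

The point mirrors the C argument: setdefault never pokes at the slots itself; it goes through the same probe/insert helpers as everything else, so the internals stay swappable.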

>>> from timeit import Timer
>>> setup = '''
class A(object):
    def __hash__(self):
        return 10
class B(object):
    pass
a = A()
b = B()
'''
>>> min(Timer('{}.setdefault(a, 0)', setup).repeat(7, 100000))
0.12840011789208106
>>> min(Timer('{}.setdefault(b, 0)', setup).repeat(7, 100000))
0.053155359130840907

The double call to a very simple user-defined __hash__ adds about .07
to a call that takes about .05 with the double call to the builtin
object hash.  So we could save half of that .07 just by eliminating the
double call to __hash__.  With more complex hash functions the overall
speedup is huge (essentially cutting the total work almost in half).
History
Date                 User        Action          Args
2009-04-12 00:58:54  rhettinger  set recipients  + rhettinger, loewis, dschult
2009-04-12 00:58:54  rhettinger  set messageid   <1239497934.31.0.271829015803.issue5730@psf.upfronthosting.co.za>
2009-04-12 00:58:53  rhettinger  link            issue5730 messages
2009-04-12 00:58:52  rhettinger  create