Message270836
In dictobject.c, the function dictviews_and performs a very expensive set creation operation on the keys of the self dict object:
From Python 2.7:
...
static PyObject*
dictviews_and(PyObject* self, PyObject *other)
{
PyObject *result = PySet_New(self);
...
tmp = PyObject_CallMethod(result, "intersection_update", "(O)", other);
...
Similar logic can be found in the Python 3 code as well.
For large dicts (~10 000 000 keys), it takes a significant amount of time to copy the keys into a new set, and also wastes a lot of memory. This issue is amplified when the intersection operation is performed many times.
This implementation uses O(len(self)) time and space, while it is expected to use O(min(len(self), len(other)) time and space.
Solution 1: Manually copy the keys of the dict into a set, and perform all intersection operations on that set. However, still wastes lots of memory.
Solution 2 (Recommended): Port the intersection logic from the set class to the dict view class. |
|
Date |
User |
Action |
Args |
2016-07-19 17:51:47 | David Su2 | set | recipients:
+ David Su2 |
2016-07-19 17:51:47 | David Su2 | set | messageid: <1468950707.74.0.640405379684.issue27575@psf.upfronthosting.co.za> |
2016-07-19 17:51:47 | David Su2 | link | issue27575 messages |
2016-07-19 17:51:47 | David Su2 | create | |
|