Author rhettinger
Recipients rhettinger, serhiy.storchaka, xiang.zhang
Date 2016-10-22.22:16:58
Message-id <1477174618.9.0.903755499844.issue28508@psf.upfronthosting.co.za>
Content
> Isn't this already implemented?

No.

    >>> class A:
    ...     pass

    >>> d = dict.fromkeys('abcdefghi')
    >>> a = A()
    >>> a.__dict__.update(d)
    >>> b = A()
    >>> b.__dict__.update(d)
    >>> import sys
    >>> [sys.getsizeof(m) for m in [d, vars(a), vars(b)]]
    [368, 648, 648]
    >>> c = A()
    >>> c.__dict__.update(d)
    >>> [sys.getsizeof(m) for m in [d, vars(a), vars(b), vars(c)]]
    [368, 648, 648, 648]

sys.getsizeof() reports no benefit from key-sharing. Even if you make a thousand of these instances, the size reported for each one is the same. Here is the relevant code:
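To check the "thousand instances" claim on your own build, here is a minimal sketch. The helper name `instance_dict_sizes` is made up for illustration, and the actual byte counts depend on the CPython version and platform, so none are shown here.

```python
import sys


class A:
    pass


def instance_dict_sizes(num_instances, keys="abcdefghi"):
    """Return sys.getsizeof() for each instance __dict__ (hypothetical helper)."""
    template = dict.fromkeys(keys)
    instances = []
    for _ in range(num_instances):
        obj = A()
        obj.__dict__.update(template)
        instances.append(obj)  # keep every instance alive while measuring
    return [sys.getsizeof(vars(obj)) for obj in instances]


sizes = instance_dict_sizes(1000)
# Compare the distinct per-instance figures with the naive sum over all of them.
print("distinct per-instance sizes:", sorted(set(sizes)))
print("naive total over all instances:", sum(sizes))
```

If the per-instance figure does not shrink as the instance count grows, summing getsizeof() over the instances does not reflect any saving from shared keys.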

    Py_ssize_t
    _PyDict_SizeOf(PyDictObject *mp)
    {
        Py_ssize_t size, usable, res;

        size = DK_SIZE(mp->ma_keys);
        usable = USABLE_FRACTION(size);

        res = _PyObject_SIZE(Py_TYPE(mp));
        if (mp->ma_values)
            res += usable * sizeof(PyObject*);
        /* If the dictionary is split, the keys portion is accounted-for
           in the type object. */
        if (mp->ma_keys->dk_refcnt == 1)
            res += (sizeof(PyDictKeysObject)
                    - Py_MEMBER_SIZE(PyDictKeysObject, dk_indices)
                    + DK_IXSIZE(mp->ma_keys) * size
                    + sizeof(PyDictKeyEntry) * usable);
        return res;
    }

It looks like the fixed overhead is included for every instance of a split dictionary. Instead, it might make sense to divide the fixed overhead by the number of instances sharing the keys, averaging it across them:

     res = _PyObject_SIZE(Py_TYPE(mp)) / num_instances;

Perhaps use ceiling division:

     res = (_PyObject_SIZE(Py_TYPE(mp)) + num_instances - 1) / num_instances;
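The averaging arithmetic can be sketched in Python. The function name `amortized_share` and the overhead value 48 are placeholders, not figures taken from CPython. Note that the negate-floor-negate idiom relies on Python's floor division; with C's truncating division, the add-then-divide form is the one that rounds up.

```python
def amortized_share(fixed_overhead, num_instances):
    """Ceiling of fixed_overhead / num_instances (both names are illustrative)."""
    if num_instances < 1:
        raise ValueError("num_instances must be at least 1")
    # Python's // floors, so negating twice rounds up instead of down.
    by_negation = -(-fixed_overhead // num_instances)
    # Equivalent form that also rounds up under truncating division, as in C.
    by_addition = (fixed_overhead + num_instances - 1) // num_instances
    assert by_negation == by_addition
    return by_negation


# 48 is an arbitrary stand-in for the fixed per-dict overhead in bytes.
for n in (1, 5, 1000):
    print(n, amortized_share(48, n))
```

Rounding up keeps each instance's share at one byte or more for any positive overhead, so the sum over all sharers never falls below the real overhead.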