Author v+python
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, georg.brandl, gvanrossum, gz, jcea, lemburg, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner
Date 2012-01-08.00:19:14
Message-id <1325981956.53.0.572573613251.issue13703@psf.upfronthosting.co.za>
In-reply-to
Content
Given Martin's comment (msg150832) I guess I should add my suggestion to this issue, at least for the record.

Rather than changing hash functions, randomization could be added to those dicts that are subject to attack because they store user-supplied key values. The list so far seems to be urllib.parse, cgi, and json. Some have claimed there are many more, but without enumerating them. These three are clearly related to the documented issue.

The technique would be to wrap dict and add a short random prefix to each key value, preventing the attacker from supplying keys that are known to collide. Even if he successfully stumbles on a set that does collide on one request, it is unlikely to collide on a subsequent request with a different prefix string.
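
As a rough illustration of the prefixing idea only (a minimal sketch, not the implementation proposed in this issue): the class name `PrefixedStore`, the helper `_wrap`, the restriction to string keys, and the choice of the `secrets` module with a 64-bit hex prefix are all assumptions made for the example.

```python
import secrets


class PrefixedStore:
    """Hypothetical store that saves string keys under a random per-instance prefix."""

    def __init__(self):
        # Assumption: 64 random bits rendered as 16 hex characters.
        # The proposal only asks for "a short random prefix".
        self._prefix = secrets.token_hex(8)
        self._data = {}

    def _wrap(self, key):
        # The underlying dict hashes prefix + key, never the raw supplied key.
        return self._prefix + key

    def __setitem__(self, key, value):
        self._data[self._wrap(key)] = value

    def __getitem__(self, key):
        return self._data[self._wrap(key)]

    def __len__(self):
        return len(self._data)


a = PrefixedStore()
b = PrefixedStore()
a["user"] = 1
b["user"] = 2

# The same logical key is stored under two different strings.
stored_a = a._wrap("user")
stored_b = b._wrap("user")
```

The premise of the suggestion is that a set of raw keys crafted to collide is unlikely to keep colliding once each store puts its own unpredictable prefix in front of them.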

The technique is fully backward compatible with all applications except those that contain potential vulnerabilities as described by the researchers. The technique adds no startup or runtime overhead to any application that doesn't contain the potential vulnerabilities. Due to the per-request randomization, the complexity of creating a sequence of sets of keys that may collide is enormous: each such set of keys would have to arrive on exactly the request for which the attacker predicted the prefix, or the collisions would not occur. This might be possible on a lightly loaded system, but is less likely on a heavily loaded system, which is the more interesting target.
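
To make the per-request part concrete, here is a small sketch under stated assumptions: `parse_request` and `lookup` are hypothetical names, the prefix again comes from `secrets.token_hex`, and only `urllib.parse.parse_qsl` is a real library call. Each call stands in for one request and draws a fresh prefix.

```python
import secrets
from urllib.parse import parse_qsl


def parse_request(query):
    """Hypothetical per-request parser: returns the prefix used and a lookup function."""
    # A new prefix for every call, so a prefix learned or guessed for one
    # request does not apply to the next one.
    prefix = secrets.token_hex(8)

    # Keys are stored with the prefix attached; repeated field names
    # simply overwrite earlier ones in this sketch.
    table = {prefix + name: value for name, value in parse_qsl(query)}

    def lookup(name, default=None):
        # Callers keep using the plain field name.
        return table.get(prefix + name, default)

    return prefix, lookup


# Two "requests" carrying the same query string.
prefix1, get1 = parse_request("a=1&b=2")
prefix2, get2 = parse_request("a=1&b=2")
```

Application code never sees the prefix; it only calls `lookup` with the original field name, which is what keeps the approach backward compatible.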

Serhiy Storchaka provided a sample implementation on python-dev, copied below and attached as a file (it is not a patch).

# -*- coding: utf-8 -*-
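# MutableMapping lives in collections.abc; the old alias in collections
# was removed in Python 3.10.
from collections.abc import MutableMapping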
import random


class SafeDict(dict, MutableMapping):

    def __init__(self, *args, **kwds):
        dict.__init__(self)
        self._prefix = str(random.getrandbits(64))
        self.update(*args, **kwds)

    def clear(self):
        dict.clear(self)
        self._prefix = str(random.getrandbits(64))

    def _safe_key(self, key):
        return self._prefix + repr(key), key

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, self._safe_key(key))
        except KeyError as e:
            e.args = (key,)
            raise e

    def __setitem__(self, key, value):
        dict.__setitem__(self, self._safe_key(key), value)

    def __delitem__(self, key):
        try:
            dict.__delitem__(self, self._safe_key(key))
        except KeyError as e:
            e.args = (key,)
            raise e

    def __iter__(self):
        for skey, key in dict.__iter__(self):
            yield key

    def __contains__(self, key):
        return dict.__contains__(self, self._safe_key(key))

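    # dict's own versions of these methods bypass _safe_key, so use the
    # MutableMapping mixins, which go through __getitem__/__setitem__.
    # Without the get alias, dict.get would look up the raw key and miss.
    get = MutableMapping.get
    setdefault = MutableMapping.setdefault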
    update = MutableMapping.update
    pop = MutableMapping.pop
    popitem = MutableMapping.popitem
    keys = MutableMapping.keys
    values = MutableMapping.values
    items = MutableMapping.items

    def __repr__(self):
        return '{%s}' % ', '.join('%s: %s' % (repr(k), repr(v))
            for k, v in self.items())

    def copy(self):
        return self.__class__(self)

    @classmethod
    def fromkeys(cls, iterable, value=None):
        d = cls()
        for key in iterable:
            d[key] = value
        return d

    def __eq__(self, other):
        return all(k in other and other[k] == v for k, v in self.items()) and \
            all(k in self and self[k] == v for k, v in other.items())

    def __ne__(self, other):
        return not self == other
History
Date  User  Action  Args
2012-01-08 00:19:17  v+python  set  recipients: + v+python, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, Arfrever, alex, skrah, dmalcolm, gz, Arach, Mark.Shannon, Zhiping.Deng, Huzaifa.Sidhpurwala, PaulMcMillan
2012-01-08 00:19:16  v+python  set  messageid: <1325981956.53.0.572573613251.issue13703@psf.upfronthosting.co.za>
2012-01-08 00:19:15  v+python  link  issue13703 messages
2012-01-08 00:19:15  v+python  create