Message 152125 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	dmalcolm
Recipients	Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, jcea, lemburg, loewis, mark.dickinson, neologix, pitrou, skorgu, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date	2012-01-27.21:42:36
SpamBayes Score	2.9833747e-11
Marked as misclassified	No
Message-id	<1327700522.2219.86.camel@surprise>
In-reply-to	<20120127220233.Horde.-kL5FML8999PIxDp7fqjHtA@webmail.df.eu>

Content
On Fri, 2012-01-27 at 21:02 +0000, Martin v. Löwis wrote: > Martin v. Löwis <martin@v.loewis.de> added the comment: > > > But then isn't it vulnerable to Frank's first attack as exposed in > > http://mail.python.org/pipermail/python-dev/2012-January/115726.html ? > > It would be, yes. That's sad. > > That could be fixed by indeed creating trees in all cases (i.e. moving > away from open addressing altogether). The memory consumption does not worry > me here; however, dictionary order will change in more cases. > > Compatibility could be restored by introducing a threshold for > tree creation: if insertion visits more than N slots, go back to the > original slot and put a tree there. I'd expect that N could be small, > e.g. N==4. Lookup would then have to consider all AVL trees along the > chain of visited slots, but ISTM it could also stop after visiting N > slots. Perhaps we could combine my attack-detection code from http://bugs.python.org/issue13703#msg151714 with Martin's AVL approach? Use the ma_smalltable to track stats, and when a dict detects that it's under attack, if all the keys are AVL-compatible, it could transition to full-AVL mode. [I believe that my patch successfully handles both of Frank's attacks, but I don't have the test data - I'd be very grateful to receive a copy (securely)]. [See hybrid-approach-dmalcolm-2012-01-25-002.patch for the latest version of attack-detection; I'm working on a rewrite in which I restrict it to working just on pure-str dicts. With that idea, when a dict detects that it's under attack, if all the keys satisfy this condition (new_hash(keyA) == new_hash(keyB)) iff (hash(keyA) == hash(keyB)) then all hash values get recalculated using new_hash (which is randomized), which should offer protection in many common attack scenarios, without the semantic change Alex and Antoine indicated]

On Fri, 2012-01-27 at 21:02 +0000, Martin v. Löwis wrote:
> Martin v. Löwis <martin@v.loewis.de> added the comment:
> 
> > But then isn't it vulnerable to Frank's first attack as exposed in
> > http://mail.python.org/pipermail/python-dev/2012-January/115726.html ?
> 
> It would be, yes. That's sad.
> 
> That could be fixed by indeed creating trees in all cases (i.e. moving
> away from open addressing altogether). The memory consumption does not worry
> me here; however, dictionary order will change in more cases.
> 
> Compatibility could be restored by introducing a threshold for
> tree creation: if insertion visits more than N slots, go back to the
> original slot and put a tree there. I'd expect that N could be small,
> e.g. N==4. Lookup would then have to consider all AVL trees along the
> chain of visited slots, but ISTM it could also stop after visiting N
> slots.

Perhaps we could combine my attack-detection code from 
  http://bugs.python.org/issue13703#msg151714
with Martin's AVL approach?  Use the ma_smalltable to track stats, and
when a dict detects that it's under attack,  *if* all the keys are
AVL-compatible, it could transition to full-AVL mode.  [I believe that
my patch successfully handles both of Frank's attacks, but I don't have
the test data - I'd be very grateful to receive a copy (securely)].

[See hybrid-approach-dmalcolm-2012-01-25-002.patch for the latest
version of attack-detection; I'm working on a rewrite in which I
restrict it to working just on pure-str dicts.  With that idea, when a
dict detects that it's under attack, *if* all the keys satisfy this
condition
  (new_hash(keyA) == new_hash(keyB)) iff (hash(keyA) == hash(keyB))
then all hash values get recalculated using new_hash (which is
randomized), which should offer protection in many common attack
scenarios, without the semantic change Alex and Antoine indicated]

History
Date	User	Action	Args
2012-01-27 21:42:37	dmalcolm	set	recipients: + dmalcolm, lemburg, gvanrossum, tim.peters, loewis, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5, skorgu
2012-01-27 21:42:37	dmalcolm	link	issue13703 messages
2012-01-27 21:42:36	dmalcolm	create