Author haypo
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, georg.brandl, gvanrossum, gz, haypo, jcea, lemburg, merwok, pitrou, skrah, terry.reedy, tim.peters, v+python, zbysz
Date 2012-01-11.09:56:10
SpamBayes Score 7.43849e-15
Marked as misclassified No
Message-id <CAMpsgwZraWfhT_bXn2F-BmYUWkPKf5Gg1evRrijffn2xmTXjzg@mail.gmail.com>
In-reply-to <4F0D5638.7090903@egenix.com>
Content
>  * it is exceedingly complex

Which part exactly? For hash(str), it just add two extra XOR.

>  * the method would need to be implemented for all hashable Python types

It was already discussed, and it was said that only hash(str) need to
be modified.

>  * it causes startup time to increase (you need urandom data for
>   every single hashable Python data type)

My patch reads 8 or 16 bytes from /dev/urandom which doesn't block. Do
you have a benchmark showing a difference?

I didn't try my patch on Windows yet.

>  * it causes run-time to increase due to changes in the hash
>   algorithm (more operations in the tight loop)

I posted a micro-benchmark on hash(str) on python-dev: the overhead is
nul. Did you have numbers showing that the overhead is not nul?

>  * causes different processes in a multi-process setup to use different
>   hashes for the same object

Correct. If you need to get the same hash, you can disable the
randomized hash (PYTHONHASHSEED=0) or use a fixed seed (e.g.
PYTHONHASHSEED=42).

>  * doesn't appear to work well in embedded interpreters that
>   regularly restarted interpreters (AFAIK, some objects persist across
>   restarts and those will have wrong hash values in the newly started
>   instances)

test_capi runs _testembed which restarts a embedded interpreters 3
times, and the test pass (with my patch version 5). Can you write a
script showing the problem if there is a real problem?

In an older version of my patch, the hash secret was recreated at each
initiliazation. I changed my patch to only generate the secret once.

> The most important issue, though, is that it doesn't really
> protect Python against the attack - it only makes it less
> likely that an adversary will find the init vector (or a way
> around having to find it via crypt analysis).

I agree that the patch is not perfect. As written in the patch, it
just makes the attack more complex. I consider that it is enough.

Perl has a simpler protection than the one proposed in my patch. Is
Perl vulnerable to the hash collision vulnerability?
History
Date User Action Args
2012-01-11 09:56:12hayposetrecipients: + haypo, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, pitrou, christian.heimes, benjamin.peterson, merwok, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, Arach, Mark.Shannon, Zhiping.Deng, Huzaifa.Sidhpurwala, PaulMcMillan
2012-01-11 09:56:11haypolinkissue13703 messages
2012-01-11 09:56:10haypocreate