Message151574
> Don't you think that the number of corrections you have to apply in order
> to get the tests working again shows how much impact such a change would
> have in real-world applications ?
Let see the diffstat:
Doc/using/cmdline.rst | 7
Include/pythonrun.h | 2
Include/unicodeobject.h | 6
Lib/json/__init__.py | 4
Lib/os.py | 17 -
Lib/packaging/create.py | 7
Lib/packaging/tests/test_create.py | 18 -
Lib/test/mapping_tests.py | 2
Lib/test/regrtest.py | 5
Lib/test/test_builtin.py | 1
Lib/test/test_dis.py | 36 ++-
Lib/test/test_gdb.py | 11 -
Lib/test/test_inspect.py | 1
Lib/test/test_os.py | 35 ++-
Lib/test/test_set.py | 25 ++
Lib/test/test_unicode.py | 39 ++++
Lib/test/test_urllib.py | 16 -
Lib/test/test_urlparse.py | 6
Lib/tkinter/test/test_ttk/test_functions.py | 2
Makefile.pre.in | 1
Modules/posixmodule.c | 126 ++-----------
Objects/unicodeobject.c | 20 +-
PCbuild/pythoncore.vcproj | 4
Python/pythonrun.c | 3
Python/random.c | 268 ++++++++++++++++++++++++++++
25 files changed, 488 insertions(+), 174 deletions(-)
Except Lib/packaging/create.py, all other changes are related to the
introduction of the randomized hash function, or fix tests... Even
Lib/packaging/create.py change is related to fixing tests. The test
can be changed differently, but I like the idea of having always the
same output in packaging (e.g. it is more readable for the user if
files are sorted).
I expected to have to do something on multiprocessing, but nope, it
doesn't care of the hash value.
So I expect something similar in applications: no change in the
applications, but a lot of hacks/tricks in tests.
> Perhaps we should start to think about a compromise: make both the
> collision counting and the hash seeding optional and let the user
> decide which option is best.
I don't think that we need two fixes for a single vulnerability (in
the same Python version), one is enough. If we decide to count
collisions, the randomized hash idea can be simply dropped. But we may
use a different fix for Python 3.3 and for stable versions (e.g. count
collisions for stable versions and use randomized hash for 3.3).
> BTW: The patch still includes the unnecessary _Py_unicode_hash_secret.suffix
> which needlessly complicates the code and doesn't any additional
> protection against hash value collisions
How does it complicate the code? It adds an extra XOR to hash(str) and
4 or 8 bytes in memory, that's all. It is more difficult to compute
the secret from hash(str) output if there is a prefix *and* a suffix.
If there is only a prefix, knowning a single hash(str) value is just
enough to retrieve directly the secret.
.
> I don't think it affects more than 0.01% of applications/users :)
It would help to try a patched Python on a real world application like
Django to realize how much code is broken (or not) by a randomized
hash function. |
|
Date |
User |
Action |
Args |
2012-01-18 22:52:46 | vstinner | set | recipients:
+ vstinner, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5 |
2012-01-18 22:52:46 | vstinner | link | issue13703 messages |
2012-01-18 22:52:45 | vstinner | create | |
|