classification
Title: _md5 module crashes on large data
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: gregory.p.smith, jcea, pitrou, python-dev
Priority: low Keywords: patch

Created on 2012-05-23 13:20 by pitrou, last changed 2012-05-23 21:20 by pitrou. This issue is now closed.

Files
File name Uploaded Description Edit
md5_huge.patch pitrou, 2012-05-23 13:42
Messages (12)
msg161406 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 13:20
This appears to be 2.7-only:

$ ./python -m test.regrtest -M5G -v test_hashlib
== CPython 2.7.3+ (2.7:086afe7b61f5, May 23 2012, 15:15:34) [GCC 4.5.2]
==   Linux-2.6.38.8-desktop-10.mga-x86_64-with-mandrake-1-Official little-endian
==   /home/antoine/cpython/27/build/test_python_6042
Testing with flags: sys.flags(debug=0, py3k_warning=0, division_warning=0, division_new=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, tabcheck=0, verbose=0, unicode=0, bytes_warning=0, hash_randomization=0)
test_hashlib
test_algorithms_attribute (test.test_hashlib.HashLibTestCase) ... ok
test_case_md5_0 (test.test_hashlib.HashLibTestCase) ... ok
test_case_md5_1 (test.test_hashlib.HashLibTestCase) ... ok
test_case_md5_2 (test.test_hashlib.HashLibTestCase) ... ok
test_case_md5_huge (test.test_hashlib.HashLibTestCase) ... python: /home/antoine/cpython/27/Modules/md5module.c:276: MD5_new: Assertion `(Py_ssize_t)(unsigned int)(view.len) == (view.len)' failed.
Abandon
msg161407 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 13:42
Here is a patch.
msg161411 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-05-23 14:20
Does this affect other hash modules?. Why is this not affecting python 3?

Patch looks good.
msg161412 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 14:24
> Does this affect other hash modules?

I don't know, only md5 seems to have tests for large data.

> . Why is this not affecting python 3?

The _md5 module was apparently rewritten in Python 3.
msg161426 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-05-23 15:47
I can't reproduce this issue in my 64 bit machines, neither in Solaris neither in Ubuntu. I guess the assertion can be fooled by compiler optimizacions.

As a consequence, I can't check other hashes functions.

Antoine, can you reproduce it doing "md5.new("A"*(2**32+5)).hexdigest()"?. That is what test HUGE does.
msg161427 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 15:48
> I can't reproduce this issue in my 64 bit machines, neither in Solaris
> neither in Ubuntu. I guess the assertion can be fooled by compiler
> optimizacions.

You should compile in debug mode.
msg161438 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-05-23 17:03
sha1 fails the same way. Same error. Just clone the test to show it.

Please, correct sha1 too and add a test for it :).

sha224, sha256, sha384 and sha512 seems OK.
msg161439 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-05-23 17:08
sha224, sha256, sha384 and sha512 are not failing because they are missing the "Py_SAFE_DOWNCAST" safety net completely.

So I would tell that we have an issue here :).
msg161440 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-05-23 17:12
Same can be said about Python 3 hash modules: they are not using the sanity check, so they work.

Maybe the real question should be if the sanity check really makes sense at all. If not, remove everywhere (the calculated md5 with no checks looks correct, after all). If it makes sense, mark this as "python 3.2 and 3.3" and patch everywhere :).

Do you agree, Antoine?
msg161441 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 17:14
> sha1 fails the same way. Same error. Just clone the test to show it.
> 
> Please, correct sha1 too and add a test for it :).

Well, do you want to provide an updated patch?
msg161456 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-23 21:19
New changeset 290d970c011d by Antoine Pitrou in branch '2.7':
Issue #14888: Fix misbehaviour of the _md5 module when called on data larger than 2**32 bytes.
http://hg.python.org/cpython/rev/290d970c011d
msg161457 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 21:20
I've now pushed the fix. Jesus, if you want to propose a test and patch for the _sha1 issue, please open a separate issue. Thanks!
History
Date User Action Args
2012-05-23 21:20:05pitrousetstatus: open -> closed
resolution: fixed
messages: + msg161457

stage: patch review -> resolved
2012-05-23 21:19:20python-devsetnosy: + python-dev
messages: + msg161456
2012-05-23 17:14:04pitrousetmessages: + msg161441
2012-05-23 17:12:19jceasetmessages: + msg161440
2012-05-23 17:08:56jceasetmessages: + msg161439
2012-05-23 17:03:02jceasetmessages: + msg161438
2012-05-23 15:48:06pitrousetmessages: + msg161427
2012-05-23 15:47:10jceasetmessages: + msg161426
2012-05-23 14:24:05pitrousetmessages: + msg161412
2012-05-23 14:20:01jceasetnosy: + jcea
messages: + msg161411
2012-05-23 13:42:42pitrousetstage: needs patch -> patch review
2012-05-23 13:42:30pitrousetfiles: + md5_huge.patch
keywords: + patch
messages: + msg161407
2012-05-23 13:20:13pitroucreate