Issue4858
Created on 2009-01-06 20:06 by ebfe, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (12) | |||
---|---|---|---|
msg79281 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009-01-06 20:06 | |
MD5 is one of the most popular cryptographic hash functions around, mainly for its good performance and availability throughout applications and libraries. The MD5 algorithm is currently implemented in Python as part of the hashlib module and (in more general terms) as part of SSL in the ssl module. However, concerns about the security of MD5 have risen during the last few years. In 2007 a practical attack to create collisions in the compression function was released, and on 12/31/2008 US-CERT issued a note warning about the general insecurity of MD5 (http://www.kb.cert.org/vuls/id/836068).

I propose and strongly suggest that we start deprecating direct support for MD5 during this year and completely remove support for it afterwards.

* MD5 is a cryptographic hash function; its reason for being is security. With current hardware and attack vectors it is a matter of hours to create collisions and fool MD5 hashes. The reason for being has come to an end.
* Python runs an uncountable number of exposed user interfaces on the web. Usually the programmers rely on the security of the backing libraries. Python can't provide this with MD5.
* The functionality of MD5 can easily be replaced by other hashes that are supported by Python (e.g. SHA1). They supply comparable performance but are not binary-compatible (yay).
* Programmers use MD5 in Python without needing its cryptographic attributes (e.g. creating unique indexes). Keeping MD5 for this use, however, devalues the overall security of Python for the good of a few.

I'd like to start a discussion about this. Please keep in mind that, although MD5 is currently still very popular and Python's support for it is justified by demand, its existence will come to an end soon. We should act now and give people time to update their implementations.

In a rough cut:

- Patch hashlib to throw a DeprecationWarning, starting during the first half of 2009.
- Update the documentation to advise against MD5 for security reasons.
- Remove MD5 from Python in 2010.
- Stay in accordance with PEP 4.

Goodbye MD5 and thanks for all the fish. |
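A minimal sketch of what the first step of the rough cut could look like. It assumes a hypothetical wrapper named `deprecated_md5`; hashlib itself does not emit such a warning, and the message text is invented for illustration.

```python
import hashlib
import warnings


def deprecated_md5(data=b""):
    """Hypothetical wrapper: emit a DeprecationWarning, then delegate to hashlib.md5."""
    warnings.warn(
        "MD5 is not collision resistant; consider a SHA-2 hash for security use",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this wrapper
    )
    return hashlib.md5(data)


# DeprecationWarning is often filtered out by default, so record it explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    digest = deprecated_md5(b"hello").hexdigest()

print(digest)
print(caught[0].category.__name__)
```

The digest is unchanged by the wrapper; existing callers would only see the extra warning.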
|||
msg79282 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2009-01-06 20:17 | |
On 2009-01-06 21:06, Lukas Lueg wrote:
> MD5 is one of the most popular cryptographic hash-functions around,
> mainly for its good performance and availability throughout
> applications and libraries. The MD5 algorithm is currently implemented
> in python as part of the hashlib-module and (in more general terms) as
> part of SSL in the ssl-module. However, concerns about the security of
> MD5 have risen during the last few years. In 2007 a practical attack to
> create collisions in the compression-function has been released and on
> 12/31/2008 US-CERT issued a note to warn about the general insecurity of
> MD5 (http://www.kb.cert.org/vuls/id/836068).
>
> I propose and strongly suggest to start deprecate direct support for MD5
> during this year and completely remove support for it afterwards.

A strong -1 on that idea.

MD5 is in widespread use as a hash function. It can no longer be considered a cryptographic hash function, but it still serves its purpose well as a fast, easy-to-use, general-purpose hash function.

Removing it from Python would cripple Python for no apparent reason. |
|||
msg79283 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2009-01-06 20:20 | |
Because MD5 is used widely, Python needs to support it, if only to be able to verify MD5 signatures when offered. |
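As an illustration of the interoperability point, here is a small sketch of checking data against an MD5 checksum published by someone else. The helper names and the sample payload are assumptions made for this example; only `hashlib.md5` and its `update`/`hexdigest` methods come from the standard library.

```python
import hashlib
import io


def md5_hexdigest(stream, chunk_size=65536):
    """Return the MD5 hex digest of a binary stream, reading it in chunks."""
    h = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches_published_md5(stream, published_hex):
    """Compare the computed digest with a published one, ignoring case and whitespace."""
    return md5_hexdigest(stream) == published_hex.strip().lower()


payload = b"example download contents"        # stands in for a downloaded file
published = hashlib.md5(payload).hexdigest()  # stands in for a published checksum

print(matches_published_md5(io.BytesIO(payload), published))         # intact copy
print(matches_published_md5(io.BytesIO(payload + b"!"), published))  # altered copy
```

A check like this detects accidental corruption; it says nothing about whether the file was produced by a trusted party.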
|||
msg79285 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2009-01-06 20:33 | |
The hashlib docs already mention the problems with md5 et al. via a bright red: "Warning Some algorithms have known hash collision weaknesses, see the FAQ at the end."

Thanks for closing this. Not gonna happen. |
|||
msg79291 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009-01-06 21:42 | |
As I already said to Raymond: at least we should update the documentation. The "FAQ" currently linked is from 2005.

The CERT advisory uses clear and simple language: "In 2008, researchers demonstrated the practical vulnerability [...] We are currently unaware of a practical solution to this problem. *Do not use the MD5 algorithm*." |
|||
msg79293 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2009-01-06 21:59 | |
On 2009-01-06 22:42, Lukas Lueg wrote:
> As I already said to Raymond: At least we should update the
> documentation. The "FAQ" currently linked is from 2005.
>
> The CERT-Advisory provides a clean and simple language: "In 2008,
> researchers demonstrated the practical vulnerability [...] We are
> currently unaware of a practical solution to this problem. *Do not use
> the MD5 algorithm*."

That's a correct statement for cryptographic work based on MD5. However, it's not true with respect to using MD5 as a fast general-purpose hash algorithm in non-crypto applications, so I think the warning on http://docs.python.org/library/hashlib.html is sufficient.

Note that the various SHA implementations are also starting to get some heat lately, so it's only a question of time until these get excluded from the set of cryptographic hash functions:

http://en.wikipedia.org/wiki/SHA1
http://en.wikipedia.org/wiki/Cryptographic_hash_function

Also see http://en.wikipedia.org/wiki/Hash_function:

"""
Hash functions are related to (and often confused with) checksums, check digits, fingerprints, randomizing functions, error correcting codes, and cryptographic hash functions. Although these concepts overlap to some extent, each has its own uses and requirements.
"""

It might be a good idea to remove the word "secure" from the hashlib documentation, since security of these algorithms is always limited to a certain period of time. |
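A short sketch of the non-crypto use described here, under stated assumptions: `index_key` is a hypothetical helper, and the `usedforsecurity=False` argument was added to hashlib in Python 3.9, well after this thread. The size table also shows why swapping algorithms changes stored digests.

```python
import hashlib


def index_key(data: bytes) -> str:
    """Hypothetical helper: derive a lookup key from content, not a security token."""
    # usedforsecurity=False (Python 3.9+) declares that this is a non-security use.
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


# Digest sizes in bytes differ per algorithm, so stored values are not interchangeable.
sizes = {name: hashlib.new(name).digest_size for name in ("md5", "sha1", "sha256")}

print(sizes)
print(index_key(b"same bytes") == index_key(b"same bytes"))
```

Identical input always yields the same key, which is all a general-purpose index needs; collision resistance against an attacker is a separate requirement.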
|||
msg79295 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009-01-06 22:10 | |
> It might be a good idea to remove the word "secure" from the
> hashlib documentation, since security of these algorithms is
> always limited to a certain period of time.

I'm sorry, was that a boy attempted humor? [Misuse quote from DH3: Check]

Anyway, in fact that might be a good idea: Reflect that the hashlib module includes hash functions for the sake of compatibility and interoperability and not everlasting security. |
|||
msg79296 - (view) | Author: Raymond Hettinger (rhettinger) * | Date: 2009-01-06 22:13 | |
Secure hash or cryptographic hash is the correct term and I think we should leave that in, if only to make the original intent clear and to make them easier to search for. I propose adding a sentence to the first paragraph noting that the level of security varies by algorithm and that over time some of the algorithms are being found to have possible cryptographic weaknesses or exploits. |
|||
msg79297 - (view) | Author: Marc-Andre Lemburg (lemburg) * | Date: 2009-01-06 22:39 | |
On 2009-01-06 23:10, Lukas Lueg wrote:
>> It might be a good idea to remove the word "secure" from the
>> hashlib documentation, since security of these algorithms is
>> always limited to a certain period of time.
>
> I'm sorry, was that a boy attempted humor ? [Misuse quote from DH3: Check]

No, it's the reality of life and one of the reasons why digitally signed data needs to be re-signed every few years in order to keep the data secured and the legal status of the signature intact.

Note that SHA-0 and -1 were broken in 2005: http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html

In Germany, the BSI, which corresponds to the NSA in the US, publishes a list of algorithms each year that are deemed safe, including their expiration year: http://www.bundesnetzagentur.de/enid/Veroeffentlichungen/Algorithmen_sw.html (in German)

They regard SHA-1 as expired by the end of this year. For SHA-2 functions they give 2015 as expiry date.

NIST has similar guidelines: http://csrc.nist.gov/groups/ST/hash/statement.html

They currently suggest using SHA-2 functions for crypto applications, but are also running a new contest for SHA-3: http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/submissions_rnd1.html

> Anyway, in fact that might be a good idea: Reflect that the hashlib
> module includes hash functions for the sake of compatibility and
> interoperability and not everlasting security.

BTW: Not sure what Deer Hunter 3 has to do with all this ;-) http://www.planetdeerhunter.com/dh3 |
|||
msg79298 - (view) | Author: Lukas Lueg (ebfe) | Date: 2009-01-06 22:54 | |
Actually I smelled irony and was referring to Die Hard 3 :-\

> No, it's the reality of life and one of the reasons why digitally
> signed data needs to be resigned every few years in order to keep
> the data secured and the legal status of the signature intact.

I know that of course and that's why I brought this all up.

> I propose adding a sentence to the first paragraph noting that the level
> of security varies by algorithm and that over time some of the
> algorithms are being found to have possible cryptographic weaknesses or
> exploits.

Fine. |
|||
msg79315 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2009-01-07 10:14 | |
> I propose and strongly suggest to start deprecate direct support for MD5
> during this year and completely remove support for it afterwards.

-1. Stopping usage of md5 should be the user's choice, not Python's.

> * MD5 is a cryptographic hash function, its reason for being is
> security. By means of current hardware and attack vectors it's a matter
> of hours to create collisions and fool MD5 hashes. The reason for being
> has come to an end.

I think you misunderstand the kind of problem that has been detected. It is still *not* possible to produce a colliding text within reasonable time when given only the md5 hash. So when md5 is used as the trap function for password storage, its use remains perfectly safe. Likewise, md5 is still well capable of detecting corruption of binary files (e.g. during downloads), and will remain in use for this application for many more years.

It is only in the context of digital signatures that the "chosen prefix" attack can be demonstrated successfully.

> * Python runs an uncountable number of exposed user interfaces on the
> web. Usually the programmers rely on the security of the backing
> libraries. Python can't provide this with MD5.

That's like saying "Mercedes drivers rely on efficient operation of the motor. By putting water into the tank, the motor fails to deliver. So let's put a ban on the usage of water in cars."

> * The functionality of MD5 can be easily replaced by using other hashes
> that are supported by python (e.g. SHA1). They supply comparable
> performance but are not binary-compatible (yay).

In some cases, yes, replacement is easy. In other cases, replacement is not so easy. For example, for password hashes, you cannot simply rehash all passwords, because you typically don't know what they are. |
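To make the password example concrete: since stored hashes cannot be converted offline, one common workaround (not proposed in this thread) is to re-hash at the next successful login, when the plaintext is briefly available. The sketch below assumes a hypothetical record layout and scheme names; `hashlib.pbkdf2_hmac` and `hmac.compare_digest` are real standard-library calls, and the iteration count is only a placeholder.

```python
import hashlib
import hmac
import os


def legacy_md5(password: str, salt: bytes) -> bytes:
    """Legacy scheme assumed for this sketch: MD5 over salt + password."""
    return hashlib.md5(salt + password.encode("utf-8")).digest()


def pbkdf2_sha256(password: str, salt: bytes) -> bytes:
    """Replacement scheme; the iteration count here is illustrative only."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


SCHEMES = {"md5": legacy_md5, "pbkdf2_sha256": pbkdf2_sha256}
CURRENT_SCHEME = "pbkdf2_sha256"


def verify_and_upgrade(record: dict, password: str) -> bool:
    """Check a password; on success, migrate a legacy record to the current scheme."""
    expected = SCHEMES[record["scheme"]](password, record["salt"])
    if not hmac.compare_digest(expected, record["hash"]):
        return False
    if record["scheme"] != CURRENT_SCHEME:
        # The plaintext is only known at this moment, so re-hash it now.
        new_salt = os.urandom(16)
        record.update(
            scheme=CURRENT_SCHEME,
            salt=new_salt,
            hash=SCHEMES[CURRENT_SCHEME](password, new_salt),
        )
    return True


salt = os.urandom(16)
user = {"scheme": "md5", "salt": salt, "hash": legacy_md5("s3cret", salt)}

print(verify_and_upgrade(user, "wrong"), user["scheme"])   # rejected, record untouched
print(verify_and_upgrade(user, "s3cret"), user["scheme"])  # accepted and migrated
```

Accounts that never log in again keep their old hash, which is exactly the limitation the message points out.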
|||
msg79341 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2009-01-07 15:38 | |
For the record, I'm with Martin -- there are many existing uses that we can't just legislate away. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:43 | admin | set | github: 49108 |
2009-01-07 15:39:00 | gvanrossum | set | nosy: + gvanrossum messages: + msg79341 |
2009-01-07 10:14:50 | loewis | set | nosy: + loewis messages: + msg79315 |
2009-01-06 22:54:34 | ebfe | set | messages: + msg79298 |
2009-01-06 22:39:23 | lemburg | set | messages: + msg79297 |
2009-01-06 22:13:36 | rhettinger | set | messages: + msg79296 |
2009-01-06 22:10:04 | ebfe | set | messages: + msg79295 |
2009-01-06 21:59:13 | lemburg | set | messages: + msg79293 |
2009-01-06 21:42:21 | ebfe | set | messages: + msg79291 |
2009-01-06 20:33:14 | gregory.p.smith | set | nosy: + gregory.p.smith messages: + msg79285 |
2009-01-06 20:20:34 | rhettinger | set | status: open -> closed resolution: rejected messages: + msg79283 nosy: + rhettinger |
2009-01-06 20:17:53 | lemburg | set | nosy: + lemburg messages: + msg79282 |
2009-01-06 20:06:15 | ebfe | create |