
Author antoine.pietri
Recipients antoine.pietri, christian.heimes, loewis, vstinner
Date 2018-10-08.11:37:02
Message-id <1538998622.72.0.545547206417.issue34930@psf.upfronthosting.co.za>
Content
SHA-1 was broken a while ago. While the general recommendation is to migrate to more recent hashes (like SHA-2 and SHA-3), a lot of industry applications (notably Merkle DAG implementations like Git or blockchains) require backwards compatibility with SHA-1, at least for as long as it takes all their users to transition.

The SHAttered authors published, along with their paper, a reference implementation of a "hardened SHA-1" algorithm: a SHA-1 implementation that uses counter-cryptanalysis to detect inputs that were forged to produce a hash collision. This means that Hardened SHA-1 is a secure hash function that produces the same output as SHA-1 in 99.999999...% of cases, and differs only when two inputs were specifically crafted to collide. The reference implementation is here: https://github.com/cr-marcstevens/sha1collisiondetection

A large part of the industry has adopted Hardened SHA-1 as a temporary replacement for SHA-1, most notably Git under the name "sha1dc": https://github.com/git/git/commit/28dc98e343ca4eb370a29ceec4c19beac9b5c01e

Since CPython has its own implementation of SHA-1, I think it would be a good idea to provide a hardened SHA-1 implementation. So either:

1. we replace the current implementation of sha1 with sha1dc completely, which might be a problem for people who write scripts to detect whether two files collide with classic sha1

2. we replace the current implementation but we keep the old one under a new name, like "sha1_broken" or "sha1_classic", which breaks backwards compatibility in a few marginal cases but the functionality can be trivially restored by changing the name of the hash

3. we keep the current implementation but add a new one under a new name "sha1dc", which probably means most people will stay on a broken implementation for no good reason, but it will be fully backwards-compatible even in the marginal cases

4. we don't implement Hardened SHA-1 at all, and we advise people to change their hash algorithm, while realizing that this solution is not feasible in a lot of cases.
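
To make the trade-off between these options concrete, here is a minimal sketch of what calling code might look like. It assumes option 3, where the hardened variant is registered under the name "sha1dc"; that name is hypothetical and is not available in hashlib today, so the helper falls back to classic SHA-1. The helper names new_sha1 and classic_sha1_collide are made up for illustration.

```python
import hashlib

# Hypothetical algorithm name taken from option 3; hashlib does not
# provide it today, so the lookup below normally falls through.
HARDENED_NAME = "sha1dc"


def new_sha1(data=b"", prefer_hardened=True):
    """Return a SHA-1 hash object, hardened if such a variant is registered."""
    if prefer_hardened and HARDENED_NAME in hashlib.algorithms_available:
        return hashlib.new(HARDENED_NAME, data)
    return hashlib.new("sha1", data)


def classic_sha1_collide(data_a, data_b):
    """Return True if two different byte strings share a classic SHA-1 digest.

    This is the kind of script mentioned in option 1: it relies on
    hashlib.sha1 being the classic algorithm, and would stop detecting
    forged collisions if sha1 were silently replaced by a hardened variant.
    """
    if data_a == data_b:
        return False
    return hashlib.sha1(data_a).digest() == hashlib.sha1(data_b).digest()
```

For ordinary inputs the result is the same either way: new_sha1(b"abc").hexdigest() gives the usual SHA-1 digest whether or not a hardened variant is present. Only inputs crafted to collide would be affected.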

I'd suggest going with either 1. or 2. What would be your favorite option?

Not sure whether this should go in security or enhancement, so I put it in the latter category to be more conservative in issue prioritization. I added the devs who worked the most on Modules/sha1module.c to the nosy list.
History
Date User Action Args
2018-10-08 11:37:02 antoine.pietri set recipients: + antoine.pietri, loewis, vstinner, christian.heimes
2018-10-08 11:37:02 antoine.pietri set messageid: <1538998622.72.0.545547206417.issue34930@psf.upfronthosting.co.za>
2018-10-08 11:37:02 antoine.pietri link issue34930 messages
2018-10-08 11:37:02 antoine.pietri create