classification
Title: distutils.command.upload md5_digest
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, cstratak, dstufft, eric.araujo, gregory.p.smith, miss-islington
Priority: normal Keywords: patch

Created on 2020-05-20 11:32 by christian.heimes, last changed 2020-05-20 14:59 by christian.heimes. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 20260 merged christian.heimes, 2020-05-20 13:22
PR 20261 merged miss-islington, 2020-05-20 14:37
Messages (8)
msg369442 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-05-20 11:32
The distutils upload command creates an MD5 digest of the file content. This is not compatible with systems that run under a strict security policy that blocks MD5.

Possible fixes are:

* Declare that the MD5 digest is not used for security. Security is provided by TLS/SSL and HTTPS. The digest is just a simple checksum to detect file corruption during upload.
* Remove the MD5 digest completely
* Don't create an MD5 digest if ``hashlib.md5(content)`` fails
* Skip the test case if MD5 is not available

Does PyPI support other digests, e.g. SHA2-256 digest?
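
For illustration, the first and third options above can be combined in a few lines. This is only a sketch, not the patch that was merged: the helper name ``checksum_md5`` is hypothetical, and it assumes Python 3.9 or later, where the ``hashlib`` constructors accept the ``usedforsecurity`` keyword. Whether a given security policy still rejects MD5 with that flag set depends on the platform, so the ``ValueError`` fallback stays in place.

```python
import hashlib


def checksum_md5(content: bytes):
    """Return an MD5 hex digest used purely as a corruption checksum.

    Hypothetical helper for illustration. Returns None when the
    platform refuses to provide MD5 at all.
    """
    try:
        # usedforsecurity=False (Python 3.9+) marks the digest as a
        # non-security checksum, which restricted builds may allow.
        return hashlib.md5(content, usedforsecurity=False).hexdigest()
    except ValueError:
        # MD5 is blocked entirely; the caller should omit the field.
        return None
```

A caller would then only add the ``md5_digest`` field when the returned value is not ``None``.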
msg369446 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-05-20 11:50
Charis pointed me to https://github.com/pypa/warehouse/issues/681 / https://github.com/pypa/warehouse/pull/891
msg369447 - (view) Author: Donald Stufft (dstufft) * (Python committer) Date: 2020-05-20 11:55
> Does PyPI support other digests, e.g. SHA2-256 digest?

There is a simple and a complicated answer to this.

The simple answer is yes, PyPI supports uploads with any combination of MD5, SHA256, and blake2_256 (blake2b with a 256-bit digest, no personalization or key). It will also compute all 3 on upload on its own, verify that they match any provided hashes, and fill in any missing hashes.

The more complicated answer is that the upload API is an old API from long before we started documenting and standardizing them, so when you start talking about non-PyPI implementations of that API, what they support is kind of a big who knows.

More to the problem at hand:

We don't rely on this hash for security (We couldn't, it comes in the exact same payload as the artifact itself from the exact same source, someone who can modify the artifact en route can modify the hash too). So the inclusion of MD5 is not a concern.

Removing it *might* break non-PyPI servers that attempted to implement this API and assumed it was a mandatory field (though I do not have any a priori knowledge of this being the case).

Adding additional hashes *might* break non-PyPI servers that assumed what distutils used to send was all it would ever send (this is unlikely though, most web tools ignore unknown form fields).

I looked into what twine is doing here, and it appears it is sending md5, sha256, and blake2_256 hashes all along with every request. However if FIPS mode has disabled MD5 it just skips generating and sending MD5 (but still sends the other two) and it appears it's done this for 2+ years.

It's probably safe to just mimic what twine is doing here: send all 3 hashes, and skip MD5 if it's unavailable.
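
A minimal sketch of that approach, for illustration only. It is not the code from twine or from the eventual distutils patch. The helper name ``upload_digests`` is hypothetical, and the form field names ``sha256_digest`` and ``blake2_256_digest`` are assumptions modelled on the existing ``md5_digest`` field.

```python
import hashlib


def upload_digests(content: bytes) -> dict:
    """Compute the digests to send along with an uploaded file.

    Hypothetical helper; the sha256/blake2 field names are assumed.
    """
    digests = {
        "sha256_digest": hashlib.sha256(content).hexdigest(),
        # blake2b with a 256-bit (32-byte) digest, no key or personalization
        "blake2_256_digest": hashlib.blake2b(content, digest_size=32).hexdigest(),
    }
    try:
        digests["md5_digest"] = hashlib.md5(content).hexdigest()
    except ValueError:
        # MD5 is blocked (e.g. FIPS mode); send the other two hashes only.
        pass
    return digests
```

Because the server computes all three hashes itself and fills in any that are missing, leaving out ``md5_digest`` should not lose information on PyPI; other servers may behave differently.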
msg369448 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2020-05-20 11:55
There is also https://github.com/pypa/warehouse/pull/888

So I would assume it's safe to change the digest to sha256.
msg369452 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-05-20 13:40
Thanks for your elaborate explanation, Donald!

I have implemented your proposal in PR 20260.
msg369457 - (view) Author: miss-islington (miss-islington) Date: 2020-05-20 14:37
New changeset e572c7f6dbe5397153803eab256e4a4ca3384f80 by Christian Heimes in branch 'master':
bpo-40698: Improve distutils upload hash digests (GH-20260)
https://github.com/python/cpython/commit/e572c7f6dbe5397153803eab256e4a4ca3384f80
msg369458 - (view) Author: miss-islington (miss-islington) Date: 2020-05-20 14:57
New changeset f541a371a5e608517314a106012e0c19739d2d02 by Miss Islington (bot) in branch '3.9':
bpo-40698: Improve distutils upload hash digests (GH-20260)
https://github.com/python/cpython/commit/f541a371a5e608517314a106012e0c19739d2d02
msg369459 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2020-05-20 14:59
Thanks Charis and Donald!
History
Date User Action Args
2020-05-20 14:59:09 christian.heimes set status: open -> closed
resolution: fixed
messages: + msg369459

stage: patch review -> resolved
2020-05-20 14:57:15 miss-islington set messages: + msg369458
2020-05-20 14:37:42 miss-islington set pull_requests: + pull_request19547
2020-05-20 14:37:33 miss-islington set nosy: + miss-islington
messages: + msg369457
2020-05-20 13:40:17 christian.heimes set messages: + msg369452
2020-05-20 13:22:34 christian.heimes set keywords: + patch
stage: patch review
pull_requests: + pull_request19546
2020-05-20 11:55:10 cstratak set nosy: + cstratak
messages: + msg369448
2020-05-20 11:55:02 dstufft set messages: + msg369447
2020-05-20 11:50:21 christian.heimes set messages: + msg369446
2020-05-20 11:33:52 christian.heimes set nosy: + gregory.p.smith, eric.araujo, dstufft
2020-05-20 11:32:43 christian.heimes create