classification
Title: Security of CPython Builds
Type: enhancement
Stage: resolved
Components: Build, Devguide, Documentation, Windows
Versions:
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: brett.cannon, docs@python, ezio.melotti, ned.deily, paul.moore, phelix, r.david.murray, steve.dower, tim.golden, willingc, zach.ware
Priority: normal
Keywords:

Created on 2015-09-28 08:05 by phelix, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (9)
msg251753 - Author: phelix (phelix) Date: 2015-09-28 08:05
A description of the build and release process for CPython binaries (e.g. for Windows) would be great. Maybe I am missing something? I could not find any information other than the 14-year-old PEP 101, which says: "Notify the experts that they can start building binaries."

For example, how is it ensured that there are no backdoors in the binaries?

Background: For the Namecoin project we are currently discussing the potential necessity of reproducible builds.
msg251773 - Author: Steve Dower (steve.dower) (Python committer) Date: 2015-09-28 16:52
Basically through trusting the people who produce the builds.

You can also verify the hg changeset by looking at sys.version and matching it to the tagged release. If there are any differences between the tagged commit and the one used to build, there will be a "+" in the version (though honestly, there are ways to avoid showing that if someone really wants to hide it, so it isn't a guarantee).
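A minimal sketch of the check described above, assuming the build information appears in sys.version as "(tag:changeset, date, time)" as it did for Mercurial-era releases. The helper names build_info and looks_modified are hypothetical, and the second sample string is invented for illustration. As noted, a clean result is not a guarantee.

```python
import re
import sys


def build_info(version=sys.version):
    """Return (tag, changeset) parsed from a sys.version-style string, or None.

    Assumes the "(tag:changeset, date, time)" layout; other layouts yield None.
    """
    match = re.search(r"\(([^:,()]+):([^,()]+),", version)
    return (match.group(1), match.group(2)) if match else None


def looks_modified(version=sys.version):
    """True if the release number or the changeset carries a trailing '+'."""
    release = version.split()[0]
    info = build_info(version)
    return release.endswith("+") or (info is not None and info[1].endswith("+"))


# A release-style string, and a hypothetical string from a modified checkout.
clean = "3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)]"
dirty = "3.5.0 (v3.5.0:374f501f4567+, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)]"

print(build_info(clean))      # ('v3.5.0', '374f501f4567')
print(looks_modified(clean))  # False
print(looks_modified(dirty))  # True
```

The changeset returned by build_info can then be compared by hand against the tagged release in the repository.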
msg251774 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2015-09-28 16:54
And just as an FYI, while PEP 101 was created 14 years ago, it has been updated regularly (last edit was 13 days ago): https://hg.python.org/peps/log/tip/pep-0101.txt
msg251775 - Author: Paul Moore (paul.moore) (Python committer) Date: 2015-09-28 16:59
Also, the Windows build process is documented in PCbuild/readme.txt - see https://hg.python.org/cpython/file/tip/PCbuild/readme.txt

More generally the devguide documents how to build CPython - https://docs.python.org/devguide/setup.html#compiling
msg251786 - Author: phelix (phelix) Date: 2015-09-28 18:53
@Brett: Thanks for the info, I had not noticed PEP 101 had been updated.

@Paul: Ah, I had not found PCbuild/readme.txt yet. I did look at the devguide, but I got the impression it was mostly meant for debug builds.

> Basically through trusting the people who produce the builds.
I assume these builders are very experienced and well-known developers (thanks, by the way; I like Python very much). I would trust them a very long way.

But it is not their integrity that is in question. Python is so popular that there might be large monetary (and other) incentives to force builders into something. For Bitcoin alone, probably millions of dollars are at stake.

I was only recently made aware of this by Namecoin team members (and by this [1] video about reproducible builds from CCC14), but as far as I can see now, there is a very valid core to their argument.

Our well-respected team member Joseph Bisch has looked into reproducible builds of CPython and concluded that it might be a difficult thing to do with a project as large as Python [2]. But maybe there are other ways to make builds more secure? I realize I am asking a lot here, but build security will certainly become more and more important with time. Could things be improved by getting several developers together to create a secure VM as a starting point that makes reproducible builds easier?

[1] https://media.ccc.de/browse/congress/2014/31c3_-_6240_-_en_-_saal_g_-_201412271400_-_reproducible_builds_-_mike_perry_-_seth_schoen_-_hans_steiner.html#video&t=18
[2] https://forum.namecoin.info/viewtopic.php?p=15869#p15869
msg251790 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2015-09-28 19:16
Well, making the build process more automated would help us, so if someone wants to help make that kind of thing happen it will probably be well received. The platform installer builds (OS X, Windows) are tricky things, though, and a fair amount of knowledge is unfortunately locked up (currently) in Steve's and Ned's brains. It would indeed be good to make that not so (bus factor, if nothing else), so support from additional people willing to put in effort on installer builds will (I'm assuming) be welcome. But, since anything along those lines would require additional time from Steve and Ned, it isn't obvious how best to go about it.

I haven't looked at your linked article, but there might be alternate pathways to your goal based on starting from the source and producing your own installers or embedded python installation.
msg251795 - Author: Steve Dower (steve.dower) (Python committer) Date: 2015-09-28 20:00
I do need to contribute some PEP 101 updates at some point, since the Windows build no longer resembles what is described there, but it's mostly about configuration.

* Install x, y, z
* Obtain extra externals
* Install signing certificate
* Configure non-default settings
* Check out correct repo/branch
* Run tools/msi/buildrelease.cmd
(Optional: install x, configure SSH key, run tools/msi/uploadrelease.cmd)

Since I can't release the PSF signing key or my own GPG key, there is a limit to how automated this configuration can be. The "correct repo/branch/changeset" varies depending on the RM, and not all of the build tests are automatically verified (high chance of false positives that require manual inspection).

Probably the first thing I should do is put the extra externals (binutils, gpg, htmlhelp, redist and wix) onto svn.python.org with the others and grab them automatically. I can add checks for configuration (things like the eol extension not being enabled, for example) and the default build doesn't need a signing certificate, so that's optional too.

But the overriding point is, these things aren't required for most people, and automating them is going to be pretty restrictive. For example, I would have to automate it by detecting VS 2015 and failing if it's not there - otherwise you don't have repeatability - and that's going to prevent people using earlier or later versions of VS for their own uses. HTML Help is basically stable (a.k.a. dead), but gpg and binutils are frequently updated and locking them down is also restrictive. Also, people may want to use their own MinGW or GPG installs, while I don't want to do that, so the model I use for building would be restrictive there too.

Finally, so few people actually want to produce builds that I could do plenty of work to make it easy, and there might still be major issues that are never discovered because nobody else uses it.
msg251796 - Author: Steve Dower (steve.dower) (Python committer) Date: 2015-09-28 20:04
Having read your link [2] above (at least briefly), it seems the aim is to compare hashes of builds from multiple people to verify that nobody maliciously modified the binaries.

That isn't going to work for Windows because we cryptographically sign the binaries. The only people who could produce bit-for-bit identical builds are those trusted by the PSF, and not independent people. So if you don't trust the PSF and implicitly the people trusted by the PSF, you can't actually do anything besides building your own version and using that.

However, the rest of the build is so automated that other personal variations will not occur. As I mentioned above, I have exactly one batch file to build the full span of releases for Windows, and I just run that. It's public and in the repo, so anyone else can also run it, they just won't get bit-for-bit identical builds because of timestamps, embedded paths, and certificates.
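The hash comparison discussed in this message could be sketched roughly as follows. This is only an illustration: the helper names (sha256_of, group_by_digest, builds_agree) and the builder names are hypothetical, and it assumes each builder publishes a SHA-256 digest of the same artifact.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(digests_by_builder):
    """Map each reported digest to the list of builders who reported it."""
    groups = {}
    for builder, digest in digests_by_builder.items():
        groups.setdefault(digest, []).append(builder)
    return groups


def builds_agree(digests_by_builder):
    """True only if at least one digest was reported and all of them match."""
    return len(group_by_digest(digests_by_builder)) == 1
```

Each builder would run sha256_of on their own output and publish the result. A single group from group_by_digest means every build was bit-for-bit identical; more than one group shows which builders diverged. For the signed Windows binaries described above, the digests would differ by construction.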
msg251801 - Author: phelix (phelix) Date: 2015-09-28 21:19
Thank you all for your responses.

> Having read your link [2] above (at least briefly), it seems the aim is to compare hashes of builds from multiple people to verify that nobody maliciously modified the binaries.
Exactly. It might also protect the people actually doing the builds from extortion and from accusations by backdoor victims (e.g. in case of a hacked build system).

> That isn't going to work for Windows because we cryptographically sign the binaries. The only people who could produce bit-for-bit identical builds are those trusted by the PSF, and not independent people. So if you don't trust the PSF and implicitly the people trusted by the PSF, you can't actually do anything besides building your own version and using that.
Joseph tried just that but ran into issues.

> However, the rest of the build is so automated that other personal variations will not occur. As I mentioned above, I have exactly one batch file to build the full span of releases for Windows, and I just run that. It's public and in the repo, so anyone else can also run it, they just won't get bit-for-bit identical builds because of timestamps, embedded paths, and certificates.
Timestamps and paths should be handled by the Gitian secure build system (cross compile).
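To illustrate why timestamps matter for bit-for-bit comparison, here is a small sketch using a zip archive as a stand-in for a build artifact. It is not how Gitian or the CPython installer works; build_archive and the file names are hypothetical. Identical inputs hash differently when the recorded timestamp differs, and identically once the timestamp is pinned.

```python
import hashlib
import io
import zipfile


def build_archive(files, timestamp):
    """Build an in-memory zip from {name: bytes}, stamping every entry with `timestamp`.

    `timestamp` is a (year, month, day, hour, minute, second) tuple, year >= 1980.
    Entries are written in sorted order so ordering cannot vary between runs.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=timestamp)
            archive.writestr(info, files[name])
    return buffer.getvalue()


def digest(data):
    """Hex SHA-256 of a bytes object."""
    return hashlib.sha256(data).hexdigest()


files = {"python.exe": b"binary contents", "readme.txt": b"docs"}
pinned = (2015, 9, 28, 0, 0, 0)

# Same inputs, same pinned timestamp: identical bytes.
print(digest(build_archive(files, pinned)) == digest(build_archive(files, pinned)))

# Same inputs, different build time: the bytes differ.
print(digest(build_archive(files, pinned)) == digest(build_archive(files, (2015, 9, 29, 0, 0, 0))))
```

The first comparison prints True and the second prints False, which is the effect a reproducible build system has to neutralize.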

From my point of view, this issue can be closed, as my questions are answered. We will take another look at building reproducibly. If we run into problems, I will create another issue here in the hope you can help again. :)
History
Date                 User            Action  Args
2022-04-11 14:58:21  admin           set     github: 69442
2015-09-28 21:24:34  r.david.murray  set     status: open -> closed
                                             resolution: not a bug
                                             stage: resolved
2015-09-28 21:19:20  phelix          set     messages: + msg251801
2015-09-28 20:04:41  steve.dower     set     messages: + msg251796
2015-09-28 20:00:51  steve.dower     set     messages: + msg251795
2015-09-28 19:16:50  r.david.murray  set     assignee: docs@python ->
                                             messages: + msg251790
                                             nosy: + ned.deily, r.david.murray
2015-09-28 18:53:23  phelix          set     messages: + msg251786
2015-09-28 16:59:48  paul.moore      set     messages: + msg251775
2015-09-28 16:54:20  brett.cannon    set     nosy: + brett.cannon
                                             messages: + msg251774
2015-09-28 16:52:22  steve.dower     set     messages: + msg251773
2015-09-28 08:05:18  phelix          create