Title: Submit the re, json, & csv modules to oss-fuzz testing
Type: security Stage: patch review
Components: Versions: Python 3.7, Python 3.6, Python 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: Devin Jeanpierre, alex, brett.cannon, christian.heimes, gregory.p.smith, r.david.murray, terry.reedy
Priority: normal Keywords: patch

Created on 2017-02-08 20:54 by gregory.p.smith, last changed 2017-09-12 01:21 by r.david.murray.

Pull Requests
URL Status Linked Edit
PR 2878 merged Devin Jeanpierre, 2017-07-26 00:26
PR 3407 merged Devin Jeanpierre, 2017-09-07 00:53
PR 3437 open Devin Jeanpierre, 2017-09-07 22:39
Messages (18)
msg287363 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-02-08 20:54
For reference, read

We should investigate creating fuzz targets for the Python re module (_sre.c) at a minimum.  There are probably other good targets as well such as _json.c and _csv.c.

pickle and marshal are not intended to be secure so ignore those.
msg287568 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-02-10 19:34
It does not appear to me that targets have to be security critical, though that is certainly a good place to start.  The Chrome tests found 100s of "security vulnerabilities and stability bugs".

The important thing is that there be someone willing to receive and act on reports.  Would 'make public after 90 days' ever be a problem?  AFAIK, most Python security issues are already public here on the tracker from day 1.
msg292821 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-05-02 23:04
Aha, I found an existing issue!

For adding to oss-fuzz, is there a contact email we can use that is connected to a google account? I am tempted to just put gregory.p.smith on there if not. :)

I can volunteer to fuzz some interesting subset of the stdlib. The list I've come up with (by counting uses in my code) is:

the XML parser (which seems to be written in C)
struct (unpack)
the various builtins that parse strings (like int())
datetime's parsing

I'd also suggest the ast module, since people do use ast.literal_eval on untrusted strings, but I probably won't do that one myself.

I wrote a fuzz test for json via upstream simplejson, but the bug on github is getting stale:

Should I add it to CPython instead?

> We should investigate creating fuzz targets for the Python re module (_sre.c) at a minimum.

If we prioritize based on security risk, I'd argue that this is lower priority than things like json's speedup extension module, because people should generally not pass untrusted strings to the re module: it's very easy to DOS a service with regexes unless you're using RE2 or similar -- which is fuzzed.  In contrast, json is supposed to accept untrusted input and people do that very often.

(OTOH, I would be willing to bet that fuzzing re will yield more bugs than fuzzing json.)
msg293263 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-05-09 00:34
you can list me as a oss-fuzz contact.  use my work email address.

simplejson is worthy but as both it and the python standard library ship separately people use both so they both ultimately deserve fuzzing and fixing on their own so I'd add it to CPython as well.
msg293340 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-05-09 19:08 is the PR to oss-fuzz to add the project. I'm working on actual tests to be submitted here.
msg299183 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-07-26 02:07
As I read 583, they are planning to fuzz 3.6.  Why not branch master?  I think it more likely that we accidentally add a vulnerability to master then that we accidentally close one.
msg299207 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-07-26 04:35
I think they misspoke, it's normal with fuzzing to test against master. The current draft of the code runs this git pull before building/launching any tests:

    git clone --depth 1 cpython

Speaking of which, I forgot to update this bug thread with the followup PR to actually run CPython's fuzz tests (when they exist): That's where I grabbed the git clone statement from. I think that will be merged after some version of PR 2878 lands in CPython (still in code review / broken).

For Python 2 I guess it's different, and we will test against the 2.7 branch, right?
msg301494 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-06 18:15
New changeset c5bace2bf7874cf47ef56e1d8d19f79ad892eef5 by Gregory P. Smith (Devin Jeanpierre) in branch 'master':
bpo-29505: Add fuzz tests for float(str), int(str), unicode(str) (#2878)
msg301495 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-06 18:18
alright, with that in, feel free to figure out the oss-fuzz configuration side and fire things up Devin. :)
msg301530 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-09-06 22:29
GCC complains about the patch:

/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c: In function ‘LLVMFuzzerTestOneInput’:
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_float)
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_int)
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_unicode)
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
msg301535 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-06 22:47
Huh. I would not have predicted that.

I'll send a fix.
msg301732 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-08 21:05
So here's an interesting issue: oss-fuzz requires that the built location be movable. IOW, we build Python into $OUT, and then the $OUT directory gets moved somewhere else and the fuzz test gets run from there. This causes problems because Python can no longer find where the modules it needs are (encodings for example).

First thought: wouldn't it be nice if we could make a prepackaged and hermetic executable that we can move around freely?

Second thought: isn't that "Hermetic Python", as used within Google?

Third thought: doesn't Google have an internal fuzz testing environment we can use, instead of oss-fuzz?

So unless someone says this is a bad idea, I'd propose we not run these in oss-fuzz and instead run them in Google proper. The alternative is if there's a way to make it easy to move Python around -- is there a way to build it s.t. the import path is relative and so on?
msg301759 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-09 04:37
i'd rather make this work in oss-fuzz on cpython.  can you point me to how oss-fuzz works and what it wants to do so i can better understand what it needs?

it it has an expectation that the thing being fuzzed is a single binary with no data or directory tree layout dependencies that can just be plopped somewhere else is not a great assumption to make.

but environment variables _should_ be able to be set to point the python binary at what it needs if it must work that way.
msg301897 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 20:14
> i'd rather make this work in oss-fuzz on cpython.  can you point me to how oss-fuzz works and what it wants to do so i can better understand what it needs?

I don't have any details except for what's in the PR to oss-fuzz (  My understanding is matches what you've said so far:

Python is built to one directory (/out/), but then needs to be run from another directory (/out/ is renamed to /foo/bar/baz/out/). We need python to still work. I have no idea how to do this.

The only suggestion on #python-dev IRC was to statically link a libpython.a, but this doesn't avoid needing to import libraries like "encodings" dynamically, so they still need to be locatable on disk.

Is there a way to build python so that it doesn't use absolute paths to everything, and so that the install can be moved at will? Or is there a way to tell it that it was moved at runtime? (I am unconvinced PYTHONPATH is a maintainable solution, if it works at all...)

oss-fuzz is not going to change away from its model (I asked if they could, they said no), so we're stuck with making Python compatible with it one way or another.  This is why I am so drawn to running the test internally on Google's infrastructure anyway: we already _did_ all this work already, via hermetic python. Doing it a second time, but worse, seems annoying.
msg301901 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 20:55
kcc strongly disagrees though. Copying latest comment:

fwiw - I object to us running any of this internally at Google. We need to be part of the main oss-fuzz project pulling from upstream revisions. Doing this testing within our blackhole of internal stuff adds more work for us internally (read: which we're not going to do) and wouldn't provide results feedback to the upstream CPython project in a useful timely manner.

We must figure out how to get this to build and run on the external oss-fuzz infrastructure
msg301918 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-11 22:27
misquote.  that was me objecting to running it internally. :)

i believe this is solvable, i haven't had time to spend on this part yet.
msg301922 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 22:39
Oops, so it is. I can't read apparently.

I'll spend my time on making more fuzz tests in the meantime.
msg301932 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-09-12 01:21
Seems like it ought to be possible to use the same hooks that venv uses to make this work, but I haven't looked at the details of how those work.
Date User Action Args
2017-09-12 01:21:27r.david.murraysetnosy: + r.david.murray
messages: + msg301932
2017-09-11 22:39:38Devin Jeanpierresetmessages: + msg301922
2017-09-11 22:27:39gregory.p.smithsetmessages: + msg301918
2017-09-11 20:55:00Devin Jeanpierresetmessages: + msg301901
2017-09-11 20:14:29Devin Jeanpierresetmessages: + msg301897
2017-09-09 04:37:21gregory.p.smithsetmessages: + msg301759
2017-09-08 21:05:33Devin Jeanpierresetmessages: + msg301732
2017-09-07 22:39:44Devin Jeanpierresetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request3434
2017-09-07 00:53:15Devin Jeanpierresetpull_requests: + pull_request3412
2017-09-06 22:47:56Devin Jeanpierresetmessages: + msg301535
2017-09-06 22:29:00christian.heimessetmessages: + msg301530
2017-09-06 18:18:37gregory.p.smithsetassignee: gregory.p.smith
messages: + msg301495
2017-09-06 18:15:41gregory.p.smithsetmessages: + msg301494
2017-07-26 04:35:59Devin Jeanpierresetmessages: + msg299207
2017-07-26 02:07:02terry.reedysetmessages: + msg299183
2017-07-26 00:26:16Devin Jeanpierresetpull_requests: + pull_request2929
2017-05-09 19:08:38Devin Jeanpierresetmessages: + msg293340
2017-05-09 00:34:10gregory.p.smithsetmessages: + msg293263
2017-05-02 23:04:26Devin Jeanpierresetnosy: + Devin Jeanpierre
messages: + msg292821
2017-02-18 05:17:15alexsetnosy: + alex
2017-02-10 19:34:11terry.reedysetnosy: + terry.reedy
messages: + msg287568
2017-02-09 17:38:23brett.cannonsetnosy: + brett.cannon, christian.heimes
2017-02-08 20:54:21gregory.p.smithcreate