This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Submit the re, json, csv, & struct modules to oss-fuzz testing
Type: security Stage: resolved
Components: Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: Devin Jeanpierre, Guido, alex, ammar2, brett.cannon, christian.heimes, gregory.p.smith, miss-islington, r.david.murray, terry.reedy
Priority: normal Keywords: patch

Created on 2017-02-08 20:54 by gregory.p.smith, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 2878 merged Devin Jeanpierre, 2017-07-26 00:26
PR 3407 merged Devin Jeanpierre, 2017-09-07 00:53
PR 3437 closed Devin Jeanpierre, 2017-09-07 22:39
PR 13907 merged ammar2, 2019-06-08 04:26
PR 13914 merged miss-islington, 2019-06-08 14:43
PR 13915 merged miss-islington, 2019-06-08 14:43
PR 13991 merged ammar2, 2019-06-11 22:14
PR 14005 merged miss-islington, 2019-06-12 04:30
PR 14006 merged miss-islington, 2019-06-12 04:30
PR 14255 merged ammar2, 2019-06-20 03:48
PR 14478 merged miss-islington, 2019-06-30 05:54
PR 14479 closed miss-islington, 2019-06-30 05:55
PR 18679 merged ammar2, 2020-02-28 07:03
PR 28777 merged ammar2, 2021-10-06 21:50
Messages (29)
msg287363 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-02-08 20:54
For reference, read https://github.com/google/oss-fuzz.

We should investigate creating fuzz targets for the Python re module (_sre.c) at a minimum.  There are probably other good targets as well such as _json.c and _csv.c.

pickle and marshal are not intended to be secure so ignore those.
msg287568 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-02-10 19:34
It does not appear to me that targets have to be security critical, though that is certainly a good place to start.  The Chrome tests found 100s of "security vulnerabilities and stability bugs".

The important thing is that there be someone willing to receive and act on reports.  Would 'make public after 90 days' ever be a problem?  AFAIK, most Python security issues are already public here on the tracker from day 1.
msg292821 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-05-02 23:04
Aha, I found an existing issue!

For adding to oss-fuzz, is there a contact email we can use that is connected to a google account? I am tempted to just put gregory.p.smith on there if not. :)




I can volunteer to fuzz some interesting subset of the stdlib. The list I've come up with (by counting uses in my code) is:

the XML parser (which seems to be written in C)
struct (unpack)
the various builtins that parse strings (like int())
hashlib
binascii
datetime's parsing
json


I'd also suggest the ast module, since people do use ast.literal_eval on untrusted strings, but I probably won't do that one myself.



I wrote a fuzz test for json via upstream simplejson, but the bug on github is getting stale: https://github.com/simplejson/simplejson/issues/163

Should I add it to CPython instead?



> We should investigate creating fuzz targets for the Python re module (_sre.c) at a minimum.

If we prioritize based on security risk, I'd argue that this is lower priority than things like json's speedup extension module, because people should generally not pass untrusted strings to the re module: it's very easy to DOS a service with regexes unless you're using RE2 or similar -- which is fuzzed.  In contrast, json is supposed to accept untrusted input and people do that very often.

(OTOH, I would be willing to bet that fuzzing re will yield more bugs than fuzzing json.)
msg293263 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-05-09 00:34
you can list me as a oss-fuzz contact.  use my work email address.

simplejson is worthy but as both it and the python standard library ship separately people use both so they both ultimately deserve fuzzing and fixing on their own so I'd add it to CPython as well.
msg293340 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-05-09 19:08
https://github.com/google/oss-fuzz/pull/583 is the PR to oss-fuzz to add the project. I'm working on actual tests to be submitted here.
msg299183 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-07-26 02:07
As I read 583, they are planning to fuzz 3.6.  Why not branch master?  I think it more likely that we accidentally add a vulnerability to master then that we accidentally close one.
msg299207 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-07-26 04:35
I think they misspoke, it's normal with fuzzing to test against master. The current draft of the code runs this git pull before building/launching any tests:

    git clone --depth 1 https://github.com/python/cpython.git cpython

Speaking of which, I forgot to update this bug thread with the followup PR to actually run CPython's fuzz tests (when they exist): https://github.com/google/oss-fuzz/pull/731. That's where I grabbed the git clone statement from. I think that will be merged after some version of PR 2878 lands in CPython (still in code review / broken).



For Python 2 I guess it's different, and we will test against the 2.7 branch, right?
msg301494 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-06 18:15
New changeset c5bace2bf7874cf47ef56e1d8d19f79ad892eef5 by Gregory P. Smith (Devin Jeanpierre) in branch 'master':
bpo-29505: Add fuzz tests for float(str), int(str), unicode(str) (#2878)
https://github.com/python/cpython/commit/c5bace2bf7874cf47ef56e1d8d19f79ad892eef5
msg301495 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-06 18:18
alright, with that in, feel free to figure out the oss-fuzz configuration side and fire things up Devin. :)
msg301530 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2017-09-06 22:29
GCC complains about the patch:

/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c: In function ‘LLVMFuzzerTestOneInput’:
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_float)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:109:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_int)
 ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:112:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
 #if _Py_FUZZ_YES(fuzz_builtin_unicode)
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/heimes/dev/python/cpython/Modules/_xxtestfuzz/fuzzer.c:115:1: warning: this use of "defined" may not be portable [-Wexpansion-to-defined]
msg301535 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-06 22:47
Huh. I would not have predicted that.

https://gcc.gnu.org/onlinedocs/cpp/Defined.html

I'll send a fix.
msg301732 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-08 21:05
So here's an interesting issue: oss-fuzz requires that the built location be movable. IOW, we build Python into $OUT, and then the $OUT directory gets moved somewhere else and the fuzz test gets run from there. This causes problems because Python can no longer find where the modules it needs are (encodings for example).

First thought: wouldn't it be nice if we could make a prepackaged and hermetic executable that we can move around freely?

Second thought: isn't that "Hermetic Python", as used within Google?

Third thought: doesn't Google have an internal fuzz testing environment we can use, instead of oss-fuzz?

So unless someone says this is a bad idea, I'd propose we not run these in oss-fuzz and instead run them in Google proper. The alternative is if there's a way to make it easy to move Python around -- is there a way to build it s.t. the import path is relative and so on?
msg301759 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-09 04:37
i'd rather make this work in oss-fuzz on cpython.  can you point me to how oss-fuzz works and what it wants to do so i can better understand what it needs?

it it has an expectation that the thing being fuzzed is a single binary with no data or directory tree layout dependencies that can just be plopped somewhere else is not a great assumption to make.

but environment variables _should_ be able to be set to point the python binary at what it needs if it must work that way.
msg301897 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 20:14
> i'd rather make this work in oss-fuzz on cpython.  can you point me to how oss-fuzz works and what it wants to do so i can better understand what it needs?

I don't have any details except for what's in the PR to oss-fuzz (https://github.com/google/oss-fuzz/pull/731)  My understanding is matches what you've said so far:

Python is built to one directory (/out/), but then needs to be run from another directory (/out/ is renamed to /foo/bar/baz/out/). We need python to still work. I have no idea how to do this.

The only suggestion on #python-dev IRC was to statically link a libpython.a, but this doesn't avoid needing to import libraries like "encodings" dynamically, so they still need to be locatable on disk.

Is there a way to build python so that it doesn't use absolute paths to everything, and so that the install can be moved at will? Or is there a way to tell it that it was moved at runtime? (I am unconvinced PYTHONPATH is a maintainable solution, if it works at all...)


oss-fuzz is not going to change away from its model (I asked if they could, they said no), so we're stuck with making Python compatible with it one way or another.  This is why I am so drawn to running the test internally on Google's infrastructure anyway: we already _did_ all this work already, via hermetic python. Doing it a second time, but worse, seems annoying.
msg301901 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 20:55
kcc strongly disagrees though. Copying latest comment:

"""
fwiw - I object to us running any of this internally at Google. We need to be part of the main oss-fuzz project pulling from upstream revisions. Doing this testing within our blackhole of internal stuff adds more work for us internally (read: which we're not going to do) and wouldn't provide results feedback to the upstream CPython project in a useful timely manner.

We must figure out how to get this to build and run on the external oss-fuzz infrastructure
"""
msg301918 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2017-09-11 22:27
misquote.  that was me objecting to running it internally. :)

i believe this is solvable, i haven't had time to spend on this part yet.
msg301922 - (view) Author: Devin Jeanpierre (Devin Jeanpierre) * Date: 2017-09-11 22:39
Oops, so it is. I can't read apparently.

I'll spend my time on making more fuzz tests in the meantime.
msg301932 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-09-12 01:21
Seems like it ought to be possible to use the same hooks that venv uses to make this work, but I haven't looked at the details of how those work.
msg345039 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-06-08 14:43
New changeset a15a7bcaea54e1845ab2abe27e6f583294cd715b by Gregory P. Smith (Ammar Askar) in branch 'master':
bpo-29505: Fix interpreter in fuzzing targets to be relocatable (GH-13907)
https://github.com/python/cpython/commit/a15a7bcaea54e1845ab2abe27e6f583294cd715b
msg345044 - (view) Author: miss-islington (miss-islington) Date: 2019-06-08 15:03
New changeset 22b69da4c38042e923d633530bdafc1b5fb94928 by Miss Islington (bot) in branch '3.8':
bpo-29505: Fix interpreter in fuzzing targets to be relocatable (GH-13907)
https://github.com/python/cpython/commit/22b69da4c38042e923d633530bdafc1b5fb94928
msg345045 - (view) Author: miss-islington (miss-islington) Date: 2019-06-08 15:03
New changeset 6692d35317a45905a043dccae3940ea5d5d84352 by Miss Islington (bot) in branch '3.7':
bpo-29505: Fix interpreter in fuzzing targets to be relocatable (GH-13907)
https://github.com/python/cpython/commit/6692d35317a45905a043dccae3940ea5d5d84352
msg345304 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-06-12 04:30
New changeset a6e190e94b47324f14e22a09200c68b722d55699 by Gregory P. Smith (Ammar Askar) in branch 'master':
bpo-29505: Fuzz json module, enforce size limit on int(x) fuzz (GH-13991)
https://github.com/python/cpython/commit/a6e190e94b47324f14e22a09200c68b722d55699
msg345306 - (view) Author: miss-islington (miss-islington) Date: 2019-06-12 04:47
New changeset 534136ac6790a701e24f364a9b7f1e34bf5f3ce7 by Miss Islington (bot) in branch '3.7':
bpo-29505: Fuzz json module, enforce size limit on int(x) fuzz (GH-13991)
https://github.com/python/cpython/commit/534136ac6790a701e24f364a9b7f1e34bf5f3ce7
msg345307 - (view) Author: miss-islington (miss-islington) Date: 2019-06-12 04:48
New changeset 878227e7217f3363f9c095b7fb8c1dbdde1ec34f by Miss Islington (bot) in branch '3.8':
bpo-29505: Fuzz json module, enforce size limit on int(x) fuzz (GH-13991)
https://github.com/python/cpython/commit/878227e7217f3363f9c095b7fb8c1dbdde1ec34f
msg346916 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-06-30 05:54
New changeset 5cbbbd73a6acb6f96f5d6646aa7498d3dfb1706d by Gregory P. Smith (Ammar Askar) in branch 'master':
bpo-29505: Add more fuzzing for re.compile, re.load and csv.reader (GH-14255)
https://github.com/python/cpython/commit/5cbbbd73a6acb6f96f5d6646aa7498d3dfb1706d
msg346918 - (view) Author: miss-islington (miss-islington) Date: 2019-06-30 06:13
New changeset ffcc161c753a72e7c4237c1e3c433d47b020978e by Miss Islington (bot) in branch '3.8':
bpo-29505: Add more fuzzing for re.compile, re.load and csv.reader (GH-14255)
https://github.com/python/cpython/commit/ffcc161c753a72e7c4237c1e3c433d47b020978e
msg347496 - (view) Author: Guido Vranken (Guido) Date: 2019-07-08 14:11
Hi,

I've built a generic Python fuzzer and submitted it to OSS-Fuzz.

It works by implementing a "def FuzzerRunOne(FuzzerInput):" function in Python in which some arbitrary code is run based on FuzzerInput, which is a bytes object.

This is a more versatile solution than the current re, json, csv fuzzers as it requires no custom C code and adding more fuzzing targets is as easy as writing a new harness in Python and adding a build rule.

Code coverage is measured at both the CPython level (*.c) and the Python level (*.py). CPython is compiled with AddressSanitizer. What this means is that both CPython memory bugs and Python library bugs (excessive memory consumption, hangs, slowdowns, unexpected exceptions) are expected to transpire.

You can see my current set of fuzzers here: https://github.com/guidovranken/python-library-fuzzers

The PR to OSS-Fuzz is https://github.com/google/oss-fuzz/pull/2567

Currently, the only Python maintainer who will be receiving automated bug reports is gpshead. Are there any other developers who normally process Python security bug reports and would like to receive notifications?

Feel free to respond directly in the OSS-Fuzz PR thread.
msg390076 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2021-04-02 16:05
All the modules prescribed in the original bug are now actively being fuzzed, thank you to Devin for the original work on this and Gregory for reviewing all these changes.

I'm closing this now, feel free to make a new bug and nosy me in if there are any other C-based modules commonly exposed to user input that should be fuzzed.
msg403340 - (view) Author: miss-islington (miss-islington) Date: 2021-10-06 23:22
New changeset db72e58ea5940c3942ede9f70cb897510b52fc36 by Ammar Askar in branch 'main':
bpo-29505: Add fuzzer for ast.literal_eval (GH-28777)
https://github.com/python/cpython/commit/db72e58ea5940c3942ede9f70cb897510b52fc36
History
Date User Action Args
2022-04-11 14:58:43adminsetgithub: 73691
2021-10-06 23:22:19miss-islingtonsetmessages: + msg403340
2021-10-06 21:50:05ammar2setpull_requests: + pull_request27111
2021-04-02 16:05:39ammar2setstatus: open -> closed
resolution: fixed
messages: + msg390076

stage: patch review -> resolved
2020-02-28 07:03:08ammar2setnosy: + ammar2
pull_requests: + pull_request18043
2020-02-28 07:02:45gregory.p.smithsettitle: Submit the re, json, & csv modules to oss-fuzz testing -> Submit the re, json, csv, & struct modules to oss-fuzz testing
2019-07-08 14:11:58Guidosetnosy: + Guido
messages: + msg347496
2019-06-30 06:13:22miss-islingtonsetmessages: + msg346918
2019-06-30 05:55:00miss-islingtonsetpull_requests: + pull_request14296
2019-06-30 05:54:54miss-islingtonsetpull_requests: + pull_request14295
2019-06-30 05:54:46gregory.p.smithsetmessages: + msg346916
2019-06-20 03:48:57ammar2setpull_requests: + pull_request14085
2019-06-12 04:48:20miss-islingtonsetmessages: + msg345307
2019-06-12 04:47:44miss-islingtonsetmessages: + msg345306
2019-06-12 04:30:59gregory.p.smithsetmessages: + msg345304
2019-06-12 04:30:53miss-islingtonsetpull_requests: + pull_request13869
2019-06-12 04:30:46miss-islingtonsetpull_requests: + pull_request13868
2019-06-11 22:14:32ammar2setpull_requests: + pull_request13854
2019-06-08 15:03:49miss-islingtonsetmessages: + msg345045
2019-06-08 15:03:08miss-islingtonsetnosy: + miss-islington
messages: + msg345044
2019-06-08 14:43:55miss-islingtonsetpull_requests: + pull_request13788
2019-06-08 14:43:47miss-islingtonsetpull_requests: + pull_request13787
2019-06-08 14:43:19gregory.p.smithsetmessages: + msg345039
2019-06-08 04:26:10ammar2setpull_requests: + pull_request13780
2017-09-12 01:21:27r.david.murraysetnosy: + r.david.murray
messages: + msg301932
2017-09-11 22:39:38Devin Jeanpierresetmessages: + msg301922
2017-09-11 22:27:39gregory.p.smithsetmessages: + msg301918
2017-09-11 20:55:00Devin Jeanpierresetmessages: + msg301901
2017-09-11 20:14:29Devin Jeanpierresetmessages: + msg301897
2017-09-09 04:37:21gregory.p.smithsetmessages: + msg301759
2017-09-08 21:05:33Devin Jeanpierresetmessages: + msg301732
2017-09-07 22:39:44Devin Jeanpierresetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request3434
2017-09-07 00:53:15Devin Jeanpierresetpull_requests: + pull_request3412
2017-09-06 22:47:56Devin Jeanpierresetmessages: + msg301535
2017-09-06 22:29:00christian.heimessetmessages: + msg301530
2017-09-06 18:18:37gregory.p.smithsetassignee: gregory.p.smith
messages: + msg301495
2017-09-06 18:15:41gregory.p.smithsetmessages: + msg301494
2017-07-26 04:35:59Devin Jeanpierresetmessages: + msg299207
2017-07-26 02:07:02terry.reedysetmessages: + msg299183
2017-07-26 00:26:16Devin Jeanpierresetpull_requests: + pull_request2929
2017-05-09 19:08:38Devin Jeanpierresetmessages: + msg293340
2017-05-09 00:34:10gregory.p.smithsetmessages: + msg293263
2017-05-02 23:04:26Devin Jeanpierresetnosy: + Devin Jeanpierre
messages: + msg292821
2017-02-18 05:17:15alexsetnosy: + alex
2017-02-10 19:34:11terry.reedysetnosy: + terry.reedy
messages: + msg287568
2017-02-09 17:38:23brett.cannonsetnosy: + brett.cannon, christian.heimes
2017-02-08 20:54:21gregory.p.smithcreate