msg315464 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-18 21:59 |
Build CPython (master in this case - though I originally noticed the problem when building a 3.6 tree) as follows with clang installed:
build$ LD=clang-5.0 LDFLAGS=-fsanitize=undefined CC=clang-5.0 CXX=clang-5.0 CFLAGS=-fsanitize=undefined CXXFLAGS=-fsanitize=undefined ../gpshead/configure
build$ make -j12
...
notice many of the warnings scroll by during the build itself as it executes the interpreter
then execute it yourself at the end and you'll get a bunch of these:
../gpshead/Objects/dictobject.c:547:12: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1145:18: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:2817:15: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:831:27: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1144:18: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1034:15: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:728:11: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1064:9: runtime error: index 64 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:2960:31: runtime error: index 64 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1489:11: runtime error: index 32 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:637:27: runtime error: index 128 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:788:27: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1671:22: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:554:31: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:1223:15: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:876:27: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:2396:15: runtime error: index 32 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:2078:10: runtime error: index 128 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:3584:38: runtime error: index 16 out of bounds for type 'int8_t [8]'
../gpshead/Objects/dictobject.c:3502:38: runtime error: index 64 out of bounds for type 'int8_t [8]'
At issue is the hash table here: https://github.com/python/cpython/blob/3.7/Objects/dict-common.h
which is intentionally meant to be indexed "out of bounds" off the end of the struct.
I'm not a strict C language definition so I don't know if that is _supposed_ to be defined behavior as we all tend to assume it is in C or not. If it is supposed to be okay, we should be able to annotate it as such to avoid the warning under ubsan builds.
If it is not, we need to change the way this is written.
|
msg315466 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-18 22:27 |
related: https://ssl.icu-project.org/trac/ticket/13503
|
msg315473 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2018-04-19 04:25 |
Yeah, I've run into this before. The "correct" thing to do is use C99 VLAs. Unfortunately, that doesn't work for PyDictKeysObject because it really wants a union of VLAs but that isn't supported. The best I could do is making a struct for every possible member of the union. See the attached patch. I didn't land it because it's gross. OTOH, undefined behavior is bad, so I'm okay landing this if you don't find it too revolting.
|
msg315479 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-19 05:54 |
I think it is worth getting such a change in.
its arguable that the compiler should be able to see through the union for this use case but I assume such a fix would only land in a recent clang version.
i'm attaching a variant on your patch that doesn't cast entire structs but just casts the VLA.
thoughts?
|
msg315489 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-19 15:14 |
notably, C99 variable length arrays syntax is not mentioned as allowed in https://www.python.org/dev/peps/pep-0007/. If we want to use VLAs, that should be clarified. But both our solutions should work with [1] instead of [] which is allowed by the ubsan checker.
it means keeping the annoying "- Py_MEMBER_SIZE(...)" of the array on all size calculations (but because of that the patch is smaller).
|
msg315491 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2018-04-19 16:18 |
Your PR is basically what we did prior to 186122ead26f3ae4c2bc9f6715d2a29d339fdc5a. The problem is that may run afoul of different UB, namely strict aliasing. (Though, I suppose we could probably also avoid that by making dk_indices char[].)
If VLAs work with whatever MSVC we're using, it seems fine to add it to PEP 7. At worst, we can #ifdef sanitizer it.
|
msg315492 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-19 16:45 |
By my reading on how strict aliasing works, I think just changing the int64_t[1] or int32_t[1] in my PR to char[1] will work as char is always assumed to alias?
the clang ubsan i was testing my PR against wasn't warning me about strict aliasing. :/
|
msg315493 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-19 17:01 |
I'm going to see what appveyor says with the VLA code on the PR. I've updated it to use char[].
|
msg315494 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-19 17:03 |
My PR is almost a revert of https://github.com/python/cpython/commit/186122ead26f3ae4c2bc9f6715d2a29d339fdc5a so I'm curious what your motivation behind that change to use the union was?
|
msg315495 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2018-04-19 20:35 |
I was fixing strict aliasing problems in the code and I thought the union was cleaner than a bunch of casts.
|
msg315501 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 05:41 |
New changeset 397f1b28c4a12e3b3ed59a89599eabc457412649 by Gregory P. Smith in branch 'master':
bpo-33312: Fix clang ubsan out of bounds warnings in dict. (GH-6537)
https://github.com/python/cpython/commit/397f1b28c4a12e3b3ed59a89599eabc457412649
|
msg315503 - (view) |
Author: Florian Weimer (fweimer) |
Date: 2018-04-20 06:33 |
Why does the code even need the flexible struct member? If you use the surrounding backing store directly, the aliasing issue disappears if the backing store is untyped memory (if not, you have the aliasing problem with the rest of the struct anyway).
|
msg315520 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 16:26 |
If you think this should be written differently, please propose it in a PR so we can see what you are suggesting.
An unbounded member at the end of a struct is quite a common practice in C.
ex: PyBytesObject
|
msg315522 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2018-04-20 16:33 |
Greg, this change has broken some buildbots:
http://buildbot.python.org/all/#/builders/85/builds/934
http://buildbot.python.org/all/#/builders/21/builds/951
|
msg315523 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 17:06 |
New changeset 392520bd78cd18639a27e5d2803c2e1c2bd593a8 by Gregory P. Smith (Miss Islington (bot)) in branch '3.7':
bpo-33312: Fix clang ubsan out of bounds warnings in dict. (GH-6537) (GH-6543)
https://github.com/python/cpython/commit/392520bd78cd18639a27e5d2803c2e1c2bd593a8
|
msg315524 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 17:18 |
ned: yes, it supposedly has. test_gdb. in a manner i cannot reproduce or debug without error output that indicates anything about the problem.
test_gdb passes on my systems.
|
msg315525 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 17:21 |
I see one line in Tools/gdb/libpython.py that may be related. i'll try changing that. the only way i have to _test_ it is to merge it into master.
|
msg315528 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 18:32 |
New changeset 53f67d401df40486fd0fb8fbcf9da725cd37290c by Gregory P. Smith in branch 'master':
bpo-33312: update Tools/gdb/libpython.py to match. (GH-6548)
https://github.com/python/cpython/commit/53f67d401df40486fd0fb8fbcf9da725cd37290c
|
msg315531 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 20:02 |
New changeset 11659d00b9185c8f02ea6b642fa475a80e21f1a9 by Gregory P. Smith (Miss Islington (bot)) in branch '3.7':
bpo-33312: update Tools/gdb/libpython.py to match. (GH-6549)
https://github.com/python/cpython/commit/11659d00b9185c8f02ea6b642fa475a80e21f1a9
|
msg315535 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2018-04-20 20:25 |
Thanks, Greg, looking better!
|
msg315543 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2018-04-20 23:50 |
Sorry for the breakage. I'm glad the fix was that greppable and easy given I had no idea where to start otherwise. :)
This is in for 3.7. The PR for 3.6 is pending, i want to let it out in the next 3.7 builds for a while before i merge it into 3.6 even though I've got pretty high confidence at this point.
issue33321 filed to track getting an ubsan buildbot setup.
Also, thank you Benjamin for your patch and pointing out the strict aliasing issue. so many undefined behaviors. it's a wonder the language even works. :)
|
msg352171 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2019-09-12 14:13 |
automated backport of the gdb/libpython change fails on 3.6, if this is needed there (I haven't looked into the code on that branch), it's a tiny change to apply. i'll leave that to the 3.6 RM to decide if at all.
|
msg352188 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2019-09-12 14:54 |
This issue does not seem to me to be a security issue so would not meet the criteria for backporting to 3.6.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:59 | admin | set | github: 77493 |
2019-09-12 14:54:59 | ned.deily | set | messages:
+ msg352188 |
2019-09-12 14:13:18 | gregory.p.smith | set | status: open -> closed versions:
- Python 3.7, Python 3.8 messages:
+ msg352171
resolution: fixed stage: patch review -> resolved |
2018-04-20 23:50:31 | gregory.p.smith | set | assignee: gregory.p.smith messages:
+ msg315543 |
2018-04-20 20:25:02 | ned.deily | set | messages:
+ msg315535 |
2018-04-20 20:02:40 | gregory.p.smith | set | messages:
+ msg315531 |
2018-04-20 18:32:29 | miss-islington | set | pull_requests:
+ pull_request6246 |
2018-04-20 18:32:12 | gregory.p.smith | set | messages:
+ msg315528 |
2018-04-20 17:27:39 | gregory.p.smith | set | pull_requests:
+ pull_request6245 |
2018-04-20 17:21:39 | gregory.p.smith | set | messages:
+ msg315525 |
2018-04-20 17:18:45 | gregory.p.smith | set | messages:
+ msg315524 |
2018-04-20 17:06:30 | gregory.p.smith | set | messages:
+ msg315523 |
2018-04-20 16:33:08 | ned.deily | set | nosy:
+ ned.deily messages:
+ msg315522
|
2018-04-20 16:26:16 | gregory.p.smith | set | messages:
+ msg315520 |
2018-04-20 06:33:11 | fweimer | set | messages:
+ msg315503 |
2018-04-20 05:43:28 | miss-islington | set | pull_requests:
+ pull_request6240 |
2018-04-20 05:42:39 | miss-islington | set | pull_requests:
+ pull_request6239 |
2018-04-20 05:41:28 | gregory.p.smith | set | messages:
+ msg315501 |
2018-04-19 20:35:31 | benjamin.peterson | set | messages:
+ msg315495 |
2018-04-19 17:03:49 | gregory.p.smith | set | messages:
+ msg315494 |
2018-04-19 17:01:07 | gregory.p.smith | set | messages:
+ msg315493 |
2018-04-19 16:45:41 | gregory.p.smith | set | messages:
+ msg315492 |
2018-04-19 16:18:09 | benjamin.peterson | set | messages:
+ msg315491 |
2018-04-19 15:37:24 | gregory.p.smith | set | stage: needs patch -> patch review pull_requests:
+ pull_request6233 |
2018-04-19 15:14:06 | gregory.p.smith | set | files:
+ maybe-less-horrible-no-vla-gps02.patch
messages:
+ msg315489 |
2018-04-19 13:05:49 | methane | set | nosy:
+ methane
|
2018-04-19 05:54:11 | gregory.p.smith | set | files:
+ maybe-less-horrible-gps01.patch
messages:
+ msg315479 |
2018-04-19 04:25:41 | benjamin.peterson | set | files:
+ horrible.patch keywords:
+ patch messages:
+ msg315473
|
2018-04-18 22:27:28 | gregory.p.smith | set | messages:
+ msg315466 |
2018-04-18 22:01:56 | gregory.p.smith | set | nosy:
+ twouters
|
2018-04-18 22:01:33 | gregory.p.smith | set | nosy:
+ benjamin.peterson, fweimer
|
2018-04-18 21:59:39 | gregory.p.smith | create | |