classification
Title: Python relies on C undefined behavior float-cast-overflow
Type: behavior Stage: needs patch
Components: Extension Modules, Interpreter Core Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, gregory.p.smith
Priority: normal Keywords:

Created on 2018-05-19 00:09 by gregory.p.smith, last changed 2018-05-19 04:13 by benjamin.peterson.

Messages (2)
msg317074 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2018-05-19 00:09
Clang's undefined behavior sanitizer (UBSan) is flagging several places where CPython relies on float-cast-overflow behavior.  These typically show up where an out-of-bounds floating point value is cast to another type.

The Clang compiler is about to start applying optimizations that alter the behavior previously seen in these undefined cases on some platforms.  We need to make CPython clean of float-cast-overflow errors.

examples:
 _PyTime_DoubleToDenominator - https://github.com/python/cpython/blob/master/Python/pytime.c#L159
 _PyTime_FromFloatObject - https://github.com/python/cpython/blob/master/Python/pytime.c#L389
 getargs double cast to a float - https://github.com/python/cpython/blob/master/Python/getargs.c#L864
 _PyFloat_Pack4 double cast to a float - https://github.com/python/cpython/blob/master/Objects/floatobject.c#L2234

These are found by running a ubsan build with this checker enabled on test_datetime, test_getargs2, test_struct, and test_thread.

There are probably others, but our own test suite happens to trigger these.
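
As an illustration of the double-to-float cases listed above, here is a minimal sketch of a range check performed before narrowing. The helper name double_to_float_checked is hypothetical and is not CPython's code; the sketch assumes the caller wants infinities and NaN passed through and finite out-of-range values rejected.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper (not CPython's implementation).
 * Narrows a double to a float only when the conversion is well defined.
 * Returns true and stores the result in *out on success; returns false
 * for a finite value whose magnitude exceeds FLT_MAX. */
static bool
double_to_float_checked(double d, float *out)
{
    /* Infinities and NaN have float counterparts, so they pass through.
     * A finite value must lie within [-FLT_MAX, FLT_MAX] before the cast. */
    if (isfinite(d) && (d > FLT_MAX || d < -FLT_MAX)) {
        return false;
    }
    *out = (float)d;
    return true;
}
```

A caller could map the false return to an OverflowError or substitute an infinity, depending on which behavior the API needs to preserve.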

In many cases we should replace the cast with correct conversion code that does what we want when the value is out of bounds and has no defined conversion.  In others we might want to raise an OverflowError or ValueError.  But preserving the behavior that existing compilers have produced up until now makes more sense from a code compatibility standpoint (i.e., callers are not expecting an OverflowError when a CPython API takes a float as input but behind the scenes uses a C API that operates on an int64; that is an implementation detail no user should care about).
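
A rough sketch of the two options for the double-to-int64 case follows. Both helper names are hypothetical, not existing CPython functions, and mapping NaN to 0 in the saturating variant is an assumption made only for illustration. The checked variant reports out-of-range input so the caller can raise an error; the saturating variant clamps, which is one possible way to keep a deterministic result without an exception.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* 2^63 is exactly representable as a double; INT64_MAX (2^63 - 1) is not,
 * so the upper bound is tested with >= against 2^63. */
#define TWO_POW_63 9223372036854775808.0

/* Hypothetical helper: returns false instead of performing an
 * out-of-range (undefined) cast. */
static bool
double_to_int64_checked(double d, int64_t *out)
{
    if (isnan(d) || d < -TWO_POW_63 || d >= TWO_POW_63) {
        return false;
    }
    *out = (int64_t)d;  /* in range: truncates toward zero */
    return true;
}

/* Hypothetical helper: clamps out-of-range values to the int64 limits.
 * NaN maps to 0 here purely as an illustrative choice. */
static int64_t
double_to_int64_saturating(double d)
{
    if (isnan(d)) {
        return 0;
    }
    if (d >= TWO_POW_63) {
        return INT64_MAX;
    }
    if (d < -TWO_POW_63) {
        return INT64_MIN;
    }
    return (int64_t)d;
}
```

Neither variant is guaranteed to reproduce what any particular compiler and platform returned for the undefined cast, so matching the existing behavior exactly would need to be checked per call site.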
msg317093 - Author: Benjamin Peterson (benjamin.peterson) (Python committer) Date: 2018-05-19 04:13
You might want to have a look at #20941. Arguably ubsan is too pedantic in some of these cases.
History
Date User Action Args
2018-05-19 04:13:59 benjamin.peterson set nosy: + benjamin.peterson; messages: + msg317093
2018-05-19 00:09:21 gregory.p.smith create