This patch simplifies the fast path for floats that fit into a C long and moves it from float.__trunc__ to PyLong_FromDouble().
+---------------------+---------------------+------------------------------+
| Benchmark           | long-from-float-ref | long-from-float              |
+=====================+=====================+==============================+
| int(1.)             | 39.5 ns             | 37.3 ns: 1.06x faster (-6%)  |
+---------------------+---------------------+------------------------------+
| int(2.**20)         | 46.4 ns             | 45.6 ns: 1.02x faster (-2%)  |
+---------------------+---------------------+------------------------------+
| int(2.**30)         | 52.5 ns             | 49.0 ns: 1.07x faster (-7%)  |
+---------------------+---------------------+------------------------------+
| int(2.**60)         | 50.0 ns             | 49.2 ns: 1.02x faster (-2%)  |
+---------------------+---------------------+------------------------------+
| int(-2.**63)        | 76.6 ns             | 48.6 ns: 1.58x faster (-37%) |
+---------------------+---------------------+------------------------------+
| int(2.**80)         | 77.1 ns             | 72.5 ns: 1.06x faster (-6%)  |
+---------------------+---------------------+------------------------------+
| int(2.**120)        | 91.5 ns             | 87.7 ns: 1.04x faster (-4%)  |
+---------------------+---------------------+------------------------------+
| math.ceil(1.)       | 57.4 ns             | 32.9 ns: 1.74x faster (-43%) |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**20)   | 60.5 ns             | 41.3 ns: 1.47x faster (-32%) |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**30)   | 64.2 ns             | 43.9 ns: 1.46x faster (-32%) |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**60)   | 66.3 ns             | 42.3 ns: 1.57x faster (-36%) |
+---------------------+---------------------+------------------------------+
| math.ceil(-2.**63)  | 67.7 ns             | 43.1 ns: 1.57x faster (-36%) |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**80)   | 66.6 ns             | 65.6 ns: 1.01x faster (-1%)  |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**120)  | 79.9 ns             | 80.5 ns: 1.01x slower (+1%)  |
+---------------------+---------------------+------------------------------+
| math.floor(1.)      | 58.4 ns             | 31.2 ns: 1.87x faster (-47%) |
+---------------------+---------------------+------------------------------+
| math.floor(2.**20)  | 61.0 ns             | 39.6 ns: 1.54x faster (-35%) |
+---------------------+---------------------+------------------------------+
| math.floor(2.**30)  | 64.2 ns             | 43.9 ns: 1.46x faster (-32%) |
+---------------------+---------------------+------------------------------+
| math.floor(2.**60)  | 62.1 ns             | 40.1 ns: 1.55x faster (-35%) |
+---------------------+---------------------+------------------------------+
| math.floor(-2.**63) | 64.1 ns             | 39.9 ns: 1.61x faster (-38%) |
+---------------------+---------------------+------------------------------+
| math.floor(2.**80)  | 62.2 ns             | 62.7 ns: 1.01x slower (+1%)  |
+---------------------+---------------------+------------------------------+
| math.floor(2.**120) | 77.0 ns             | 77.8 ns: 1.01x slower (+1%)  |
+---------------------+---------------------+------------------------------+
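
To make it easier to see which benchmark inputs can take the fast path, here is a rough Python model of the range check. It is only a sketch, not the C code from the patch: it assumes a 64-bit C long (as the -2.**63 case suggests; on platforms with a 32-bit long the boundary is lower), and `fits_c_long` and `LONG_BITS` are hypothetical names used for illustration.

```python
import math

# Assumption: a 64-bit C long (typical on Linux/macOS; 32 bits on Windows).
LONG_BITS = 64


def fits_c_long(x: float, bits: int = LONG_BITS) -> bool:
    """Return True if the finite float x lies in [-2**(bits-1), 2**(bits-1))."""
    # The limit is a power of two, so it is exactly representable as a double.
    limit = 2.0 ** (bits - 1)
    return math.isfinite(x) and -limit <= x < limit


# The float arguments used in the benchmarks above.
inputs = [1.0, 2.0**20, 2.0**30, 2.0**60, -(2.0**63), 2.0**80, 2.0**120]
for x in inputs:
    path = "fast path" if fits_c_long(x) else "general path"
    print(f"{x!r:>24}: {path}")
```

Under that assumption, only 2.**80 and 2.**120 fall outside the range, which matches the math.ceil() and math.floor() rows that stay within about 1% of the reference.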
I'm going to speed up the conversion of larger floats in a follow-up PR.