New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add half-float (16-bit) support to struct module #55943
Comments
Numpy 1.6.0 adds support for a half-float (16-bit) data type, but I am proposing that the struct module be extended to support half floats, and will be providing a patch that implements that behavior. The current numpy implementation (1.6.0b1) uses the 'e' character to indicate half floats; I don't have a vested interest in what character is used. If consensus between CPython and numpy is reached for using a different character, I will update my patches (but I don't plan on participating in the bike shedding prior to that point). For reference: python-ideas thread: numpy-discussion thread: cython-user thread (less relevant to this issue specifically): |
Initial patch; tests pass. A couple notes:
|
Taking a look at the patch, I see you're using the single -> half conversion routine from NumPy. This has the double rounding problem when converting double -> float -> half, so it would be better to use the double -> half routine. I implemented it with 64-bit unsigned ints, so if there are platforms not supporting 64-bit ints, something else such as using the 32-bit routine will have to be done there. |
I'm thinking of taking the current float implementation and wrapping it with something like: #if HAS_INT64_TYPE
// double implementation goes here
#else
// float implementation here (what's in the current patch)
...
#endif Does this seem like a reasonable approach? Is there a CPython-canonical way to spell "HAS_INT64_TYPE" and/or find out what type that is on the current platform? |
I've spelled "HAS_INT64_TYPE" as follows: +#if SIZEOF_LONG == 8 || SIZEOF_LONG_LONG == 8 I left the Unpack2 function using floats, since there's no rounding issues there, and it didn't feel like the extra code complexity was justified. If there's disagreement on that point, I can make a similar change for Unpack2. |
I think this won't work on Windows since there the 64-bit int is generally __int64. If you look at the long long and unsigned long long support in _struct.c you can see a better way to do this: "#ifdef HAVE_LONG_LONG" and "unsigned PY_LONG_LONG" for the type. |
Ahh, yes, much cleaner. Thank you. :) I'll remove the previous version. |
I've only glanced at the patch, but a couple of things: (1) It looks as though the patch assumes that a C double is IEEE 754 binary64 format, and that a C float is IEEE 754 binary32 format. Is that correct? If so, that's a significant break with tradition: Python's supposed to be able to work with the native doubles, no matter what their format. I, for one, wouldn't object if IEEE floating-point were made a requirement for Python >= 3.3, but this change would need to be discussed on the python-dev mailing list. (2) The typedef PY_LONG_LONG npy_uint64; looks wrong to me for two reasons: first, PY_LONG_LONG is a signed type. Second, while it's guaranteed (by the C standard) that unsigned long long will have width at least 64, it's not guaranteed that it'll be exactly 64. Does the code depend on this assumption? Similarly when using unsigned int in the union for the 32-bit code; here it's not even guaranteed by the standards that unsigned int has width at least 32. It may be better to use the PY_UINT64_T type and the HAVE_UINT64_T macros, and similarly for the 32-bit types. |
The patch currently assumes IEEE 754 with byte-order matching the integer type. I see that pyconfig.h defines three possible cases when IEEE 754 doubles are supported, DOUBLE_IS_LITTLE_ENDIAN_IEEE754, DOUBLE_IS_BIG_ENDIAN_IEEE754, and DOUBLE_IS_ARM_MIXED_ENDIAN_IEEE754. The code should have some byte swapping to match it to the integer endianness. If support for non-IEEE systems is still desired, implementing a conversion using isnan/isinf/frexp/ldexp should be pretty easy, though off hand I can't see an efficient way to extract the significand bits. Regarding the PY_LONG_LONG, it should have been unsigned PY_LONG_LONG. Using PY_UINT64_T is better, though, since a bigger PY_LONG_LONG would cause trouble in the union. For the UINT32, maybe just using the double/PY_UINT64_T versions is better, since there is no macro for FLOAT_IS_IEEE754? Falling back to a frexpr implementation if double isn't IEEE or there is no 64-bit integer type may be a reasonable tradeoff to support the few platforms where that's the case, and 2 instead of 3 separate conversion codes is a bit better maintenance-wise. |
There seems to be some disagreement about the double-rounding issue; Mark Dickinson posted on python-ideas that he doesn't think that there is one. If possible, I think that removing code paths that aren't needed are generally a good thing, especially if I'm going to have to add support for non-IEEE floats. I'll start working on the isnan/isinf/frexp/ldexp implementation for the cases where either we don't have a UINT64 type, or doubles aren't IEEE. Given my current schedule, I don't think that I'll have this done today; hopefully tomorrow morning. :) |
There's no disagreement, since they're different cases. Taking an arbitrary double, rounding to float, then rounding to half definitely has double-rounding issues. (And I would recommend constructing an example to add to the test case to make sure you understand what's going on.) Taking two halfs, doing a primitive arithmetic operation on them as floats, then rounding back to half is what Mark was referring to on the list. In NumPy I also tried to err more towards accuracy and performance than having each intermediate be strictly half. The einsum function retains intermediate values as floats when operating on half, for example, and that's what I recommended in the documentation. I'd also suggest adding some more to the test suite here to verify that ties are rounding to the nearest even properly. You can see some tests for this here: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_half.py#L124 |
I see the distinction about the rounding error now. Thanks for clarifying; I've added more tests as suggested. Looking at _struct.c, line 2085:
I'm going to assume that it's okay to allow 'e' to pass through here, since the 'e' type is *always* going to be in IEEE 754 (not "unknown") floating point format, and the handling routines don't rely on the C conversions the way the float/double ones do. Is that a good assumption to make? Also, looking at the line in the native_table declaration:
Does it make sense to have an entry here since there isn't actually a native representation? ATM it's implemented as just figuring out the platform endianness, and then calling the corresponding lp_/lu_ or bp_/bu_ function. |
More questions: The current _PyFloat_Pack4 doesn't round to even: fbits = (unsigned int)(f + 0.5); /* Round */ Is the mismatch going to be a problem? Should I change the _PyFloat_Pack4 implementation while I'm in there? |
Made the _PyFloat_Pack2 match the algo in _PyFloat_Pack4, and did similar for Unpack. This should work on platforms that don't have IEEE 754 floats except for situations where an INF or NAN is unpacked (this is the same as the Unpack4 behavior). This also gets rid of any need for #ifdef'd code (an ad-hoc speed test showed no noticeable difference between the numpy-based version and the CPython-based version of the functions). I've also added a bit of documentation to this version, as well as more tests. I haven't changed anything about the native_table entry or Pack4's non-round-to-even behavior. |
What he said.
No; let's keep the patch as simple as possible. We can open a separate issue for fixing _Pack4 if necessary. Is the failure to round-to-even only for legacy formats, or is it for IEEE formats as well? If the former, it's difficult to care much; if the latter, it should definitely be fixed, but not as part of this issue.
Thanks; haven't looked at the new patch yet, but hope to do so in the next few days. |
Ah; I see it's just for the non-IEEE formats, so not really an issue. When the float format is known, the code just depends on a (float) cast doing the right thing, which it generally will. |
And I second this suggestion. |
All four changes suggested via review from Mark Dickinson have been made. One note: the else if (!(e == 0 && f == 0.0)) { change was made; this makes the Pack2 function parallel the Pack4 function slightly less. I agree that it's cleaner this way, but I've left the original if in as a comment for comparison to Pack4. I'll remove the comment if it's decided that it's better to be clean than parallel. The suggested test cases were added in the v4 patch, just in case that wasn't clear. I'm going to assume that my questions about _struct.c aren't issues unless someone explicitly says they are. Thanks for all of the review and feedback! :) |
Just a few small tweaks. I think you meant to subtract the second term here: Isn't the "x < 0" part unnecessary? Maybe a comment explaining why the signbit function is used, to preserve -0, would help people who otherwise might think to replace it with "x < 0". This is creating a signaling nan. It would be better to create a quiet nan by setting the highest order bit instead of the lowest order bit. Otherwise it looks great to me, good work! |
Updated patch with changes suggested by Mark Wiebe. I also removed the commented-out "if (!(e == 0 && f == 0.0))" part. It feels like the patch is approaching an acceptable state; after both of you agree it's there, what do I need to do next? |
Tests passed, looks good. Code is also looks ok. /home/andrew/projects/py3k/Modules/_struct.c: In function ‘nu_halffloat’: After fixing warnings and structure.rst I like to see that patch in standard library. |
Given than there is no native "half float" in C (AFAIK), does it make sense to have a native variant of half floats? + {'e', sizeof(short), SHORT_ALIGN, nu_halffloat, np_halffloat}, Shouldn't it be 2 rather than sizeof(short)? And why SHORT_ALIGN? Other comments:
I'll let others review the numerical code, if necessary. |
I used half-float in GPU programming and from my perspective it was just native. There are no half-float in C, right. But there are half-floats used in NVIDIA libraries for example and I like to think used format is native and platform-depended in general. Correct me if I'm mistaking. |
By "native", the struct module means there is a corresponding C type and Apparently the patch takes the stance that the corresponding C type (for |
Antoine, agree with you after explanation. |
I'll take a look at the numerics. |
Just another voice for seeing this put in a deployed version of Python. Half-precision floats are becoming more common in applications. Question: does adding this affect math.isnan and math.isinf? |
I don't see how math.isnan or math.isinf would be affected: we're not proposing to make a float16 Python type, so math.isnan would never encounter a float16 value. |
Mark:
I was assuming that this proposal would also make this part of class 'float' just like the other floats returned from struct. Otherwise, you will get goofy cases where someone has to special-case what they get from struct. import math, struct
FullInfinityAsInt = 0x7f800000
ThisFloat = (struct.unpack("f", struct.pack("I", FullInfinityAsInt)))[0]
print("The type is " + str(type(ThisFloat)))
if math.isinf(ThisFloat):
print("This is an infiniity")
else:
print("This is not an infinity") This should work similarly for the new half-precision, I would think. |
Paul: your example works because struct.unpack('f', packed_stuff)[0] returns a regular Python float, backed by C double---i.e., exactly the same type that's returned by struct.unpack('d', packed_stuff)[0]. In effect, the single-precision C value is computed and then converted to a double-precision C value (a lossless conversion) before being wrapped in a Python float as usual. There's no dedicated float32 type, and math.isnan and math.isinf don't have to be aware of anything other than normal Python floats. float16 packing and unpacking would work the same way: on packing, a Python float would be converted to the closest float16 value and then serialised to a pair of bytes; on unpacking, that pair of bytes is (at least conceptually) converted to a float16 value and then to the corresponding float64 value, wrapped up in a Python float. |
I *think* what you are saying is that there will be no float16 type, but the result of struct.unpack('eorwhatever', packed_stuff)[0] will be the same as struct.unpack('f', packed_stuff)[0], yes? If so, that's what I wanted. I don't want a new float16; I want the output of unpacking a half-precision to be the same as unpacking the other two. I didn't see a statement like that earlier in the thread, but I could have missed it. Or maybe this was covered in msg132976? |
Refreshed patch, with a few minor updates:
|
Refreshed and updated patch, cleaning up some style issues, fixing a failure to return -1 on exception in np_halffloat, and removing some endianness-determining casty expressions in favour of using PY_LITTLE_ENDIAN. |
In the docs for note 7, consider expanding the text to describe what a half-float is (1-bit sign, 10 bit significand, 5 bit exponent) and its allowable range. This wasn't necessary for codes "f" and "d" since they have well-known C equivalents, but in this case some elaboration is needed. Also consider including a link to the wikipedia article and perhaps to IEEE 754-2008. Otherwise, he patch looks complete and passes the test suite. I think it is ready to apply. |
Thanks, Raymond. I'll do a final pass and aim to apply this in the next couple of days. |
There's some NaN behaviour that needs fixing in the packing code: packing a NaN currently creates a signalling NaN rather than a quiet NaN, and the sign of a NaN isn't respected. (One can make a strong argument that the sign of the NaN doesn't matter, but we're respecting the sign for '<d' and '<f' formats, so I think we should do the same for '<e'.) I'm working on an updated patch, taking into account the above and Antoine and Raymond's comments. This is the behaviour with the current patch, on OS X 10.10.5. >>> '{:064b}'.format(struct.unpack('<Q', struct.pack('<d', math.nan))[0])
'0111111111111000000000000000000000000000000000000000000000000000'
>>> '{:032b}'.format(struct.unpack('<I', struct.pack('<f', math.nan))[0])
'01111111110000000000000000000000'
>>> '{:016b}'.format(struct.unpack('<H', struct.pack('<e', math.nan))[0])
'0111110000000001'
>>> '{:064b}'.format(struct.unpack('<Q', struct.pack('<d', -math.nan))[0])
'1111111111111000000000000000000000000000000000000000000000000000'
>>> '{:032b}'.format(struct.unpack('<I', struct.pack('<f', -math.nan))[0])
'11111111110000000000000000000000'
>>> '{:016b}'.format(struct.unpack('<H', struct.pack('<e', -math.nan))[0])
'0111110000000001' |
Updated patch:
If there's no further feedback, I'll apply this in the next day or two. |
I missed Antoine's 2015 comments on the test code. Here's one more version of the patch, which addresses those comments. |
Mark probably already knows this, but I found this SO answer to be informative: http://stackoverflow.com/questions/3886988/how-to-distinguish-different-types-of-nan-float-in-python |
New changeset 519bde9db8e0 by Mark Dickinson in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: