Issue 32978: Issues with reading large float values in AIFC files

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/77159

classification

Title:	Issues with reading large float values in AIFC files
Type:	behavior	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.9, Python 3.8

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	mark.dickinson, serhiy.storchaka, tim.peters
Priority:	normal	Keywords:	patch

Created on 2018-03-01 16:50 by serhiy.storchaka, last changed 2022-04-11 14:58 by admin.

Pull Requests
URL	Status	Linked	Edit
PR 5952	open	serhiy.storchaka, 2018-03-01 16:53

Messages (10)
msg313098 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-03-01 16:50
The frequency rate is saved as a 80-bit floating point value in AIFC files. There are issues with reading large values. 1. Values with maximal exponent are read as aifc._HUGE_VAL which is less then sys.float_info.max. Thus greater values can be read as lesser values. >>> aifc._read_float(io.BytesIO(b'\x7f\xff\xff\xff\xff\xff\xff\xff\xf8\x00')) 1.79769313486231e+308 >>> aifc._read_float(io.BytesIO(b'\x43\xfe\xff\xff\xff\xff\xff\xff\xf8\x00')) 1.7976931348623157e+308 >>> aifc._read_float(io.BytesIO(b'\x43\xfe\xff\xff\xff\xff\xff\xff\xff\xff')) inf 2. If exponent is not maximal, but large enough, this cause an OverflowError. It would be better consistently return the maximal value or inf. >>> aifc._read_float(io.BytesIO(b'\x44\xfe\xff\xff\xff\xff\xff\xff\xf8\x00')) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/serhiy/py/cpython3.7/Lib/aifc.py", line 198, in _read_float f = (himant * 0x100000000 + lomant) * pow(2.0, expon - 63) OverflowError: (34, 'Numerical result out of range') OverflowError when read a broken aifc file can be unexpected. The proposed PR tries to make reading floats more consistent. I'm not sure it is correct.
msg313102 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2018-03-01 18:12
I'm finding it hard to imagine why you'd ever have a sample rate greater than 1e308 in an audio file. (In fact, it's hard to imagine needing a sample rate greater than 1e6.) Raising an OverflowError for a value that's too large to fit in an IEEE 754 binary64 float seems entirely reasonable.
msg313107 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-03-01 19:01
I'm fine with an OverflowError, but it looks inconsistent to me that for some huge values an OverflowError is not raised, but aifc._HUGE_VAL is returned instead. And I think that using math.ldexp() can be more preferable that pow(2.0, ...), especially for exponent around sys.float_info.max_exp and sys.float_info.min_exp.
msg313136 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2018-03-02 15:49
> And I think that using math.ldexp() can be more preferable that pow(2.0, ...) Yes, absolutely agreed. I'll take a look at the PR.
msg313141 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-03-02 17:14
Thank you for your review Mark. If left the OverflowError propagated I would catch it at the caller place (_read_float() is used only once) and reraise as aifc.Error. OverflowError is not expected exception. It never is raised for valid AIFC files, and the probability of raising it rather of aifc.Error for a random binary file is very small. I suppose that most users of this module don't catch it. See also issue32056. But on other side, if there are files with an exponent of 0x7fff in wild, they are currently opened without errors. Raising an exception can be a regression.
msg313143 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2018-03-02 17:21
> If left the OverflowError propagated I would catch it at the caller place (_read_float() is used only once) and reraise as aifc.Error. Ah, if there's a dedicated exception type already then yes, agreed that raising aifc.Error (with a suitable message) makes much more sense than raising an OverflowError or a ValueError. > if there are files with an exponent of 0x7fff in wild I hope there aren't, and I'd consider any such file to be broken/corrupted. That is, unless there's a common practice of using NaNs or infinities for the frame rate. I suppose it's conceivable that people us this as a placeholder meaning "I don't know the sample rate". No idea whether that's true in practice: I'm not a frequent user of this format.
msg313144 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-03-02 17:27
What if return 80-bit infinities and NaN as 64-bit infinities and NaN and raise an error in the case of finite numbers overflow?
msg313145 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2018-03-02 17:28
> What if return 80-bit infinities and NaN as 64-bit infinities and NaN and raise an error in the case of finite numbers overflow? That's certainly reasonable from a pure floating-point-conversion perspective: it's exactly what I'd expect a general purpose extended-precision to double-precision converter to do.
msg313147 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-03-02 17:31
But I don't know the exact format for infinities and NaNs.
msg313149 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2018-03-02 17:45
> But I don't know the exact format for infinities and NaNs. From the Intel software developer manuals: For infinities: Sign bit: 0 or 1 (positive or negative infinity) Exponent field (15 bits): all ones Significand (64 bits): first bit 1, all remaining bits 0. NaN: Sign bit: 0 or 1 Exponent field (15 bits): all ones Significand (64 bits): first bit 1, at least one of the remaining bits 1.

History
Date	User	Action	Args
2022-04-11 14:58:58	admin	set	github: 77159
2020-01-12 02:01:05	cheryl.sabella	set	versions: + Python 3.9, - Python 2.7, Python 3.6, Python 3.7
2018-03-02 17:45:16	mark.dickinson	set	messages: + msg313149
2018-03-02 17:31:32	serhiy.storchaka	set	messages: + msg313147
2018-03-02 17:28:41	mark.dickinson	set	messages: + msg313145
2018-03-02 17:27:31	serhiy.storchaka	set	messages: + msg313144
2018-03-02 17:21:53	mark.dickinson	set	messages: + msg313143
2018-03-02 17:14:29	serhiy.storchaka	set	messages: + msg313141
2018-03-02 15:49:12	mark.dickinson	set	messages: + msg313136
2018-03-01 19:01:43	serhiy.storchaka	set	messages: + msg313107
2018-03-01 18:12:46	mark.dickinson	set	messages: + msg313102
2018-03-01 16:53:30	serhiy.storchaka	set	keywords: + patch stage: patch review pull_requests: + pull_request5718
2018-03-01 16:50:35	serhiy.storchaka	create