This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: sndhdr erroneously detects files as vox
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: jbit, r.david.murray, vstinner
Priority: normal Keywords: patch

Created on 2010-07-13 11:47 by jbit, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
sndhdr-fix.patch jbit, 2010-07-13 11:47 Patch which fixes the issue described.
test_sndhdr.tar.gz jbit, 2010-07-13 15:44 Unittest for sndhdr format detection.
Messages (5)
msg110172 - (view) Author: James Lee (jbit) Date: 2010-07-13 11:47
http://svn.python.org/view/python/branches/py3k/Lib/sndhdr.py?r1=56957&r2=56987 seems to have broken sndhdr as it incorrectly detects files as voc (all that time ago and nobody has noticed...):

[jbit@miku]~$ xxd test.wav | head -n 1
0000000: 5249 4646 2478 e001 5741 5645 666d 7420  RIFF$x..WAVEfmt 
[jbit@miku]~$ python3.1
>>> import sndhdr
>>> sndhdr.what("test.wav")
('voc', 0, 1, -1, 8)

I'm a little unsure why it got changed from indexing to startswith() at all, but changes like the following are just totally incorrect.
-    if h[:20] != b'Creative Voice File\032':
+    if h.startswith(b'Creative Voice File\032'):

Attached is a patch against the py3k branch which fixes the issue as well as some other things obviously broken by the same changelist, although it might be a good idea to review the changelist again to make sure there's nothing else obviously broken :)
>>> sndhdr.what("test.wav")
('wav', 48000, 2, -1, 16)
msg110198 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-07-13 14:33
Since email uses this and it doesn't look like anyone else is likely to have much knowledge of this module, I'm going to assign it to myself so I don't forget about it.

This module has no unit tests.  It would be really great if you could create some to at least test for this regression.  I'll be glad to provide guidance on the test suite if you need it, but I don't know all that much about audio files so I'm not in a position to write the tests myself at the moment.

The first changeset is just the translation to bytes, I don't see any issues there.  I'm adding Victor as nosy since he wrote the second changeset, but the fix looks straightforwardly correct to me.

By the way, if you had written r56957 and r56987, roundup would have turned them into links, instead of quoting the link you did supply so that it wasn't usable :)
msg110199 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-07-13 14:34
Oh, and the reason it got changed to startswith is probably a micro-optimization: startswith is more efficient than doing a slice-and-compare.
msg110206 - (view) Author: James Lee (jbit) Date: 2010-07-13 15:44
Thanks for the update... The link was actually just a "diff to previous" of the changelist that caused the problem (r56987), sorry for the confusion. :)

Attached is a quick and dirty unittest complete with some test files. It only tests the format of the file though, not any of the additional information returned.

With an untouched sndhdr.py it finds the issue reported, and with sndhdr-fix.patch applied it reports only one issue:

Traceback (most recent call last):
  File "test_sndhdr.py", line 22, in test_aiff
    self.assertEqual(ret[0], 'aiff')
AssertionError: b'aiff' != 'aiff'

I'm thinking that on line 65 of sndhdr.py the b'aiff' should be just 'aiff' (like everything else). At least I can't see a reason for it to be different. (Changing this fixes the unittest issue)

Cheers
msg110243 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-07-13 23:33
> Oh, and the reason it got changed to startswith is probably 
> a micro-optimization: startswith is more efficient than doing
> a slice-and-compare.

Not really. I changed it because it's more readable (it's easier to understand that the code).

--

James: I commited all your fixes + your test suite to 3.1 (r82859 + r82860) and 3.2 (r82856 + r82857) and  added your name to Misc/ACKS. Thanks for your contribution.
History
Date User Action Args
2022-04-11 14:57:03adminsetgithub: 53489
2010-07-13 23:33:25vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg110243
2010-07-13 15:44:06jbitsetfiles: + test_sndhdr.tar.gz

messages: + msg110206
2010-07-13 14:34:58r.david.murraysetmessages: + msg110199
2010-07-13 14:33:01r.david.murraysetversions: + Python 3.2
nosy: + vstinner, r.david.murray

messages: + msg110198

assignee: r.david.murray
stage: test needed
2010-07-13 11:47:48jbitcreate