
Author terry.reedy
Recipients ckern, jcea, terry.reedy
Date 2012-11-16.20:16:52
Message-id <1353097013.24.0.811804311496.issue16461@psf.upfronthosting.co.za>
In-reply-to
Content
The Wikipedia sentence "The WAV format is limited to files that are less than 4 GB, because of its use of a 32-bit unsigned integer to record the file size header" is unambiguous and appears correct (see below). The rest of the Wikipedia sentence, "(some programs limit the file size to 2 GB)", must be because some programs mistakenly read the size into a signed instead of an unsigned int and fail to adjust afterward (for instance, by casting to unsigned).
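
To make the 2 GB effect concrete, here is a minimal sketch of what happens when the same four size bytes are decoded as signed versus unsigned. The 3,000,000,000-byte size and the variable names are illustrative assumptions, not taken from any particular program.

```python
import struct

# Four little-endian bytes as they might appear in a RIFF size field
# for a hypothetical payload above 2 GB but below 4 GB.
size_field = struct.pack('<L', 3000000000)

as_unsigned = struct.unpack('<L', size_field)[0]  # unsigned, as the spec defines it
as_signed = struct.unpack('<l', size_field)[0]    # what a signed reader would see

# A reader that used the signed type can still recover by adding 2**32
# to a negative result, which is equivalent to casting to unsigned.
recovered = as_signed + 2**32 if as_signed < 0 else as_signed
```

A program that skips the last step sees a negative size for anything above 2**31 - 1 bytes, which would explain a 2 GB limit.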

The statement reflects the original specification given in reference 3, a .pdf. On page 11, it has
'''
The basic building block of a RIFF file is called a chunk. Using C syntax, a chunk can be defined
as follows:
typedef unsigned long DWORD;
typedef DWORD CKSIZE; // 32-bit unsigned size value
typedef struct { // Chunk structure
  CKID ckID; // Chunk type identifier
  CKSIZE ckSize; // Chunk size field (size of ckData)
'''
and on page 19
"<WORD> 16-bit unsigned quantity in Intel format    unsigned int"
INT and LONG are defined as 16 and 32 bit signed versions.

The WAVE specification, starting on p. 56, uses WORD and DWORD, not INT and LONG, for chunk header fields. Certainly, a 2-byte field would have to be unsigned to hold a value such as the standard 44100 CD rate, although samples/sec itself is a 4-byte DWORD in the WAVE format chunk.

http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
(reference 4) summarizes the .wav chunk formats. I think it simply takes for granted that sizes and counts are unsigned.

The patch given did not touch the write format on line 469 (3.3.0):
        self._file.write(struct.pack('<l4s4slhhllhh4s',
The first 'l' is filled with 36 + self._datalength and I believe the whole thing should be '<L4s4sLHHLLHH4s'. The two struct.unpack formats on lines 264 and 266 should also be changed.
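
As a rough sketch of the proposed change, the function below packs a header with the unsigned format. The name pack_wave_header, its default arguments, and every field value other than 36 + datalength are assumptions based on the usual PCM header layout, not code copied from wave.py.

```python
import struct

def pack_wave_header(datalength, nchannels=2, sampwidth=2, framerate=44100):
    """Hypothetical helper: pack the header fields that follow b'RIFF'."""
    WAVE_FORMAT_PCM = 0x0001
    return struct.pack(
        '<L4s4sLHHLLHH4s',
        36 + datalength,                    # RIFF chunk size
        b'WAVE', b'fmt ',
        16,                                 # size of the fmt chunk
        WAVE_FORMAT_PCM,
        nchannels,
        framerate,                          # samples per second
        nchannels * framerate * sampwidth,  # average bytes per second
        nchannels * sampwidth,              # block alignment
        sampwidth * 8,                      # bits per sample
        b'data')

# A data length above 2**31 - 1 packs cleanly with 'L' ...
header = pack_wave_header(3000000000)

# ... whereas the signed code rejects the same size value.
try:
    struct.pack('<l', 36 + 3000000000)
    signed_pack_failed = False
except struct.error:
    signed_pack_failed = True
```

With the signed format, struct raises struct.error at this size rather than writing a wrong value, so large files fail to write at all.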

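A workaround is to write large numbers as signed negatives. (The '<' prefix forces the standard 4- and 2-byte sizes; without it, native 'l' and 'L' are 8 bytes on some platforms and the round trip fails.)
>>> import struct
>>> struct.unpack('<L', struct.pack('<l', 3000000000 - 2**32))
(3000000000,)
>>> struct.unpack('<H', struct.pack('<h', 44100 - 2**16))[0]
44100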
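
The same workaround can be wrapped in a small function. This is only a sketch; as_signed is a hypothetical name, not something the wave or struct modules provide.

```python
import struct

def as_signed(value, bits):
    """Hypothetical helper: reinterpret an unsigned value as two's-complement signed."""
    if value >= 2 ** (bits - 1):
        return value - 2 ** bits
    return value

# The signed pack of the adjusted value yields the same bytes
# as the unsigned pack of the original value.
packed_signed = struct.pack('<l', as_signed(3000000000, 32))
packed_unsigned = struct.pack('<L', 3000000000)
```

Values that already fit in the signed range pass through unchanged, so the helper could be applied to every size field without special-casing.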

It is possible that someone is using this to write CD-quality files. It is also possible that anyone who has tried just gave up.