Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wave.py cannot write wave files into a shell pipeline #49452

Closed
drj11 mannequin opened this issue Feb 10, 2009 · 20 comments
Closed

wave.py cannot write wave files into a shell pipeline #49452

drj11 mannequin opened this issue Feb 10, 2009 · 20 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@drj11
Copy link
Mannequin

drj11 mannequin commented Feb 10, 2009

BPO 5202
Nosy @terryjreedy, @drj11, @serhiy-storchaka
Dependencies
  • bpo-18919: Unify audio modules tests
  • Files
  • wave-20090210.patch: wave.py patch
  • writeheader_without_tell.diff
  • wave_write_unseekable.patch
  • wave_write_unseekable_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2013-11-16.11:57:36.463>
    created_at = <Date 2009-02-10.11:45:00.295>
    labels = ['type-feature', 'library']
    title = 'wave.py cannot write wave files into a shell pipeline'
    updated_at = <Date 2014-03-08.17:54:16.321>
    user = 'https://github.com/drj11'

    bugs.python.org fields:

    activity = <Date 2014-03-08.17:54:16.321>
    actor = 'python-dev'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2013-11-16.11:57:36.463>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2009-02-10.11:45:00.295>
    creator = 'drj'
    dependencies = ['18919']
    files = ['13011', '13015', '31633', '32575']
    hgrepos = []
    issue_num = 5202
    keywords = ['patch']
    message_count = 20.0
    messages = ['81539', '81541', '81542', '81546', '81547', '81553', '81563', '81601', '81665', '81676', '81735', '81903', '112649', '182113', '196848', '197110', '197111', '202636', '203025', '212938']
    nosy_count = 5.0
    nosy_names = ['terry.reedy', 'gpolo', 'drj', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue5202'
    versions = ['Python 3.4']

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    When using the wave module to output wave files, the output file cannot
    be a Unix pipeline.

    Example. The following program outputs a (trivial) wave file on stdout:

    #!/usr/bin/env python
    import sys
    import wave
    w = wave.open(sys.stdout, 'w')
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(32000)
    w.setnframes(0)
    w.close()

    It can create a wave file like this:

    $ ./bugex > foo.wav

    When used in a pipeline we get:

    $ ./bugex | wc
    Traceback (most recent call last):
      File "./bugex", line 9, in <module>
        w.close()
      File 
    "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/wave.py
    ", line 437, in close
        self._ensure_header_written(0)
      File 
    "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/wave.py
    ", line 458, in _ensure_header_written
        self._write_header(datasize)
      File 
    "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/wave.py
    ", line 465, in _write_header
        self._form_length_pos = self._file.tell()
    IOError: [Errno 29] Illegal seek
    Exception exceptions.IOError: (29, 'Illegal seek') in <bound method 
    Wave_write.__del__ of <wave.Wave_write instance at 0x71418>> ignored
           0       1       8

    The wave module has almost all it needs to work around this problem.
    The wave module will only seek the output if it needs to patch the
    header. If you use setnframes to write the correct number of frames
    before writing them with writeframesraw then the header will not be
    patched upon calling close. However...

    The problem is that the "tell" method is invoked on the output stream
    (to record where the header is, in the event that we need to patch it);
    the "tell" method fails with an exception when the output is a pipeline
    (see example, above).

    Exceptions from "tell" when writing the header initially (in
    _write_header) should be ignored. If _patchheader is later invoked it
    will fail due to lack of pos.

    @drj11 drj11 mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Feb 10, 2009
    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    Attached is a patch which is a diff from this version of wave.py :

    http://svn.python.org/view/*checkout*/python/trunk/Lib/wave.py?rev=54394

    @gpolo
    Copy link
    Mannequin

    gpolo mannequin commented Feb 10, 2009

    Wouldn't it be better if you only ignored the 'illegal seek' error
    instead of ignoring any ioerror (should it even be always discarded) ? I
    get a 'bad file descriptor' under Windows 7, but, again, can it be
    always discarded ?

    You can also reproduce the problem without using wave:

    >> import sys
    >> sys.stdout.tell()

    I'm really unsure about the proposed patch.

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    On 10 Feb 2009, at 12:28, Guilherme Polo wrote:

    Guilherme Polo <ggpolo@gmail.com> added the comment:

    Wouldn't it be better if you only ignored the 'illegal seek' error
    instead of ignoring any ioerror (should it even be always discarded) ?

    No.

    I
    get a 'bad file descriptor' under Windows 7, but, again, can it be
    always discarded ?

    Yes.

    To expand: Observe that the exception is raised when we are writing
    the header for the first time. The exception is not raised when we
    attempt to seek to patch the header, it is raised when we recording
    the file position so that we can seek to it later.

    We record the file position even though we might not use it later
    (the file position is only needed if we need to patch the header).

    So if we don't need to patch the header, we do not need the file
    position. So we can clearly ignore any error in attempting to get
    the file position.

    If we do need to patch the header, then we need the file position.
    If we do not have the file position (because the earlier attempt to
    get it failed), then patching the header will fail when it attempts a
    seek. This seems reasonable to me.

    You can also reproduce the problem without using wave:

    >>> import sys
    >>> sys.stdout.tell()

    That does not reproduce the problem. The problem is not that tell
    raises an exception, the problem is that tell raises an exception and
    it is only being used to get some information that may be not needed
    later. Therefore the exception should be ignored, and a problem
    should only be raised if it turns out that we did need for
    information that we couldn't get.

    I'm really unsure about the proposed patch.

    Noted.

    I also note that my patch can be improved by removing its last 11 lines.

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    On 10 Feb 2009, at 12:28, Guilherme Polo wrote:

    Guilherme Polo <ggpolo@gmail.com> added the comment:

    I'm really unsure about the proposed patch.

    Perhaps my example was too trivial. The point is that if you call
    setnframes then you can get wave.py to avoid patching the header; so
    it does not need to seek on the output file. However, that _still_
    doesn't let you pipe the output, because of the "tell" problem.
    That's what the patch is for.

    Here is a (slightly) less trivial example:

    #!/usr/bin/env python
    import sys
    import wave
    w = wave.open(sys.stdout, 'w')
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(2000)
    w.setnframes(100)
    for _ in range(50): w.writeframesraw('\x00\xff')
    w.close()

    (The wave file that it outputs is 100ms of 1000 Hz sine wave by the way)

    Note the call to setnframes _before_ the data is written. That's
    what means the header does not need to be patched. With my patch
    applied the output of this program can be fed to a pipe.

    If you remove the call to setnframes then the header will need to be
    patched, and this still (correctly, usefully) raises an error with my
    patch applied.

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    On 10 Feb 2009, at 13:02, David Jones wrote:

    I also note that my patch can be improved by removing its last 11
    lines.

    Er, no it can't. What was I thinking?

    @gpolo
    Copy link
    Mannequin

    gpolo mannequin commented Feb 10, 2009

    I see what you want to do, but I fell really uncomfortable by totally
    ignoring IOError. I could get a bad file descriptor under Linux too, and
    I wouldn't like to see it discarded for no reason.

    Now, is there some problem if we remove the calls to the "tell" method
    in _write_header ? See patch attached (tests are very welcome too).

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 10, 2009

    On 10 Feb 2009, at 16:57, Guilherme Polo wrote:

    Guilherme Polo <ggpolo@gmail.com> added the comment:

    Now, is there some problem if we remove the calls to the "tell" method
    in _write_header ? See patch attached (tests are very welcome too).

    Yes

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 11, 2009

    On 10 Feb 2009, at 21:15, David Jones wrote:

    David Jones <drj@pobox.com> added the comment:

    On 10 Feb 2009, at 16:57, Guilherme Polo wrote:

    >
    > Guilherme Polo <ggpolo@gmail.com> added the comment:
    >
    > Now, is there some problem if we remove the calls to the "tell"
    > method
    > in _write_header ? See patch attached (tests are very welcome too).

    Yes

    Ahem. Pardon me for answering you without reading your patch. I
    have now read your patch and it does more than just remove the calls
    to "tell". In fact it looks very fine.

    It makes wave.py more like sunau.py in that it "just knows" what the
    offsets into the header are. I think I like that (especially the way
    you use the struct format string to compute the second offset). It
    also removes that nagging question at the back of my mind: "why does
    wave.py use tell when it could simply just know the offsets, which
    are constant anyway?".

    And it works. How cool is that? I had changed my project to use
    sunau anyway, because that worked with pipes already.

    Tests, you say...

    @gpolo
    Copy link
    Mannequin

    gpolo mannequin commented Feb 11, 2009

    Nice.

    I said tests in hope wave gets more tests, since right one there is a
    single one. I will see if I can produce something.

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 12, 2009

    The following program does a very basic do-i-get-back-what-i-wrote test.
    sunau can't cope; I am investigating.

    #!/usr/bin/env python
    # $Id$
    # Audio File Tests

    import aifc
    import sunau
    import wave
    
    import struct
    import sys
    from StringIO import StringIO
    
    frames = struct.pack('256B', *range(256))
    log = sys.stderr
    
    # Basic test of reproducability.
    # We test that a set of frames (an entirely artifical set, see `frames`, 
    # above) can be written to an audio file and read back again to get the 
    # same set of frames.
    # We test mono/stereo, 8-bit/16-bit, and a few framerates.    
    # As of 2009-02-12 sunau does not pass these tests, so I recommend that 
    # you remove it.
    for af in (aifc, sunau, wave):
      for nchannels in (1, 2):
        for sampwidth in (1, 2):
          for framerate in (11000, 44100, 96000):
            print >> log, "%s %d/%d/%d" % (af.__name__,
              nchannels, sampwidth, framerate)
            f = StringIO()
            w = af.open(f, 'w')
            w.setnchannels(nchannels)
            w.setsampwidth(sampwidth)
            w.setframerate(framerate)
            w.writeframesraw(frames)
            w.close()
            s = f.getvalue()
            f = StringIO(s)
            w = af.open(f)
            assert w.getnchannels() == nchannels
            assert w.getsampwidth() == sampwidth
            assert w.getframerate() == framerate
            assert w.readframes(len(frames)//nchannels//sampwidth) == frames
            assert w.readframes(1) == ''

    @drj11
    Copy link
    Mannequin Author

    drj11 mannequin commented Feb 13, 2009

    On 12 Feb 2009, at 09:00, David Jones wrote:

    David Jones <drj@pobox.com> added the comment:

    The following program does a very basic do-i-get-back-what-i-wrote
    test.
    sunau can't cope; I am investigating.

    I see. sunau uses mu-law compression by default which makes it non-
    invertable. How stupid. Inserting:

             w.setcomptype('NONE', 'Pointless Argument')

    just after setframerate fixes the tests so that all 3 modules pass.

    Of course, this is still only the most very basic test that one might
    want to do. And it doesn't cover the case mentioned in this bug
    report anyway.

    (drat, just found this, should've sent it yesterday)

    @terryjreedy
    Copy link
    Member

    Is this still a problem with 2.7-3.2?
    GP, what state do you think either patch is in?

    @serhiy-storchaka
    Copy link
    Member

    Now, is there some problem if we remove the calls to the "tell" method
    in _write_header ? See patch attached (tests are very welcome too).

    Yes, there is a problem. User can pass already open file to wave.open() and file position can be not 0 at the start of the WAVE file. But you can do with only one tell(). Note a magic number 36 in many places of the code. This is struct.calcsize(wave_header_format).

    Test is needed as well as a documentation change.

    I think this is rather a new feature and should be added only in 3.4. Actually the current behavior is documented: "If *file* is a string, open the file by that name, otherwise treat it as a seekable file-like object."

    @serhiy-storchaka serhiy-storchaka added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Feb 14, 2013
    @serhiy-storchaka
    Copy link
    Member

    Here is corrected patch (it uses relative seek()) with a lot of tests.

    @serhiy-storchaka
    Copy link
    Member

    Oh, I forgot attach a patch. In any case it already slightly outdated.

    After looking at other audio modules I think David's approach is better. It is also used in the chunk module. Here is updated patch with tests (tests are not final, bpo-18919 provides better tests).

    1 similar comment
    @serhiy-storchaka
    Copy link
    Member

    Oh, I forgot attach a patch. In any case it already slightly outdated.

    After looking at other audio modules I think David's approach is better. It is also used in the chunk module. Here is updated patch with tests (tests are not final, bpo-18919 provides better tests).

    @serhiy-storchaka serhiy-storchaka self-assigned this Sep 6, 2013
    @serhiy-storchaka
    Copy link
    Member

    Here is simplified and updated to tip patch.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 16, 2013

    New changeset 6a599249e8b7 by Serhiy Storchaka in branch 'default':
    Issue bpo-5202: Added support for unseekable files in the wave module.
    http://hg.python.org/cpython/rev/6a599249e8b7

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 8, 2014

    New changeset b861c7717c79 by R David Murray in branch 'default':
    whatsnew: Wave_write handles unseekable files. (bpo-5202)
    http://hg.python.org/cpython/rev/b861c7717c79

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants