This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

Author nbargnesi
Recipients nbargnesi
Date 2013-07-11.19:34:01
Message-id <1373571241.88.0.314815867736.issue18430@psf.upfronthosting.co.za>
In-reply-to
Content
Passing an existing file object to the open functions in the gzip, bz2, and lzma libraries can change the position of the underlying fileobj, and not quite in ways one would expect.

Calling peek on the returned file objects (gzip.GzipFile, bz2.BZ2File, and lzma.LZMAFile) will, in one scenario, advance the position of the supplied file object:

    >>> import bz2
    >>> fileobj = open('test.bz2', mode='rb')
    >>> bzfile = bz2.open(fileobj)
    >>>
    >>> # file positions at 0
    >>> assert fileobj.tell() == 0 and bzfile.tell() == 0
    >>>
    >>> bzfile.peek()
    b'Test file.\n'
    >>> fileobj.tell()
    52

If, after the initial peek, we rewind the underlying fileobj and peek again, the behavior changes:

    >>> fileobj.seek(0)
    0
    >>> bzfile.peek()
    b'Test file.\n'
    >>> fileobj.tell()
    0

The second scenario complicates things a bit: the same peek call that moved the underlying fileobj the first time leaves it untouched the second time.

I would be less surprised if the module documentation simply stated the effect on the file position of the file object being used. It would be beautiful, though, if the underlying file object didn't change at all. The latter seems possible since the three modules know whether the file object is seekable.
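
Until then, one way to keep a caller's file object untouched is to let the compression module read through a handle of its own. The following is only a sketch of that idea, not something the modules provide: peek_via_private_handle is a hypothetical helper, and it assumes the file object was opened from a real path (so fileobj.name can be reopened) and that the compressed stream starts at offset 0.

```python
import bz2
import os
import tempfile


def peek_via_private_handle(fileobj):
    """Peek at decompressed data without touching fileobj's position.

    Hypothetical helper: reopens fileobj.name so that bz2 reads through
    a separate handle instead of the caller's file object.
    """
    with bz2.open(fileobj.name, mode='rb') as private:
        return private.peek()


# Build a small test file in a temporary directory (stand-in for test.bz2).
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'test.bz2')
    with open(path, 'wb') as f:
        f.write(bz2.compress(b'Test file.\n'))

    with open(path, 'rb') as fileobj:
        peeked = peek_via_private_handle(fileobj)
        # The caller's handle was never read from, so it has not moved.
        position_after_peek = fileobj.tell()
```

This sidesteps the problem rather than fixing it. Saving fileobj.tell() and seeking back after a peek on the shared file object looks simpler, but the wrapper may then read the same compressed bytes a second time on a later read, so a separate handle seems the safer choice.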

The gzip and lzma modules exhibit similar behavior - gzip for example:

    >>> import gzip
    >>> fileobj = open('test.gz', mode='rb')
    >>> gzfile = gzip.open(fileobj)
    >>>
    >>> # file positions at 0
    >>> assert fileobj.tell() == 0 and gzfile.tell() == 0
    >>>
    >>> gzfile.peek(1)
    b'Test file.\n'
    >>> fileobj.tell()
    36
    >>>
    >>> # rewind, and do it again
    >>> fileobj.seek(0)
    0
    >>>
    >>> gzfile.peek(1)
    b'Test file.\n'
    >>> fileobj.tell()
    0
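
The two scenarios can also be checked for all three modules without any test files. This is a minimal sketch that assumes in-memory io.BytesIO objects behave like the on-disk files above; the names PLAINTEXT and results are only for illustration.

```python
import bz2
import gzip
import io
import lzma

PLAINTEXT = b'Test file.\n'

# Maps module name -> (position before peek, after first peek,
# after rewinding and peeking again).
results = {}
for module in (gzip, bz2, lzma):
    fileobj = io.BytesIO(module.compress(PLAINTEXT))
    wrapper = module.open(fileobj)

    start = fileobj.tell()
    wrapper.peek(1)               # first peek reads from fileobj
    after_first = fileobj.tell()

    fileobj.seek(0)               # rewind, and do it again
    wrapper.peek(1)
    after_second = fileobj.tell()

    results[module.__name__] = (start, after_first, after_second)

for name, positions in results.items():
    print(name, positions)
```

For each module, the first peek should leave the underlying position past 0, and the peek after the rewind should leave it at 0, presumably because the decompressed data is already buffered by then. The exact offsets depend on the compressed size and on how much each module reads at a time.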
History
Date                 User       Action  Args
2013-07-11 19:34:01  nbargnesi  set     recipients: + nbargnesi
2013-07-11 19:34:01  nbargnesi  set     messageid: <1373571241.88.0.314815867736.issue18430@psf.upfronthosting.co.za>
2013-07-11 19:34:01  nbargnesi  link    issue18430 messages
2013-07-11 19:34:01  nbargnesi  create