Message 152029 - Python tracker

Author	amcnabb
Recipients	amcnabb
Date	2012-01-26.19:40:47
Message-id	<1327606849.1.0.885842099772.issue13881@psf.upfronthosting.co.za>

Content
The stream encoder for the zlib_codec doesn't use the incremental encoder, so it has limited usefulness in practice. This is easiest to show with an example.

Here is the behavior with the stream encoder:

>>> import codecs, io
>>> filelike = io.BytesIO()
>>> wrapped = codecs.getwriter('zlib_codec')(filelike)
>>> wrapped.write(b'x')
>>> filelike.getvalue()
b'x\x9c\xab\x00\x00\x00y\x00y'
>>> wrapped.write(b'x')
>>> filelike.getvalue()
b'x\x9c\xab\x00\x00\x00y\x00yx\x9c\xab\x00\x00\x00y\x00y'
>>>

However, this is the behavior of the incremental encoder:

>>> ienc = codecs.getincrementalencoder('zlib_codec')()
>>> ienc.encode(b'x')
b'x\x9c'
>>> ienc.encode(b'x', final=True)
b'\xab\xa8\x00\x00\x01j\x00\xf1'
>>>

The stream encoder is apparently encoding each write as a separate, complete zlib stream, whereas the incremental encoder keeps a single compressor across calls and only emits compressed data once it has enough input or is told that the input is final.
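
To make the difference concrete, here is a minimal sketch that compares the two outputs. It assumes that the per-write output of the stream encoder is equivalent to calling zlib.compress() on each chunk, which matches the bytes shown above; the variable names are only for illustration.

```python
import codecs
import zlib

# Incremental encoder: one compressor shared across calls, so the
# combined output is a single zlib stream.
ienc = codecs.getincrementalencoder('zlib_codec')()
single_stream = ienc.encode(b'x') + ienc.encode(b'x', final=True)
roundtrip = zlib.decompress(single_stream)

# Per-write encoding (assumed equivalent to what the stream encoder
# does): every chunk becomes its own complete zlib stream.
per_write = zlib.compress(b'x') + zlib.compress(b'x')

# A decompressor stops at the end of the first stream. The second
# stream is not decoded and ends up in unused_data.
d = zlib.decompressobj()
first = d.decompress(per_write)
leftover = d.unused_data
```

With the per-write output, a reader has to notice the leftover bytes and start a new decompressor for each stream, and no compression context is shared between writes.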

Fixing this may require addressing a separate issue with stream encoders. Unlike gzip.GzipFile, closing a stream encoder also closes the underlying file. If that file is a BytesIO, closing it frees its buffer, so the completed output can no longer be retrieved.
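
The sketch below illustrates the close behavior and one possible shape for a writer that avoids both problems. IncrementalZlibWriter and its finish() method are hypothetical names invented for this example, not part of the codecs module and not a proposed patch.

```python
import codecs
import gzip
import io
import zlib

# gzip.GzipFile leaves a caller-supplied fileobj open after close().
gz_buf = io.BytesIO()
gz = gzip.GzipFile(fileobj=gz_buf, mode='wb')
gz.write(b'x')
gz.close()
gz_buf_closed = gz_buf.closed

# The codecs stream writer forwards close() to the wrapped stream.
zl_buf = io.BytesIO()
writer = codecs.getwriter('zlib_codec')(zl_buf)
writer.write(b'x')
writer.close()
zl_buf_closed = zl_buf.closed


class IncrementalZlibWriter:
    """Hypothetical writer that shares one incremental encoder across writes."""

    def __init__(self, stream):
        self.stream = stream
        self._enc = codecs.getincrementalencoder('zlib_codec')()

    def write(self, data):
        # May write only part of the compressed data (or just the header)
        # until the compressor has enough input.
        self.stream.write(self._enc.encode(data))

    def finish(self):
        # Flush the compressor but leave the underlying stream open.
        self.stream.write(self._enc.encode(b'', final=True))


buf = io.BytesIO()
w = IncrementalZlibWriter(buf)
w.write(b'x')
w.write(b'x')
w.finish()
result = zlib.decompress(buf.getvalue())
```

Separating finish() from close() is the design choice that matters here: the caller can still read the BytesIO after the compressed stream has been completed.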

History
Date	User	Action	Args
2012-01-26 19:40:49	amcnabb	set	recipients: + amcnabb
2012-01-26 19:40:49	amcnabb	set	messageid: <1327606849.1.0.885842099772.issue13881@psf.upfronthosting.co.za>
2012-01-26 19:40:48	amcnabb	link	issue13881 messages
2012-01-26 19:40:47	amcnabb	create