Author nadeem.vawda
Recipients Michael.Fox, nadeem.vawda, pitrou, rhettinger, serhiy.storchaka, vstinner
Date 2013-05-19.17:52:20
Message-id <1368985940.65.0.0946283632411.issue18003@psf.upfronthosting.co.za>
In-reply-to
Content
I agree that making lzma.open() wrap its return value in a BufferedReader
(or BufferedWriter, as appropriate) is the way to go. I'm currently
travelling and don't have my SSH key with me - Serhiy, can you make the
change?

I'll put together a documentation patch that recommends using lzma.open()
rather than LZMAFile directly, and mentions the performance implications.
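As a rough illustration of the wrapping discussed above, here is a minimal sketch that does it by hand: an LZMAFile is created directly and then passed to io.BufferedReader. The in-memory payload and the variable names are made up for the example, and this is not necessarily how lzma.open() itself ends up being implemented.

```python
import io
import lzma

# Build a small .xz payload in memory so the sketch needs no file on disk.
# (The payload and all variable names here are illustrative only.)
payload = b"".join(b"line %d\n" % i for i in range(1000))
compressed = lzma.compress(payload)

# Creating an LZMAFile directly gives the unwrapped object.
raw = lzma.LZMAFile(io.BytesIO(compressed))

# Manual wrapping: BufferedReader serves readline() and iteration
# from its own buffer instead of going through LZMAFile.readline().
buffered = io.BufferedReader(raw)

lines = list(buffered)
buffered.close()  # closing the wrapper also closes the LZMAFile
```

Any speedup from the wrapper depends on the Python version and the data, so it is worth timing both variants on real input.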


> Interestingly, opening in text (i.e. unicode) mode is almost as fast as with a BufferedReader:

This is because opening in text mode returns a TextIOWrapper, which is
written in C, and presumably does its own buffering on top of
LZMAFile.read1() instead of calling LZMAFile.readline().
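A quick way to confirm the return type in text mode is to open a small compressed stream and inspect the object. This is only a sketch: the in-memory data and the variable names are assumptions made for the example, and it shows the type of the returned object, not how it buffers internally.

```python
import io
import lzma

# Illustrative data only: compress two short lines of text in memory.
text = "first line\nsecond line\n"
compressed = lzma.compress(text.encode("utf-8"))

# lzma.open() accepts an existing file object as well as a filename.
with lzma.open(io.BytesIO(compressed), "rt", encoding="utf-8") as f:
    wrapper_type = type(f)   # the text-mode wrapper class
    lines = f.readlines()    # decoded str lines
```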


> From my perspective default wrapping with io.BufferedReader is a great
> idea. I can't think of who would suffer. Maybe someone who wants to
> open thousands of simultaneous streams wouldn't appreciate the memory
> overhead. If that person exists then he would want an option to turn
> it off.

If someone doesn't want the BufferedReader/BufferedWriter, they can
create an LZMAFile directly; we don't plan to remove that possibility. So
I don't think that should be a problem.
History
Date                 User          Action  Args
2013-05-19 17:52:20  nadeem.vawda  set     recipients: + nadeem.vawda, rhettinger, pitrou, vstinner, serhiy.storchaka, Michael.Fox
2013-05-19 17:52:20  nadeem.vawda  set     messageid: <1368985940.65.0.0946283632411.issue18003@psf.upfronthosting.co.za>
2013-05-19 17:52:20  nadeem.vawda  link    issue18003 messages
2013-05-19 17:52:20  nadeem.vawda  create