
Author Michael.Fox
Recipients Arfrever, Michael.Fox, nadeem.vawda, pitrou, rhettinger, serhiy.storchaka, vstinner
Date 2013-05-20.16:41:58
Message-id <CABbL6oZRuVE7WznVmKyjbSMNbFcjpr660PJpgZWGij_c7VHZtA@mail.gmail.com>
In-reply-to <1369060963.08.0.432691945326.issue18003@psf.upfronthosting.co.za>
Content
You're right. In fact, what doesn't make sense is to be doing
line-oriented reads on a binary file. Why was I doing that?

I do have another quibble though. The open() function is like this:

open(file, mode='r', buffering=-1, encoding=None,
         errors=None, newline=None, closefd=True, opener=None) -> file object

The lzma.open() function is like this:

lzma.open(filename, mode='rb', *, format=None, check=-1,
          preset=None, filters=None, encoding=None, errors=None, newline=None)

It seems to me that it would be best for them to be as congruent as
possible. Because people will try to do this (I did):

if filename.endswith('.xz'):
    f = lzma.open(filename)
else:
    f = open(filename)
for line in f: ...

And then they will be in for a surprise. Would you consider changing
the default mode of lzma.open() to 'rt' and implementing the 'buffering'
parameter as it is implemented in open()? And further, can we discuss
whether "duck typing" is becoming generally problematic in an
expanding standard library and whether there should be some process,
language, testing or something to ensure the ducks really quack the
same?
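
To make the surprise concrete, here is a minimal sketch of the mismatch and of one way around it with the current signatures. The names open_text and demo are hypothetical helpers made up for illustration, and the UTF-8 encoding is an assumption. The only library behavior relied on is what is quoted above: lzma.open() defaults to mode='rb' and open() defaults to mode='r'.

```python
import lzma
import os
import tempfile


def open_text(filename, encoding="utf-8"):
    """Hypothetical helper: return a text-mode file object either way."""
    if filename.endswith(".xz"):
        # lzma.open() defaults to 'rb', so text mode must be requested.
        return lzma.open(filename, mode="rt", encoding=encoding)
    return open(filename, mode="r", encoding=encoding)


def demo():
    """Write the same two lines plain and compressed, then read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "sample.txt")
        packed = os.path.join(tmp, "sample.txt.xz")
        with open(plain, "w", encoding="utf-8") as f:
            f.write("first\nsecond\n")
        with lzma.open(packed, "wt", encoding="utf-8") as f:
            f.write("first\nsecond\n")

        # Default mode: iterating the .xz file yields bytes, not str.
        with lzma.open(packed) as f:
            default_line = next(iter(f))

        # Explicit text mode: both files yield str lines.
        with open_text(plain) as f:
            plain_lines = list(f)
        with open_text(packed) as f:
            packed_lines = list(f)
    return default_line, plain_lines, packed_lines
```

With the helper, the `for line in f` loop sees str lines for both kinds of file. Without it, the .xz branch hands the loop bytes objects.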

For example, there could be a standard testsuite which everything
purporting to implement an open() function should be subject to.
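
One check in such a suite might look like the sketch below. It is only an illustration of the idea, not an existing test in the standard library. The function name default_mode_yields_text is hypothetical, and it assumes the candidate accepts (path, mode, encoding=...) for writing the probe file.

```python
import lzma
import os
import tempfile


def default_mode_yields_text(opener):
    """Hypothetical conformance check for an open()-like callable.

    Returns True if opener(path), called with no mode argument,
    iterates over str lines the way the built-in open() does.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe")
        # Write the probe with the same opener so the file format matches.
        with opener(path, "wt", encoding="utf-8") as f:
            f.write("alpha\nbeta\n")
        with opener(path) as f:
            lines = list(f)
    return all(isinstance(line, str) for line in lines)


if __name__ == "__main__":
    for candidate in (open, lzma.open):
        print(candidate.__module__, default_mode_yields_text(candidate))
```

A real suite would need many more checks (buffering, newline handling, error cases). This one only covers the default-mode difference discussed here.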

On Mon, May 20, 2013 at 7:42 AM, Nadeem Vawda <report@bugs.python.org> wrote:
>
> Nadeem Vawda added the comment:
>
> No, that is the intended behavior for binary streams - they operate at
> the level of individual bytes. If you want to treat your input file as
> Unicode-encoded text, you should open it in text mode. This will return a
> TextIOWrapper which handles the decoding and line splitting properly.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue18003>
> _______________________________________

--
Michael