Author Michael.Fox
Recipients Arfrever, Michael.Fox, nadeem.vawda, pitrou, rhettinger, serhiy.storchaka, vstinner
Date 2013-05-20.16:50:52
Message-id <CABbL6oadh1YxXradN47kKrCudfH9yCTtL4YLpTzLXKQuG3kkSQ@mail.gmail.com>
In-reply-to <CABbL6oZRuVE7WznVmKyjbSMNbFcjpr660PJpgZWGij_c7VHZtA@mail.gmail.com>
Content
I thought of an even more hazardous case:

if compression == 'gz':
    import gzip
    open = gzip.open
elif compression == 'xz':
    import lzma
    open = lzma.open
else:
    pass
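
To make the hazard concrete, here is a minimal sketch, assuming Python 3.3 or later, where gzip.open() and lzma.open() accept text modes. The names open_maybe_compressed and first_line_types are hypothetical and exist only for this illustration. With the default mode, the compressed openers yield bytes lines while the builtin open() yields str lines; passing the mode explicitly makes all three behave alike.

```python
import gzip
import lzma
import os
import tempfile


def open_maybe_compressed(filename, compression=None, mode="rt", **kwargs):
    """Hypothetical helper: choose an opener, but always pass the mode explicitly."""
    if compression == "gz":
        opener = gzip.open
    elif compression == "xz":
        opener = lzma.open
    else:
        opener = open
    return opener(filename, mode, **kwargs)


def first_line_types(text="hello\nworld\n"):
    """Return {compression: (type with default mode, type with explicit 'rt')}."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for compression, opener in (("gz", gzip.open), ("xz", lzma.open), (None, open)):
            path = os.path.join(tmp, "sample." + (compression or "txt"))
            # Write the same text through each opener.
            with opener(path, "wt", encoding="utf-8") as f:
                f.write(text)
            # Default mode: 'rb' for gzip/lzma, 'r' (text) for the builtin.
            with opener(path) as f:
                default_type = type(next(iter(f)))
            # Explicit text mode through the helper.
            with open_maybe_compressed(path, compression, encoding="utf-8") as f:
                explicit_type = type(next(iter(f)))
            results[compression] = (default_type, explicit_type)
    return results


if __name__ == "__main__":
    for name, types in first_line_types().items():
        print(name, types)
```

Passing the mode at every call site is only a workaround on the caller's side; it does not change the defaults being discussed here.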

On Mon, May 20, 2013 at 9:41 AM, Michael Fox <report@bugs.python.org> wrote:
>
> Michael Fox added the comment:
>
> You're right. In fact, what doesn't make sense is to be doing
> line-oriented reads on a binary file. Why was I doing that?
>
> I do have another quibble though. The open() function is like this:
>
> open(file, mode='r', buffering=-1, encoding=None,
>          errors=None, newline=None, closefd=True, opener=None) -> file object
>
> The lzma.open() function is like this:
>
> lzma.open = open(filename, mode='rb', *, format=None, check=-1,
> preset=None, filters=None, encoding=None, errors=None, newline=None)
>
> It seems to me that it would be best for them to be as congruent as
> possible. Because people will try to do this (I did):
>
> if filename.endswith('.xz'):
>     f = lzma.open(filename)
> else:
>     f = open(filename)
> for line in f: ...
>
> And then they will be in for a surprise. Would you consider changing
> the default mode of lzma.open() to 'rt' and implementing the 'buffering'
> parameter as it is implemented in open()? And further, can we discuss
> whether "duck typing" is becoming generally problematic in an
> expanding standard library and whether there should be some process,
> language, testing or something to ensure the ducks really quack the
> same?
>
> For example, there could be a standard testsuite which everything
> purporting to implement an open() function should be subject to.
>
> On Mon, May 20, 2013 at 7:42 AM, Nadeem Vawda <report@bugs.python.org> wrote:
>>
>> Nadeem Vawda added the comment:
>>
>> No, that is the intended behavior for binary streams - they operate at
>> the level of individual bytes. If you want to treat your input file as
>> Unicode-encoded text, you should open it in text mode. This will return a
>> TextIOWrapper which handles the decoding and line splitting properly.
>>
>> ----------
>>
>> _______________________________________
>> Python tracker <report@bugs.python.org>
>> <http://bugs.python.org/issue18003>
>> _______________________________________
>
> --
>
> -
> Michael
>

-- 

-
Michael
History
Date                 User         Action  Args
2013-05-20 16:50:52  Michael.Fox  set     recipients: + Michael.Fox, rhettinger, pitrou, vstinner, nadeem.vawda, Arfrever, serhiy.storchaka
2013-05-20 16:50:52  Michael.Fox  link    issue18003 messages
2013-05-20 16:50:52  Michael.Fox  create