Message 189676 - Python tracker

Author	Michael.Fox
Recipients	Arfrever, Michael.Fox, nadeem.vawda, pitrou, rhettinger, serhiy.storchaka, vstinner
Date	2013-05-20.16:50:52
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<CABbL6oadh1YxXradN47kKrCudfH9yCTtL4YLpTzLXKQuG3kkSQ@mail.gmail.com>
In-reply-to	<CABbL6oZRuVE7WznVmKyjbSMNbFcjpr660PJpgZWGij_c7VHZtA@mail.gmail.com>

Content

I thought of an even more hazardous case:

if compression == 'gz':
    import gzip
    open = gzip.open
elif compression == 'xz':
    import lzma
    open = lzma.open
else:
    pass
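
To illustrate why this rebinding is hazardous, here is a minimal sketch. It assumes the goal is line-oriented text reading; the helper names pick_opener, open_text and first_line_types are hypothetical and not part of the standard library. It records the type of the first line when each opener is called with its default mode, then again when 'rt' is passed explicitly.

```python
import gzip
import lzma
import os
import tempfile

# Hypothetical mapping from file extension to an open()-like callable.
_OPENERS = {".gz": gzip.open, ".xz": lzma.open}


def pick_opener(path):
    """Return gzip.open, lzma.open or the builtin open based on the extension."""
    return _OPENERS.get(os.path.splitext(path)[1], open)


def open_text(path, mode="rt", encoding="utf-8"):
    """Open path with an explicit text mode ('rt' or 'wt').

    Passing the mode explicitly sidesteps the differing defaults:
    the builtin open() defaults to 'r' (text), while gzip.open() and
    lzma.open() default to 'rb' (binary).
    """
    return pick_opener(path)(path, mode=mode, encoding=encoding)


def first_line_types():
    """Map each file kind to (type with default mode, type with explicit 'rt')."""
    types = {}
    with tempfile.TemporaryDirectory() as tmp:
        for suffix in ("", ".gz", ".xz"):
            path = os.path.join(tmp, "sample.txt" + suffix)
            with open_text(path, "wt") as f:
                f.write("first\nsecond\n")
            # Default mode: this is what "open = lzma.open" silently gives you.
            with pick_opener(path)(path) as f:
                default_type = type(next(iter(f))).__name__
            # Explicit text mode: the same result for every opener.
            with open_text(path) as f:
                explicit_type = type(next(iter(f))).__name__
            types[suffix or "plain"] = (default_type, explicit_type)
    return types


if __name__ == "__main__":
    for kind, (default_type, explicit_type) in first_line_types().items():
        print(kind, default_type, explicit_type)
```

With the default mode, only the plain file yields str lines and the two compressed files yield bytes; with mode='rt' passed explicitly all three yield str. A check of this kind is also roughly what a shared test for open()-like functions could assert.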

On Mon, May 20, 2013 at 9:41 AM, Michael Fox <report@bugs.python.org> wrote:
>
> Michael Fox added the comment:
>
> You're right. In fact, what doesn't make sense is to be doing
> line-oriented reads on a binary file. Why was I doing that?
>
> I do have another quibble though. The open() function is like this:
>
> open(file, mode='r', buffering=-1, encoding=None,
>          errors=None, newline=None, closefd=True, opener=None) -> file object
>
> The lzma.open() function is like this:
>
> lzma.open = open(filename, mode='rb', *, format=None, check=-1,
> preset=None, filters=None, encoding=None, errors=None, newline=None)
>
> It seems to me that it would be best for them to be as congruent as
> possible. Because people will try to do this (I did):
>
> if filename.endswith('.xz'):
>     f = lzma.open(filename)
> else:
>     f = open(filename)
> for line in f: ...
>
> And then they will be in for a surprise. Would you consider changing
> the default mode of lzma.open() to 'rt' and implement the 'buffering'
> parameter as it is implemented in open()? And further, can we discuss
> whether "duck typing" is becoming generally problematic in an
> expanding standard library and whether there should be some process,
> language, testing or something to ensure the ducks really quack the
> same?
>
> For example, there could be a standard testsuite which everything
> purporting to implement an open() function should be subject to.
>
> On Mon, May 20, 2013 at 7:42 AM, Nadeem Vawda <report@bugs.python.org> wrote:
>>
>> Nadeem Vawda added the comment:
>>
>> No, that is the intended behavior for binary streams - they operate at
>> the level of individual bytes. If you want to treat your input file as
>> Unicode-encoded text, you should open it in text mode. This will return a
>> TextIOWrapper which handles the decoding and line splitting properly.
>>
>> ----------
>>
>> _______________________________________
>> Python tracker <report@bugs.python.org>
>> <http://bugs.python.org/issue18003>
>> _______________________________________
>
> --
>
> -
> Michael

-- 

-
Michael

History
Date	User	Action	Args
2013-05-20 16:50:52	Michael.Fox	set	recipients: + Michael.Fox, rhettinger, pitrou, vstinner, nadeem.vawda, Arfrever, serhiy.storchaka
2013-05-20 16:50:52	Michael.Fox	link	issue18003 messages
2013-05-20 16:50:52	Michael.Fox	create