classification
Title: Clarify requirements for file-like objects
Type: enhancement
Stage:
Components: Documentation
Versions: Python 3.4, Python 3.5
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: benjamin.peterson, docs@python, eric.araujo, ezio.melotti, georg.brandl, hynek, ncoghlan, nikratio, pitrou, r.david.murray, rhettinger, stutzbach
Priority: normal
Keywords: patch

Created on 2014-06-14 22:16 by nikratio, last changed 2014-06-17 21:52 by ncoghlan.

Files
File name    Uploaded                    Description
iobase.diff  nikratio, 2014-06-14 22:16
Messages (15)
msg220588 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-14 22:16
It is currently not perfectly clear what Python (and the standard library) assumes about file-like objects (see e.g. http://article.gmane.org/gmane.comp.python.devel/148199).

The attached doc patch tries to improve the current situation by stating explicitly that the description of IOBase et al. specifies a *mandatory* interface for anything that claims to be file-like.
msg220647 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-15 15:29
I don't think that's true, though.  "file like" pretty much means "has the file attributes that I actually use".  That is, it is context dependent (duck typing).
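
To illustrate the duck-typing point, here is a minimal sketch (not part of the original message; the class name WriteOnly is made up for the example). json.dump() only calls write() on the object it is given, so an object that implements nothing else is "file like" enough in that context:

```python
import json


class WriteOnly:
    """Hypothetical minimal 'file-like' object: it implements write() only."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        # Collect whatever the caller writes; return the number of characters.
        self.chunks.append(text)
        return len(text)


sink = WriteOnly()
json.dump({"a": 1}, sink)          # works: json.dump only needs write()
result = "".join(sink.chunks)
has_read = hasattr(sink, "read")   # the object has no read() at all
```

The same object would fail as soon as it is handed to code that calls read(), close() or fileno(), which is the context dependence described above.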

I'm also not sure I see the point in the change.  It is inherent in the definition of what ABCs are.  I think the language should be audited for imperative/prescriptive voice, though:

  Flush and close this stream. If called again, do nothing. Once the file is closed, any operation on the file (e.g. reading or writing) should raise a ValueError.

My use of 'should' there might be controversial, though, since in the default implementation 'will' is correct.  If 'will' is kept, then perhaps some variation of your note would be appropriate.
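
For reference, a short sketch of the behaviour that the quoted docstring describes, using io.BytesIO as one concrete stream from the standard library (the example is illustrative and not part of the original message):

```python
import io

stream = io.BytesIO(b"data")
stream.close()
stream.close()  # second call is accepted and does nothing

error = None
try:
    stream.read()  # any operation on a closed stream...
except ValueError as exc:
    error = exc    # ...raises ValueError in the default implementation
```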
msg220663 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-15 18:50
On 06/15/2014 08:29 AM, R. David Murray wrote:
> I don't think that's true, though.  "file like" pretty much means "has the file attributes that I actually use".  That is, it is context dependent (duck typing).

Well, but when you pass your file-like object to some function from the
standard library, you don't know what file attributes will be used. So
to make sure that things work as expected, you have to make sure that
your file-like object behaves as prescribed by the IOBase* classes.

> I'm also not sure I see the point in the change.  It is inherent in the definition of what ABCs are. 

True. But not everyone reading the io documentation is familiar enough
with ABCs to immediately make that mental translation.
msg220680 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-06-15 21:49
Asking the question "Does it quack and walk *enough* like a duck for my
code to work and my tests to pass?" is part of the nature of ducktyping.

ABCs are definitely a useful guide to expectations, but even there it's
possible to lie to the interpreter and have "required" methods that raise
NotImplementedError, or do an explicit registration without implementing
the full interface.
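
A minimal sketch of both ways of "lying" mentioned above (illustrative only; the class names NotReallyAFile and Stub are hypothetical):

```python
import io


class NotReallyAFile:
    """Hypothetical class that implements none of the IOBase methods."""


# Explicit registration: isinstance() now succeeds despite the empty class.
io.IOBase.register(NotReallyAFile)
obj = NotReallyAFile()
claims_to_be_file = isinstance(obj, io.IOBase)
can_actually_read = hasattr(obj, "read")


class Stub(io.RawIOBase):
    """Hypothetical subclass whose 'required' method only raises."""

    def readinto(self, buffer):
        raise NotImplementedError("readinto is not supported")


stub = Stub()
try:
    stub.readinto(bytearray(4))
    stub_works = True
except NotImplementedError:
    stub_works = False
```

In both cases the isinstance() check passes, so the ABC relationship alone does not prove that the object is usable as a file.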
msg220685 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-15 22:39
Maybe I'm missing some important point here, but I think that the documentation ought to tell me how I have to design a file-like object such that it fulfills all expectations of the standard library.

Yes, you can get away with less than that in many situations, but that doesn't mean that the documentation should not tell me about the full set of expectations.
msg220687 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-06-16 01:26
[R David Murray]
>I don't think that's true, though.  "file like" pretty much means 
> "has the file attributes that I actually use".  
> That is, it is context dependent (duck typing).

That is pretty much on-target.  

Also, the phrase "file-like" has been used very loosely from Python's inception.  Laying down a "mandatory" specification doesn't match the reality of how the phrase is used or the way our code has been written.

> Maybe I'm missing some important point here

Yes, I think you are.  That is evident in this and your other tracker items whose theme is "there must be precisely documented rules for everything, all expectations, norms, cultural conventions, patterns must be written down, made precise, and enforced, etc".

Before creating more tracker items, please take time to learn about Python's history, how it is used, and its cultural norms.  In particular, read the Zen of Python, consider what is meant by duck-typing, what is meant by "a consenting adults language", what is meant by over-specification, etc.  Python is quite different from Java in this regard.

In a quest for "tell me exactly what I have to do", I think you're starting to make up new rules that don't reflect the underlying reality of the actual code or its intended requirements.

I recommend this tracker item be closed for the reasons listed by Nick Coghlan and David Murray.  I think the proposed patch doesn't make the docs better, and that it seeks to create new made-up rules rather than documenting the world as it actually exists.

Side-note:  The place to talk about what "file-like" means is the glossary.   The ABC for files and the term "file-like" are related but are not equal.
msg220777 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-16 22:18
On 06/15/2014 06:26 PM, Raymond Hettinger wrote:
> Before creating more tracker items, please take time to learn about Python's history,
[...]

It'd be nice if you would have at least followed the link to
http://article.gmane.org/gmane.comp.python.devel/148199 before
commenting. In case you still can't spare the time for that: I
explicitly asked on python-dev about this *before* I opened a tracker item.

Apart from that, I think your remarks are neither appropriate nor do
they belong in the tracker.
msg220830 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-17 14:07
Nikolaus: while I agree that Raymond's comments were a bit strongly worded, it doesn't read to me as if the thread you link to is on point for this issue.  The thread was focused on a *specific* question, that of calling close twice.  The question of what the docs mean by "a file like object" is a different question.  Specifically, you will note that "duck typing" never came up in that thread (as far as I remember).

As Raymond indicated, a glossary entry would be appropriate, and would reference the ABCs.

This entry already exists; although it is labeled "file object", it mentions that they are also called "file like objects".  It would be appropriate to link mentions of the phrase "file like object" in the docs to this glossary term, if they aren't already.
msg220875 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-17 20:13
"R. David Murray" <report@bugs.python.org> writes:
> R. David Murray added the comment:
>
> Nikolaus: while I agree that Raymond's comments were a bit strongly
> worded, it doesn't read to me as if the thread you link to is on point
> for this issue.  The thread was focused on a *specific* question, that
> of calling close twice.  The question of what the docs mean by "a file
> like object" is a different question.  Specifically, you will note
> that "duck typing" never came up in that thread (as far as I
> remember).

That's quite correct. But it did not come up in the original issue report
either - that's why I still don't understand why we're discussing it
here at all.

My line of thought was as follows:

 - Standard library assumes that close() is idempotent in many places
 - Currently this isn't documented clearly
 - The best place to make it more explicit seemed to be the description
   of IOBase
 - However, changing the description of only IOBase.close() could easily
   give the impression that close() is somehow special, and that
   fewer/no assumptions are made in the standard library about the
   presence/behavior of the other methods (e.g. fileno or readable).
 - Therefore, to me the best course of action seemed to add a paragraph
   explicitly describing that the standard library may assume that 
   *any* method/attribute of a stream object behaves as described for
   IOBase.
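
The first point in the list can be sketched as follows (an illustration added for clarity, not taken from the patch; CountingStream is a hypothetical helper class). Leaving a with block calls close() even if the stream was already closed explicitly, so close() has to tolerate repeated calls:

```python
import io


class CountingStream(io.BytesIO):
    """Hypothetical subclass that records how often close() is called."""

    def __init__(self, *args):
        self.close_calls = 0
        super().__init__(*args)

    def close(self):
        self.close_calls += 1
        super().close()  # BytesIO.close() is safe to call repeatedly


stream = CountingStream(b"payload")
with stream:
    data = stream.read()
    stream.close()  # explicit close inside the block
# Leaving the block invoked close() a second time via __exit__().
calls_after_with = stream.close_calls
```

A close() implementation that raised on the second call would break this pattern, which is one place where the idempotency assumption shows up.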

I still don't see how this contradicts / interacts with the concept of
duck typing. I think the documentation (with and without the patch)
clearly implies that you can implement an object that does not have the
full IOBase API and successfully hand it to some standard library
function - it's just that in that case you can't complain if it
breaks.

Or is the point of contention just the title of this issue? Maybe it was
poorly chosen, I guess "Clarify standard library's expectations from
stream objects" would have been better.

Best,
-Nikolaus

msg220878 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-17 20:28
Well, but we think it's pretty clear.  The glossary entry says file object interfaces are defined by the classes in the io module.  As do the io module docs.  Perhaps the first sentence of the io docs could be modified to strengthen that (but it already does link to the glossary entry which is explicit that the io module defines the interfaces).

On the other hand, your statement that "few or no other assumptions are made" is, while not accurate, not *wrong*, in the following sense:  *if* a standard library module depends on file attributes and methods, then it expects them to conform to the io module interfaces.  If it does not depend on a particular part of the API, then a duck type class doesn't actually have to implement that API in order to work with that standard library module.  Of course, we might change what elements of the file API are depended on, but hopefully we only do that kind of change in a feature release.  I'm sure we have made such changes while fixing bugs, though, so if you want to be *sure* of forward compatibility, you should indeed implement the full io-specified interface in your file-like class.  Which is one reason the ABCs provide default implementations for most of the methods.  It makes it really easy to build a future-proofed duck type.
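
A minimal sketch of the last point, assuming a hypothetical read-only stream called ZeroStream: by subclassing io.RawIOBase and implementing only readable() and readinto(), the class inherits read(), readall(), close(), the closed attribute and context-manager support from the ABCs.

```python
import io


class ZeroStream(io.RawIOBase):
    """Hypothetical raw stream that yields a fixed number of zero bytes."""

    def __init__(self, size):
        super().__init__()
        self._remaining = size

    def readable(self):
        return True

    def readinto(self, buffer):
        # Fill as much of the caller's buffer as we still have data for.
        n = min(len(buffer), self._remaining)
        buffer[:n] = bytes(n)
        self._remaining -= n
        return n  # returning 0 signals end of stream


with ZeroStream(5) as stream:
    chunk = stream.read(3)  # read() is inherited and built on readinto()
    rest = stream.read()    # read() without a size falls back to readall()

closed_after_with = stream.closed  # close() and __exit__() are inherited too
```

If a later release of some library starts calling, say, readable() or closed on the object, a class built this way already answers according to the io interfaces.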
msg220879 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-17 20:32
Re-reading my last paragraph, I see that I'm pretty much agreeing with you.  So the contention is more that we don't think your suggested patch is necessary.  Especially since, unlike your patch wording says, in fact most of the methods *are* implemented in the classes.
msg220880 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-06-17 20:34
Rather than a note, there could be a separate section for guidelines when implementing one's own IO classes. Assuming there's anything significant to put in such a section, that is :-)
msg220883 - (view) Author: Nikolaus Rath (nikratio) * Date: 2014-06-17 20:42
On 06/17/2014 01:28 PM, R. David Murray wrote:
> Well, but we think it's pretty clear.

This wasn't the impression that I had from the thread on python-dev,
but I'll accept your judgement on that. I'll be more restrained when
being asked for suggestions in the future.
msg220888 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-06-17 20:59
I believe Antoine was suggesting that you suggest wording that would make it clear (rather than implied) that close was idempotent, but "This method has no effect if the file is already closed" seems pretty unambiguous to me, so I don't really see anything to do there (and presumably neither did you :)  Which means your patch here wasn't really what Antoine was suggesting.

But please don't hesitate to offer improvements in any context, solicited or not.  You just have to be prepared for pushback, because open source :)  And when shifting from mailing list to bug tracker, you may invoke a different audience with different perceptions of the problem.

Now, two alternate suggestions have been made here in reaction to your suggestion: strengthening the "this is also a specification" sense of the first sentence in the IO docs, and writing a separate section on implementing your own IO classes.  You could take a crack at either of those if you like.  Neither of these would have been suggested if you hadn't posted your thoughts on what to do and engaged in this discussion, so it is a positive contribution even if your patch is not accepted.
msg220901 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2014-06-17 21:52
I can add a third suggestion: a "HOWTO" guide on implementing and using
file-like objects. It's actually a somewhat complex topic with various
trade-offs involved, particularly in Python 3 where the differences between
binary and text IO are greater. It could also point users to existing
file-like objects that they may have missed (like the spooled temporary
file support in tempfile).
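
As a small sketch of what such a guide could show (illustrative only, and not a proposal for its actual content): tempfile.SpooledTemporaryFile keeps data in memory until max_size is exceeded, and the binary and text in-memory streams do not accept each other's data types.

```python
import io
import tempfile

# Spooled temporary file: held in memory until max_size bytes are exceeded.
with tempfile.SpooledTemporaryFile(max_size=1024) as spool:
    spool.write(b"hello")
    spool.seek(0)
    content = spool.read()

# Text and binary streams are distinct in Python 3.
text_buffer = io.StringIO()
text_buffer.write("text")

mixing_error = None
try:
    io.BytesIO().write("text")  # str passed to a binary stream
except TypeError as exc:
    mixing_error = exc
```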

On the more specific matter at hand, I think "close() is idempotent" is not
only one of our most pervasive assumptions about file-like objects, but
also an assumption we tend to make about resources *in general*.
History
Date                 User            Action  Args
2014-06-17 21:52:47  ncoghlan        set     messages: + msg220901
2014-06-17 20:59:05  r.david.murray  set     messages: + msg220888
2014-06-17 20:42:34  nikratio        set     messages: + msg220883
2014-06-17 20:34:28  pitrou          set     messages: + msg220880
2014-06-17 20:32:42  r.david.murray  set     messages: + msg220879
2014-06-17 20:28:20  r.david.murray  set     messages: + msg220878
2014-06-17 20:13:35  nikratio        set     messages: + msg220875
2014-06-17 14:07:14  r.david.murray  set     messages: + msg220830
2014-06-16 22:18:35  nikratio        set     messages: + msg220777
2014-06-16 01:26:33  rhettinger      set     nosy: + rhettinger
                                             messages: + msg220687
2014-06-15 22:39:22  nikratio        set     messages: + msg220685
2014-06-15 21:49:11  ncoghlan        set     messages: + msg220680
2014-06-15 18:50:41  nikratio        set     messages: + msg220663
2014-06-15 15:29:34  r.david.murray  set     nosy: + r.david.murray
                                             messages: + msg220647
2014-06-14 22:16:50  nikratio        create