classification
Title: Python Tutorial 4.7.1: Need to explain default parameter lifetime
Type: enhancement Stage:
Components: Documentation Versions: Python 3.5
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, esegall, rhettinger
Priority: low Keywords:

Created on 2016-04-24 23:27 by esegall, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg264135 - Author: Edward Segall (esegall) Date: 2016-04-24 23:27
I am using the tutorial to learn Python. I know many other languages, and I've taught programming language theory, but even so I found the warning in Section 4.7.1 about Default Argument Values to be confusing. After I spent some effort investigating what actually happens, I realized that the warning is incomplete. 

I'll suggest a fix below, after explaining what concerns me. 

Here is the warning in question:

-----------------------------------------------------------------
Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. ...

def f(a, L=[]):
    L.append(a)
    return L

print(f(1))
print(f(2))
print(f(3))

This will print

[1]
[1, 2]
[1, 2, 3]
-----------------------------------------------------------------

It's clear from this example that values are carried from one function invocation to another. That's pretty unusual behavior for a "traditional" function, but it's certainly not unheard of -- in C/C++/Java, you can preserve state across invocations by declaring that a local variable has static lifetime. When using this capability, though, it's essential to understand exactly what's happening -- or at least well enough to anticipate its behavior under a range of conditions. I don't believe the warning and example are sufficient to convey such an understanding. 

After playing with it for a while, I've concluded the following: "regular" local variables have the usual behavior (called "automatic" lifetime in C/C++ jargon), as do the function's formal parameters, EXCEPT when a default value is defined. Each default value is stored in a location that has static lifetime, and THAT is the reason it matters that (per the warning) the expression defining the default value is evaluated only once. 
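
One way to check this conclusion, offered as a minimal sketch rather than an authoritative account: Python function objects expose their evaluated defaults through the `__defaults__` attribute, so the default list can be observed before and after a few calls. The function `f` is the tutorial's example; the names `before` and `after` are introduced here only for illustration.

```python
def f(a, L=[]):
    L.append(a)
    return L

# The default list is created once, when the def statement runs,
# and is kept on the function object itself.
before = list(f.__defaults__[0])   # copy of the default before any call

f(1)
f(2)

after = list(f.__defaults__[0])    # copy of the same default after two calls
print(before, after)               # [] [1, 2]
```

The default lives as long as the function object does, which matches the "static lifetime" reading above, although the storage belongs to the function object rather than to a separate static location.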

This is very unfamiliar behavior -- I don't think I have used another modern language with this feature. So I think it's important that the explanation be very clear. 

I would like to suggest revising the warning and example to something more like the following: 

-----------------------------------------------------------------
Important warning: When you define a function with a default argument value, the expression defining the default value is evaluated only once, but the resultant value persists as long as the function is defined. If this value is a mutable object such as a list, dictionary, or instance of most classes, it is possible to change that object after the function is defined, and if you do that, the new (mutated) value will subsequently be used as the default value.  

For example, the following function accepts two arguments:

def f(a, L=[]):
    L.append(a)
    return L

This function is defined with a default value for its second formal parameter, called L. The expression that defines the default value denotes an empty list. When the function is defined, this expression is evaluated once. The resultant list is saved as the default value for L. 

Each time the function is called, it appends the first argument to the second one by invoking the second argument's append method. 

If we call the function with two arguments, the default value is not used. Instead, the list that is passed in as the second argument is modified. However, if we call the function with one argument, the default value is modified. 

Consider the following sequence of calls. First, we define a list and pass it in each time as the second argument. This list accumulates the first arguments, as follows: 


myList=[]
print(f(0, myList))
print(f(1, myList))

This will print: 

[0]
[0, 1]

As you can see, myList is being used to accumulate the values passed in as the first argument.
 
If we then use the default value by passing in only one argument, as follows:

print(f(2))
print(f(3))

we will see: 

[2]
[2, 3]

Here, the two invocations appended values to the default list. 

Let's continue, this time going back to myList:

print(f(4,myList))

Now the result will be:

[0, 1, 4]

because myList still contains the earlier values.

The default value still has its earlier values too, as we can see here:

print(f(5))

[2, 3, 5]

To summarize, there are two distinct cases: 

1) When the function is invoked with an argument list that includes a value for L, that L (the one being passed in) is changed by the function. 

2) When the function is invoked with an argument list that does not include a value for L, the default value for L is changed, and that change persists through future invocations. 
-----------------------------------------------------------------

I hope this is useful. I realize it is much longer than the original. I had hoped to make it shorter, but when I did I found I was glossing over important details.
msg264163 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2016-04-25 08:24
> I hope this is useful. I realize it is much longer than the original.

There lies the difficulty.  The purpose of the tutorial is to quickly introduce the language, not to take someone deeply down a rabbit hole.  The docs tend to be worded in an affirmative manner showing cases of the language being used properly, giving a brief warning where necessary.  Further explorations should be left as FAQs.  

I think we could add a FAQ link or somesuch to the existing warning, but the main flow shouldn't be interrupted (many of the topics in the tutorial could warrant entire blog posts and scores of StackOverflow entries, but the tutorial itself aspires to be a "short and sweet" quick tour around the language).  

The utility and approachability of the tutorial would be undermined by overexpanding each new topic.  This is especially important in the early sections of the tutorial (i.e. section 4).  

Also, I don't really like the provided explanation, "there are two cases ...".  The actual execution model has one case (default arguments are evaluated once when the function is defined) and there are many ways to use it.
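
The single rule stated here (the default expression is evaluated once, when the function is defined) can be observed directly. The following is a minimal sketch; `make_default` and `calls` are hypothetical names introduced only to record when the default expression runs.

```python
calls = []

def make_default():
    # Hypothetical helper: records each evaluation of the default expression.
    calls.append("evaluated")
    return []

def f(a, L=make_default()):
    L.append(a)
    return L

count_after_def = len(calls)     # the expression ran while executing the def

f(1)
f(2)
count_after_calls = len(calls)   # calling f did not evaluate it again

print(count_after_def, count_after_calls)   # 1 1
```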

Lastly, this is only one facet of parameter specification and parameter passing.  Other facets include variable length argument lists, keyword-only arguments, annotations, etc.   Any expanded coverage should occur later in the tutorial and cover all the facets collectively.
msg264189 - Author: Edward Segall (esegall) Date: 2016-04-25 17:08
I agree with most of your points: A tutorial should be brief and should not go down rabbit holes. Expanded discussion of default parameter behavior would probably fit better with the other facets of parameter specification and parameter passing, perhaps as a FAQ. 

But I also believe a change to the current presentation is needed. Perhaps it would be best to introduce default arguments using simple numerical types, and refer to a separate explanation (perhaps as a FAQ) of the complexities associated with using mutable objects as defaults. 
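
To illustrate why a numeric default would sidestep the issue, here is a small sketch. The function `scale` and its parameter names are hypothetical, not taken from the tutorial. An integer cannot be modified in place, so the function can only rebind its local name, and the stored default stays unchanged.

```python
def scale(x, factor=2):
    # Rebinding the local name creates a new integer; the default stored
    # on the function object is untouched.
    factor = factor + 1
    return x * factor

first = scale(10)
second = scale(10)

print(first, second)          # 30 30
print(scale.__defaults__)     # (2,)
```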

> Also, I don't really like the provided explanation, "there are two cases ...".  The actual execution model has one case (default arguments are evaluated once when the function is defined) and there are many ways to use it.

The distinction between the two cases lies in storage of the result, not in argument evaluation. In the non-default case, the result is stored in a caller-provided object, while in the default case, the result is stored in a callee-provided object. This results in different behaviors (as the example makes clear); hence the two cases are not the same. 

This distinction is important to new users because it's necessary to think of them differently, and because (to me, at least) one of them is very non-intuitive. In both cases, the change made to the object is a side effect of the function. In the non-default case, this side effect is directly visible to the caller, but in the default case, it is only indirectly visible. 
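
The caller-provided versus callee-provided distinction can be checked with identity tests. This is a sketch under the assumption that `f` is the tutorial's function; `mine`, `caller_case`, and `callee_case` are names made up for the illustration.

```python
def f(a, L=[]):
    L.append(a)
    return L

# Non-default case: the function mutates the caller's own list.
mine = []
result = f(1, mine)
caller_case = result is mine     # same object the caller passed in

# Default case: the function mutates the list stored on the function object.
result2 = f(2)
callee_case = result2 is f.__defaults__[0]

print(caller_case, callee_case)  # True True
print(mine, result2)             # [1] [2]
```

In both cases the same code path runs; only the object bound to `L` differs, which is consistent with both the "one case" and the "two cases" descriptions in this thread.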

Details like this are probably obvious to people who are very familiar with both call by object reference and Python's persistent lifetime of default argument objects, but I don't think that group fits the target audience for a tutorial.
msg264224 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2016-04-26 07:37
Sorry, but I'm going to reject this one.  I tried out the text on a Python class that I'm currently teaching and the learners found the text to be clear enough (though some were jarred by the choice of *L* as the variable name) and they all got the essential points (the default value is evaluated once, and what they should do if they don't want the default value to be shared between subsequent calls).
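
For reference, the usual way to avoid sharing the default between calls is to use `None` as the default and create the list inside the function. The sketch below follows the pattern the tutorial recommends; it is shown as an illustration, not as a quotation of the tutorial text.

```python
def f(a, L=None):
    # None marks "no list supplied"; a fresh list is created on each such call.
    if L is None:
        L = []
    L.append(a)
    return L

print(f(1))   # [1]
print(f(2))   # [2]
```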
History
Date                 User        Action  Args
2022-04-11 14:58:30  admin       set     github: 71029
2016-04-26 07:37:22  rhettinger  set     status: open -> closed; resolution: wont fix; messages: + msg264224
2016-04-25 17:08:47  esegall     set     messages: + msg264189
2016-04-25 08:24:50  rhettinger  set     priority: normal -> low; nosy: + rhettinger; messages: + msg264163
2016-04-24 23:27:22  esegall     create