Suggested improvements for the Python Language Reference document (version 3.4.0)
=====================================================

The quality of this document is unusually high: the author(s) ought to be congratulated! However, there is still room for improvement. What follows are suggestions for some such improvements.

(I read the text from https://docs.python.org/3.4/reference ).

Section 3.1:

The second „CPython implementation detail”:
„(ex: always” —> „(e.g., always”

Section 3.3.3.1:

There are several references to „type()” as if it were a class. This is quite confusing for the reader, and the description of „type()” (to which one can „click through”) does not dispel the confusion. It seems to me that these should be references to the class „type”, not to the function „type()”:

    >>> type(object)
    <class 'type'>

Section 3.3.3.5:

„to remember the order that class members were defined” —> „to remember the order in which the class members were defined”
(The intention was probably to write „order that class members were defined in”, but grammarians tend to frown at such usage.)

Section 4.1:

„on the interpreter command line the first argument” —> „…. as the first argument”

N.B. This sentence appears to introduce the new notion of a „code block”, but there are no italics, and it is not clear what is the relation of „code block” to „block” (if any). If these are two different notions, then the paragraph should be split in two, and the second one should begin with a proper characterisation of „code block”. If there is only one notion, why use different terms? At least add in the first sentence: „

Moreover, in the following discussion of „scope” it turns out that a block can contain other blocks, but that a contained block is not a part of the containing block. The text would be less confusing if these facts were not just smuggled in.
I would suggest introducing the notion of scope nesting explicitly, and then discussing what is, and what is not, „inherited” from the enclosing scope.

In the paragraph that introduces the notion of „scope” we read „If a local variable is defined in a block”. But the notion of being „defined” within a block seems not to be explained anywhere. It would seem that „being defined” is the same as „being bound”, but this should be made explicit.

In „it does not extend to the code blocks of methods – this includes comprehensions and generator expressions since they are implemented using a function scope” it would perhaps be better to replace „this includes” with „this applies also to”.

„a UnboundLocalError exception” —> „an ….”

„If a name binding operation occurs anywhere within a code block, all uses of the name within the block are treated as references to the current block.”
Surely, not „as references to the current block”, but rather „as references to a variable local to the current block”. But this is not actually true, since we have „global” and „nonlocal”. Some revision seems to be in order.

In the next paragraph we have a discussion of „global”. Shouldn’t it be immediately followed by a discussion of „nonlocal”?

„If the nearest enclosing scope for a free variable contains a global statement, the free variable is treated as a global.”.
This is hard to understand, since the notion of „enclosing scope” (and, indeed, „scope” itself) has not been properly defined: we know only that „a scope defines the visibility of a name within a block”, and then see some particular examples. In the case of free variables these examples do not apply, since a free variable is, by definition, not defined. I would suggest that the general description of „scope” should be clarified, so that we know, e.g., that a scope is always a block or a collection of (nested?) blocks.
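
The Section 4.1 points about name binding, „global” and „nonlocal” can be checked with a small sketch. The function names below are hypothetical and chosen only for illustration; the sketch shows the observable behaviour rather than how the reference manual ought to word it.

```python
x = "global"

def reads_before_binding():
    # The assignment at the end makes x local to the whole function body,
    # so the earlier read fails instead of seeing the module-level x.
    try:
        return x
    except UnboundLocalError as exc:
        return type(exc).__name__
    x = "local"  # never executed, but it still marks x as local

def uses_global():
    # The global statement redirects the name to the module namespace.
    global x
    return x

def outer():
    y = "outer"
    def inner():
        # nonlocal rebinds the variable of the enclosing function scope.
        nonlocal y
        y = "inner"
    inner()
    return y

results = (reads_before_binding(), uses_global(), outer())
print(results)
```

The first function illustrates why the sentence „all uses of the name within the block are treated as references to the current block” needs the qualifications discussed above: a binding anywhere in the block affects uses that textually precede it, unless „global” or „nonlocal” says otherwise.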

Section 4.1.1:

„If a variable is referenced in an enclosing scope, it is illegal to delete the name.”
Surely, it should be „in a scope that encloses the current one”? After all, the innermost scope is also „enclosing” for the variable. It is also not clear what „illegal” means. Is an error raised? In fact, a simple test seems to indicate that such a deletion is simply ignored:

    >>> def a():
    ...     x = 1
    ...     def b():
    ...         del x
    ...     print(x)
    ...
    >>> a()
    1

Incidentally, it would appear that the notion of „variable” has also been smuggled in, without a proper introduction. In the beginning we read only about „names”. Are these two notions equivalent?

Section 6.1:

„operator implementation for built-in types works that way:” —> „… works as follows:”

„a string left argument” —> „a string as a left argument”

Section 6.2:

„Atom” strikes me as a singularly inappropriate name for a list comprehension, an arbitrary expression in parentheses etc. Something like „basic expression” would be much better. (In a grammar for arithmetic expressions this would simply be „factor”.) It would perhaps be best to rename „primaries” to something else („basic expressions” ?) and rename „atoms” to „primaries”. Alternatively, since there are so many kinds of „atoms”, there would be no harm in simply merging „atoms” with „primaries”.

Section 6.2.4:

„“leak” in the enclosing scope” —> „”leak” into the enclosing scope”

Section 6.2.7:

„This means that you can specify the same key multiple times in the key/datum list, and the final dictionary’s value for that key will be the last one given.”
I am guessing that this design choice was motivated by a desire for general consistency, given the pervasive nature of dictionaries (e.g., an assignment to a simple variable is really just an update to a dictionary). Nevertheless, the reader might be excused for being puzzled: can one even imagine an example where this would be useful?
There are certainly common scenarios where a simple mistake, even a typo, would lead to huge confusion and waste of time! Surely, it would not be difficult to detect such a repetition as an error?
It might therefore be a good idea to explain the rationale. If it turns out that it is not easy to formulate a convincing justification, then …

N.B. The last sentence of this section essentially repeats the point: is the repetition intentional? Perhaps it should end the previous paragraph, so that it would apply only to comprehensions, just as the one quoted above applies to key-datum lists?

Section 6.2.8:

„is called for generator object” —> „is called for the generator object”

„in the same fashion as normal generators” —> „… as for normal generators”

„See section Calls for the detail.” —> „See section Calls for details.”

Section 6.2.9:

„to generator’s caller” —> „to the generator’s caller”. Better: „to the caller of the generator”.

„as if the yield expression was just another” —> „… were just another”
(See, for example, http://www.getitwriteonline.com/archive/073001subjunct.htm ).

„after resuming” —> „after resumption”

„cannot control where should the execution continue” —> „… where the execution should continue”

„yield expressions are allowed” —> „Yield expressions are allowed”
(Capitalisation is used in the section heading, so apparently there is no special rule that prevents normal capitalisation of this term.)

Section 6.2.9.1:

„Note that calling any of the generator methods below when the generator is already executing raises a ValueError exception.”
The above may confuse the non-expert. Perhaps it should be made clear that a suspended generator is not „already executing”? An example of such a call might also be helpful (say, from a function that is called by the generator).

„the value of the expression_list is returned” —> „the value of the corresponding expression_list is returned”.
(The next suspension might be on a different yield expression.)

„where generator was paused” —> „where the generator was suspended”

„where the generator function was paused” —> „where the generator was suspended”

„If the generator function then” —> „If the generator then”
(Not all generators are functions.)

Section 6.2.9.2:

The section heading is preceded by a heading containing only a period. Is this intentional?

Section 6.3.1:

„This object is then asked to produce the attribute whose name is the identifier (which can be customised by overriding the __getattr__() method).”
The above suggests strongly that it is the identifier that can be customised. I would suggest splitting this into two sentences: the second would be something like „(The action of producing the attribute can be customized by overriding the __getattr__() method in (the class of) this object.)”. Alternatively, the phrase in parentheses could be simply deleted.

Section 6.3.2:

„that supports subscription, e.g. a list” —> „that supports subscription, e.g., a list”
It is a general rule that „e.g.” (as well as „i.e.”) should always be followed by a comma (unless it’s immediately followed by a colon), because its expansion would have to be followed by a comma. I won’t mention it again, but suggest a global search/replace.

„expression (list)” —> „expression list” (in the interest of consistency)

„a nonnegative integer less than” —> „a nonnegative integer smaller than”

„subclasses overriding this method will need to explicitly add that support”.
Why „subclasses”, and not „classes”? But the whole phrase is too colloquial, and — if read literally — somewhat illogical. I would suggest: „programmers who override this method must explicitly add such support”.

Section 6.3.3:

The grammar allows „a[:]”. We learn here that this is equivalent to „a[None:None:None]”, and might well wonder what this would mean. The description of slice objects in the section on „Standard type hierarchy” does not explain the semantics.
So it might be advisable to explain it here, or give a reference to the section in which it is explained.

Section 6.3.4:

„A call calls a callable object”.
This is perfectly all right, but it does look funny and somewhat tautological. How about „A call invokes …”?

„A trailing comma may be present after the positional and keyword arguments but does not affect the semantics.”
The sentence is ambiguous: can a comma be present after positional arguments if there are no keyword arguments? If so, why „and” rather than „or”? But then, can the extra comma be present _between_ positional and keyword arguments? A look at the grammar explains everything: the remark should read simply „A trailing comma after the argument list does not affect the semantics.”.

Unless I am missing something, there is no explanation of the meaning of a comprehension in lieu of an argument list. One assumes this is equivalent to the list of arguments obtained by „expanding” the comprehension, but this should be made clear.

„The code block for the function is executed, passing it the argument list.”
This is not very good English (who is doing the passing?). Perhaps something like „is executed, with access to the argument list” ?

Section 6.6:

„The numeric arguments are first converted to a common type.”
This is repeated several times. But what are the rules of such conversion? If they are described elsewhere, please give a reference. If not, please describe them. (The rules are not self-evident, since integers are not limited to 64 bits.)

„The % (modulo) operator yields the remainder from the division” —> „… floor division”

„must either both be numbers or both sequences” —> „… or both be sequences”

Section 6.9:

„Comparison of objects of the differing types” —> „Comparison of objects of differing types”

„whether a the dictionary” —> „whether the dictionary”

Section 6.10:

„Because not has to invent a value anyway, it does not bother to return a value of the same type as its argument”.
This is a non-sequitur: the fact that it does not bother is not implied by the fact that it has to invent. I would suggest simply: „The operator ‘not’ need not return a value of the same type as its argument” — the example that follows makes it all clear.
(BTW, „so e.g.,” —> „so, e.g.,”, because we would have to write „so, for example,”.)

Section 6.11:

„Conditional expressions (sometimes called a “ternary operator”) have the lowest priority of all Python operations.”
There are two semantic problems here (an expression is not an operator; operations do not have priority in this sense: operators do), as well as a syntactic one (plural-singular). I would suggest: „The ternary operator ‘… if … else …’ has the lowest priority of all Python operators.”

„The expression x if C else y first evaluates the condition, C (not x);”.
I have puzzled over this for a few minutes, because I took „C ( not x )” for the evaluated expression (notice the confusing use of italics). I would suggest making it more clear, as follows (also getting rid of the colloquial but imprecise statement that an expression evaluates something): „Evaluation of the expression x if C else y begins with an evaluation of the condition, C;”. This, and what follows, makes the confusing „(not x)” quite unnecessary.

Section 6.12:

„Lambda expressions (sometimes called lambda forms) have the same syntactic position as expressions.”
I think this remark is more confusing than explanatory. Syntactic „position”? Do you mean „status”? I would suggest deleting it, so the description would begin with: „A lambda expression is a shorthand for an anonymous function.” Please note that it is much more convenient not to begin a definition with a plural noun. Note also that „is a shorthand to create” is not idiomatic English.

It might be a good idea to explain the subtle reasons that underlie the introduction of „lambda_expr_noncond” and „expression_non_cond”.
The grammar looks a little weird, and it is not immediately obvious why it is constructed in this way: I did not find the discussion in PEP 308 sufficiently illuminating.

Section 6.14:

„Python evaluates expressions from left to right. Notice that while evaluating an assignment, the right-hand side is evaluated before the left-hand side.”
First we are told that something happens; then we are asked to „notice” that it does not. Wouldn’t it be better to write as follows?

„Python generally evaluates expressions from left to right; however, during the evaluation of an assignment the right-hand side is evaluated before the left-hand side.”

If this were the only exception to the rule, one would write:

„Python evaluates expressions from left to right, except during the evaluation of an assignment, when the right-hand side is evaluated before the left-hand side.”

However, this is not the only exception: there are also conditional expressions. So does this statement really explain anything? Besides, shouldn’t it be just about the order of evaluation of subexpressions in a single expression? Presumably not: after all, an assignment is not an expression! But we certainly cannot take it literally, for control flow is not linear: so what exactly is the scope of this rule?

The conclusion seems to be that the passage should be formulated along the following lines:

„Section 6.14. Order of evaluation. The subexpressions of an expression are evaluated from left to right, except in the case of conditional expressions and when evaluation is omitted (see Section 6.10).”

This is immediately followed by the example („In the following lines…”).
The example is followed by: „Notice that during the evaluation of an assignment the right-hand side is evaluated before the left-hand side.”

Section 6.15:

„Operators in the same box group left to right (except for comparisons, including tests, which all have the same precedence and chain from left to right …”
Is there some subtle difference between „grouping left to right” and „chaining from left to right”? Please explain.

BTW: „including tests” seems quite redundant, since they are classified as comparisons in the table.

But if membership tests behave as comparisons, and „chain from left to right”, then the implementation seems to be at odds with the documentation! Is the third example below an error?

    >>> 1 in [1,2]
    True
    >>> True in [True]
    True
    >>> 1 in [1,2] in [True]
    False
    >>> (1 in [1,2]) in [True]
    True

These are results from an implementation distributed with MacPorts:

    Python 3.4.0 (default, Mar 24 2014, 19:35:39)
    [GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.38)] on darwin

Section 7:

„Simple statements are comprised within a single logical line.”
Again, an ambiguity (+ syntactic infelicity) that is easily avoided by following a simple rule: don’t use a plural noun to represent the general case. Compare this with: „A simple statement is comprised within a single logical line.” This is not enough, because „comprised within” is hardly idiomatic English. We probably want: „A simple statement must fit within a single logical line.” (It does not „consist” of a single line, because a logical line may contain several simple statements, separated by semicolons.)

Section 7.2:

„(See section Primaries for the syntax definitions for the last three symbols.)”
Not any longer: apparently ‘”*” target’ was added later.
So: „(See section Primaries for the syntax definitions of attributeref, subscription and slicing.)”

„If the target list is a single target: The object is assigned to that target.”
It might be a good idea to add something like: „A single target cannot be prefixed by an asterisk.” (It would be too messy to modify the grammar to reflect this.)

„thus changing the length of the target sequence, if the object allows it”.
Shouldn’t it be „if the target allows it” ?

„in the current implementation, the syntax for targets is taken to be the same as for expressions”
Well, not quite, since there are no starred expressions.

„WARNING: … overlaps …”.
The remark is well-placed, but I don’t think it’s appropriate to talk about „safety”: this is just a straightforward consequence of the general rules. Perhaps something along the following lines would be better: „NOTE: For cases where the same variable occurs both on the left and the right side of an assignment, please recall the order of evaluation described in Sec. 6.14. In particular, a, b = b, a swaps two variables, while the following program prints [0, 2]:”

Section 7.2.1:

„for the syntax definitions for the last three symbols” —> „… of the last three symbols” ?

„which, unlike normal assignment statements, cannot be an unpacking”
There are three forms of target that are not allowed here: presumably, „an unpacking” refers to all of them, but the term has not been defined. Also, it should be „unlike in normal assignment statements”.

It might also be a good idea to point out a little more explicitly that the LHS is evaluated before the RHS, unlike in a normal assignment.

Section 7.5:

It should be mentioned somewhere that none of the elements of the target list can be a starred target.

Section 7.6:

„If an expression list is present, it is evaluated, else None is substituted.”
Substituted for what? Perhaps: „… else it is assumed to be None.” ?
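
The remarks on Sections 7.2 and 7.6 can be illustrated with a short sketch. The swap and the [0, 2] program are the ones mentioned in the proposed NOTE; the function name no_value is hypothetical and serves only to show what a bare „return” yields.

```python
# Section 7.2: the right-hand side is evaluated before any target is bound,
# so this swaps the two variables.
a, b = 1, 2
a, b = b, a

# Targets are then assigned from left to right: i is rebound to 1 first,
# so the second target is x[1], not x[0].
x = [0, 1]
i = 0
i, x[i] = 1, 2

# Section 7.6: a return statement without an expression list yields None.
def no_value():
    return

print(a, b, x, no_value())
```

Seen this way, the overlap case is indeed just a consequence of the ordinary evaluation order, which supports replacing the „WARNING” with a neutral note.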

Section 7.11:

    import_stmt ::= "import" module ["as" name] ( "," module ["as" name] )*
                    | "from" relative_module "import" identifier ["as" name]
                    ( "," identifier ["as" name] )*
                    …

Some additional indentation would be helpful:

    import_stmt ::= "import" module ["as" name] ( "," module ["as" name] )*
                    | "from" relative_module "import" identifier ["as" name]
                        ( "," identifier ["as" name] )*
                    …

or even

    import_stmt ::=   "import" module ["as" name] ( "," module ["as" name] )*
                    | "from" relative_module "import" identifier ["as" name]
                        ( "," identifier ["as" name] )*
                    …

This is a general remark that applies not only to this section.

The first step for the basic form should be terminated with a semicolon (just as in the list for the „from” form).

„just as though the clauses” —> „just as if the clauses”

„individiual” —> „individual”

„The details of the first step, finding and loading modules is described” —> „The details of the first step, finding and loading modules, are described”

„find the module specified in the from clause loading” —> „…. clause, loading”

The punctuation in the sublist is inconsistent: the third subpoint is terminated with a period. Suggest always using a semicolon, except for a period in the _last_ point.

In the examples, the verb „bound” seems to be used somewhat inconsistently with the rest of the manual. For example, instead of „foo.attr bound as attr” we should probably have „attr bound to foo.attr”.

„The wild card form of import — import * — is only allowed at the module level.”
But seems not to be allowed by the grammar! Is this an old feature that has been removed from the language?

„applications that determine which modules need to be loaded dynamically”
In this formulation „dynamically” qualifies „loaded”, rather than „determine”. Also, such use of „need”, while common enough in US usage, is quite suspect (how can a module need something?).
So I would suggest: „applications that must dynamically determine which modules should be loaded” or „applications in which the modules to be loaded can be determined only at runtime”

Section 7.11.1:

„… before the release in which the feature becomes standard”
The noun „release” is very often used as a near-synonym of „version”. Here it is used to describe an event, but „of what” is missing. In both interpretations the reader feels that something is missing. I would therefore suggest clarifying it a little: „… before the release of the version in which the feature becomes standard” or „… while the version in which the feature becomes standard is not yet released”

Section 7.12:

„Names listed in a global statement must not be used in the same code block textually preceding that global statement.”
This is very stilted. How about: „If a name is listed in a global statement within a block, then it cannot occur in the text of that block before the global statement.”
BTW, it would be nice to know what happens if the constraint is violated. The current implementation gives just a warning, which corresponds to a rather weak interpretation of the phrase „must not”. ;-)

„The current implementation does not enforce the latter two restrictions” —> „… the last two restrictions”.
But it is, in fact, not clear whether this refers to the previous two paragraphs, or to the last two items in the previous paragraph.

Section 7.13:

„The nonlocal statement causes the listed identifiers to refer to previously bound variables in the nearest enclosing scope.”
Again, we have some confusion between „identifier” and „variable”. They seem to be synonymous here, but why use both?

But here is a much more important point: shouldn’t it be „in the nearest of those enclosing scopes that binds this variable; however, it is an error if this nearest scope is global.” ? Consider the following two examples:

    >>> def a():
    ...     def b():
    ...         def c():
    ...             nonlocal x;
    ...             x = 'c'
    ...         c()
    ...     x = 'a'
    ...     b()
    ...     return x
    ...
    >>> a()
    'c'

    >>> x = 'global'
    >>> def a():
    ...     nonlocal x;
    ...     x = 'a'
    ...
      File "<stdin>", line 2
    SyntaxError: no binding for nonlocal 'x' found

„unlike to those” —> „unlike those”

Section 8:

„Compound statements consist of one or more ‘clauses.’”
How can several compound statements consist of one clause? Again, use the singular: „A compound statement consists of …”

„Only the latter form of suite” —> „Only the latter form of a suite”

„the semicolon binds tighter than” —> „the semicolon binds more tightly than”

„so that in the following example, either all or none of the print() calls are executed” —> „so that, in the following example, either all or none of the print() calls are executed”
But it would be better to write: „so that either all or none of the print() calls are executed in the following example”

Section 8.3:

„for each item provided by the iterator, in the order of ascending indices”.
Is the notion of „ascending indices” defined for all kinds of iterators? It would help if we had a link to a description of iterators.

„A continue statement executed in the first suite skips the rest of the suite and continues with the next item, or with the else clause if there was no next item.” —> „… there is no next item.”

„The suite may assign to the variable(s) in the target list; this does not affect the next item assigned to it.”
I have no idea what this means. Was the intention to write something like „… ; this does not affect the manner in which items are assigned to the target list at the beginning of the next iteration.” ?

„Names in the target list are not deleted when the loop is finished, but if the sequence is empty, it will not have been assigned to at all by the loop.”
The _sequence_ will not have been assigned to? Presumably it should be „they will not have been assigned to” (i.e., the names).
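
The two Section 8.3 sentences questioned above can be checked with a minimal sketch. It reflects my reading of the intended meaning, not wording taken from the manual; the names seen, n and target_after_empty_loop are hypothetical.

```python
# Rebinding the target inside the suite does not change which item
# the loop assigns at the start of the next iteration.
seen = []
for n in [1, 2, 3]:
    seen.append(n)
    n = 100  # overwritten by the next item, if there is one

def target_after_empty_loop():
    # With an empty sequence the loop never assigns to the target,
    # so the name stays unbound after the loop.
    for unbound in []:
        pass
    try:
        return unbound
    except NameError:  # UnboundLocalError is a subclass of NameError
        return "never assigned"

print(seen, n, target_after_empty_loop())
```

After the loop, n still holds the value assigned in the last pass through the suite, which confirms that the names in the target list are not deleted when the loop finishes.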

Section 8.4:

„are stored in the sys module and can be access via” —> „… accessed via”

„sys.exc_info() values are restored to their previous values (before the call) when returning from a function that handled an exception.”
There are three problems here. The first is linguistic: a value cannot be restored to its previous value. The second is also linguistic: it is not the values that are returning from a function. These two problems can be solved by rephrasing, e.g., „Upon return from a function that handled an exception, sys.exc_info() is restored to a state in which it will return its previous value (before the call).”

Which brings us to the third problem. The phrase „function that handled an exception” seems ambiguous: are we talking about a function that was invoked in the suite of an exception handler, or the function that contains the exception handler? Presumably the latter, since the suite of an exception handler need not contain function calls; but a little later we are told that „[t]he exception information is not available to the program during execution of the finally clause.”, so this is not at all clear. Moreover, the function that actually handles the exception may be different from the one that contains the try clause (if none of the exception handlers is appropriate). So does „(before the call)” refer to an invocation of the function that handled the exception, or an invocation of the one that contains the try statement that raised it?

It might perhaps be a good idea to describe the mechanism more directly and in a little more detail instead. I _guess_ this passage is an attempt to describe what happens when an exception handler invokes (directly or indirectly) a function that contains exception handlers of its own and happens to raise another exception: the sys.exc_info() mechanism maintains a stack that allows us to obtain access to information about the „current” exception.
But this is just a wild guess, the text is quite unclear: it attempts to squeeze too much information into too small a container.

Section 8.6:

    parameter_list ::= (defparameter ",")*
                       ( "*" [parameter] ("," defparameter)* ["," "**" parameter]
                       | "**" parameter
                       | defparameter [","] )

The „(” that begins the second line appears to have no corresponding „)”. At first I thought the „(” should be a „|”. A slight improvement in the formatting would make things much more clear.

    parameter_list ::= (defparameter ",")*
                       (   "*" [parameter] ("," defparameter)* ["," "**" parameter]
                         | "**" parameter
                         | defparameter [","]
                       )

Section 9.2:

Almost all the bulleted items in this manual are not terminated by any punctuation marks: here, they are terminated by semicolons (which is to be preferred, I think). But if so, the last one should be terminated by a period.