Python 2.5a0 Tutorial errors and observations
Michael R Bax, bax3 at bigfoot.com
March 2005

This is an updated list in response to Python 2.5a0.

I recently read through the Python Tutorial for 2.5a0 thoroughly and in detail from the perspective of a complete Python novice.  (I was one recently!)  I took careful notes and made a number of observations which I hope can contribute to the documentation project, especially with reference to the novice reader.

Please find attached my list of errors and observations.  Each entry shows a section (where entering a new one) and a brief extract, then makes a correction (technical or grammatical) or observation.  I hope it is of use.

There may be a handful of corrections previously listed that have already been made in the 2.5 documentation, but I think I removed all of those.

Do let me know if there's a question about any grammar or style points, or anything else that might be of assistance.

Cheers
Michael

-------8<--------8<--------8<--------8<--------8<--------8<--------8<-------

Front Matter
even every commonly used feature
  -- commonly-used

1.
Python allows you to split up your program in modules
  -- split your program into modules
Python allows writing
  -- Python enables the writing of
libraries that may only be available in binary form
  -- what about open-source libraries that are written only for C/C++?
the best way to learn a language is using it,
  -- is to use it

2.1
called with standard input connected to a tty device
  -- with no arguments and with standard input (tty input with a filename argument is a different case)

2.2.1
normal output from the executed commands
  -- normal output from executed commands
Typing an interrupt while a command is executing raises the KeyboardInterrupt exception
  -- it raises KeyboardInterrupt in *both* cases
  -- you can't "type" an interrupt

3
  -- Any reason SPAM and STRING are uppercase?  Other variables below like width and height are not.

3.1.2
>>> word[:1] = 'Splat'
  -- This is trying to assign 5 letters to one?  Would word[:2] = 'Hi' make more sense as an attempted assignment?

3.1.3
When a Unicode string is printed, written to a file, or converted with str(), conversion takes place using this default encoding.
  -- This is not true; printing does not appear to convert:
     >>> print u"äöü"
     äöü
     >>> str( u"äöü")
     Traceback (most recent call last):
       File "", line 1, in ?
     UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
  -- What is str() for?

3.2
It differs from just writing the expression you want to write
  -- What expression do "I want" to write?  Unclear.
  -- Replace with "you enter"?

4.3
exactly the legal indices for items of a sequence of length 10
  -- delete "exactly"?  What else could it be -- "approximately"? :-)

4.6
The method append() shown in the example, is defined for list
  -- delete the comma (grammar)

4.2/4.6
  -- Why does 4.2 not explain the method a.insert, whereas the result.append is fully explained in 4.6?

4.7
  -- Why are some code segments shown with >>> prompts, and some without?

4.7.2
print "if you put", voltage, "Volts through it."
  -- "Volts" should not start with an uppercase letter as it is not a proper noun

4.7.5
a few features commonly found in functional programming languages and Lisp
  -- isn't Lisp a functional programming language?
  -- replace "and" with "like"

4.7.6
There are emerging conventions about the content
  -- conventions for the content
This is done using the following convention. The first
  -- convention: the first
its indentation is not apparent in the string literal
  -- apparent within the string literal

5.1
If no index is specified, a.pop() returns the last item in the list. The item is also removed from the list.
  -- a.pop() removes and returns the last item in the list.
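
For reference, a minimal sketch of the behaviour behind the 5.1 rewording. It uses only the built-in list type and is written to run on a current Python interpreter rather than 2.5a0; the list contents are made up for illustration.

```python
a = [66.25, 333, 1234.5]

# With no index, pop() removes the last item and hands it back.
last = a.pop()

# With an index, pop() removes and returns the item at that position.
first = a.pop(0)

print(last)
print(first)
print(a)
```

Both calls shorten the list as a side effect, which is why "removes and returns" describes the method more directly than two separate sentences.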

##5.1.3
##Combining these two special cases, we see that "map(None, list1, list2)" is a convenient way of turning a pair of lists into a list of pairs
#  -- Shouldn't one rather use zip()?

5.1.3
"filter(function, sequence)" returns a sequence (of the same type, if possible)
  -- How could this ever be impossible?
For example, to compute some primes
  -- to list some primes
  -- this is *not* an algorithm for prime computation! :-)

5.1.4
List comprehensions are much more flexible than map() and can be applied to functions with more than one argument and to nested functions:
  -- But map() can be applied to functions with multiple arguments (with multiple sequences)!
  -- suggest replacement "functions with non-sequence arguments"

5.2
There is a way to remove an item from a list given its index instead of its value: the del statement.
  -- How is this different to pop?

5.3
Sequence unpacking requires that the list of variables on the left have the same number of elements as the length of the sequence
  -- requires that the list of variables on the left has (grammar)
  -- requires the list of variables on the left to have (alternate)

5.4
>>> fruits = set(basket)
  -- the correct term for a collection of apples, oranges, etc, is "fruit" (a singular collective noun)

5.5
since lists can be modified in place using their append() and extend() methods
  -- and insert(), remove(), pop(), sort(), reverse(), del()...
  -- in any case, index or slice notation seems more intuitively associated with modification

5.6
  -- zip() needs more explanation
  -- the print '%s' % x syntax is not explained
  -- xrange() is not explained

5.7
Comparisons may be combined by the Boolean operators and and or
  -- combined using the (style)
In general, the return value of a short-circuit operator, when used as a general value and not as a Boolean, is the last evaluated argument.
  -- rewrite: When used as a general value and not as a Boolean, the return value of a short-circuit operator is the last evaluated argument.
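
A small sketch of the 5.7 point that a short-circuit operator returns its last evaluated argument. The values are invented for illustration, and the snippet is written for a current Python interpreter rather than 2.5a0.

```python
# 'or' stops at the first true operand and returns that operand itself.
name = "" or "Trondheim" or "Hammer Dance"

# 'and' stops at the first false operand and returns it unchanged.
count = 0 and 5

# When nothing short-circuits, the final operand is the result.
last = 1 and "yes"

print(name)
print(count)
print(last)
```

In each case the result is one of the operands, not a Boolean, which is the behaviour the reworded sentence is trying to describe.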

5.8
Some examples of comparisons between sequences with the same types:
  -- Shouldn't this be "of the same type"?
  -- 1 and 1.0 are not of the same type!

6.1
  -- Footnote 6.1 really applies to the *first* sentence, not the third.

6.1.1
Actually, modules are searched in the list of directories given by the variable sys.path which is initialized from the directory containing the input script (or the current directory), PYTHONPATH and the installation-dependent default.
  -- Rewrite: In general, modules are actually searched in the list of directories given by the variable sys.path (initialized to the directory containing the input script or the current directory), before looking in PYTHONPATH and the installation-dependent default.

6.2
either for efficiency or to provide access to operating system primitives such as system calls
  -- A module can call bundled binary code -- why does access to OS primitives require being built in?  It seems efficiency is the *only* criterion here (excluding neatness).
The variable sys.path is a list of strings that determine the interpreter's search path for modules.
  -- that determines (it is the list, singular, that is the subject)
It is initialized to a default path taken from the environment variable PYTHONPATH, or from a built-in default if PYTHONPATH is not set.
  -- Incorrect.  PYTHONPATH only forms *part* of the path (on Windows, at minimum); see 6.1.1.

6.3
['__name__', 'fib', 'fib2']
  -- The real dir(fibo) output is:
     ['__builtins__', '__doc__', '__file__', '__name__', 'fib', 'fib2']
['__builtins__', '__doc__', '__file__', '__name__', 'fib', 'fib2']
  -- The real dir() output is:
     ['__builtins__', '__doc__', '__name__', 'a', 'fib', 'fibo', 'sys']
  -- Why import __builtin__ explicitly when one can just dir(__builtins__)?
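
The 6.3 observation can be checked without the tutorial's fibo.py. The sketch below builds a stand-in module with types.ModuleType; the module object and its two attributes are placeholders, not the tutorial's code. The exact double-underscore names returned by dir() vary between Python versions, so only the filtered result is printed.

```python
import types

# Hypothetical stand-in for the tutorial's fibo module.
fibo = types.ModuleType("fibo")
fibo.fib = lambda n: None
fibo.fib2 = lambda n: []

names = dir(fibo)

# dir() returns a sorted list that includes double-underscore names
# in addition to the attributes defined above.
has_dunder_name = "__name__" in names
public = [n for n in names if not n.startswith("__")]

print(has_dunder_name)
print(public)
```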

6.4
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as "string", from unintentionally hiding valid modules that occur later on the module search path.
  -- Why would Python confuse a directory name with a module name?  If not, how does this confusion take place?  (Do you mean a non-package directory "string" could be seen as a package, obscuring a module "string"?)

7.1
# reverse quotes are convenient in interactive sessions:
  -- If using `` is discouraged, why mention it?
print repr(x).rjust(2), repr(x*x).rjust(3),
  -- Why print repr() instead of str()?
"x.ljust( n)[:n]".)
  -- extra/missing space next to first n
>>> for name, phone in table.items():
  -- Why not use table.iteritems()?

7.2
but it'll corrupt binary data like that in JPEGs or .EXE files.
  -- JPEG and EXE are rendered to different font sizes in HTML

8.2
stack backtrace
  -- The tutorial calls it a backtrace (twice); Python calls it a traceback.  Why the confusion?

8.3
print "Oops! That was no valid number.  Try again..."
  -- One space after "!", two spaces after "." -- why inconsistent?
An except clause may name multiple exceptions as a parenthesized list, for example:
  -- Don't you mean tuple?!
The except clause may specify a variable after the exception name (or list).
  -- Don't you mean tuple?!

8.4
If you need to determine whether an exception was raised but don't intend to handle it, a simpler form of the raise statement allows you to re-raise the exception:
  -- Raise was discussed several times in 8.3.  Why not switch sections 8.3 and 8.4?  After all, an exception must be raised before it can be handled!

8.5
  -- Shouldn't the chapter on classes come before the one on exceptions?  This part would make more sense after learning about classes.
print 'My exception occurred, value:', e.value
  -- Why not exploit the __str__ function that was defined?  Print e, not e.value.
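
A sketch of the 8.5 suggestion to print e rather than e.value. MyError follows the shape of the tutorial's example class. The "except ... as" spelling assumes Python 2.6 or later; 2.5a0 itself would need "except MyError, e".

```python
class MyError(Exception):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        # Used by str(e), and by print when given the exception object.
        return repr(self.value)


try:
    raise MyError(2 * 2)
except MyError as e:
    # Relying on __str__ ...
    via_str = "My exception occurred, value: " + str(e)
    # ... gives the same text as reaching into the attribute by hand.
    via_attr = "My exception occurred, value: " + repr(e.value)

print(via_str)
print(via_attr)
```

Since __str__ already returns repr(self.value), the two messages are identical, so the example could use the method it has just defined.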

8.6
regardless of whether or not the use of the resource was successful.
  -- Delete "or not" (verbose)
(because it would be unclear which clause should be executed)
  -- rewrite: (because it would be unclear which clause should be executed first)

9
a derived class can override any methods of its base class or classes, a method can call the method of a base class with the same name.
  -- classes, and a method (last phrase in a list)
There are no special constructors or destructors.
  -- What about __init__, or is that a "normal" constructor?

9.2
Otherwise, all variables found outside of the innermost scope are read-only.
  -- Explain what happens when you try to assign to a read-only variable?
Outside of functions
  -- Outside functions (grammar)
Usually, the local scope references the local names of the (textually) current function.
  -- How else (other than textually) could a function theoretically be current (i.e., why is the word "textually" here at all)?

9.3.1
(the one in effect just before the class definitions was entered)
  -- Grammar, plural does not match previous singular: class definition was entered

9.3.2
Many classes like to create objects in a known initial state.
  -- But an empty object *is* in a known initial state; clarify.
>>> class Complex:
  -- Is it wise to introduce a class that duplicates something built into the language?  Novices might remember the wrong one! :-)

9.3.3
The other kind of instance attribute references is a method.
  -- rewrite singular: instance attribute reference
In our example, this will return the string 'hello world'.
  -- This will fail: the most recent instance bound to x was Complex, not MyClass!

9.3.4
Usually, a method is called immediately:
  -- Replace "immediately" with "directly"?  Elapsed time surely has nothing to do with this.
a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object:
  -- Why bother packing the instance pointer?
     Surely that is redundant, since the instance is referenced in the method call: x.f()?

9.4
Conventionally, the first argument of a method is often called self.
  -- is called self. (redundancy?)

9.5
The syntax for a derived class definition looks as follows:
  -- grammar: looks like this
  -- or grammar: is as follows
Instead of a base class name, an expression is also allowed.
  -- Instead?  Surely a base class name is itself an expression?
  -- Surely modname.basename *is* a base class name, just fully qualified?
it is searched in the base class
  -- grammar: it is sought in the base class
the corresponding class attribute is searched,
  -- grammar: the corresponding class attribute is sought,
a method of a base class that calls another method defined in the same base class, may in fact end up calling a method of a derived class that overrides it.
  -- grammar: delete the comma
Note that this only works if the base class is defined or imported directly in the global scope.
  -- Surely something like modname.BaseClassName.methodname(self,arguments) would work if it is not directly in the global scope?

9.5.1
A class definition with multiple base classes looks as follows:
  -- grammar: looks like this
  -- or grammar: is as follows
The only rule necessary to explain the semantics is the resolution rule used for class attribute references.
  -- This applies only to method references, right?  (Since data references are "defined" only when acted on.)  What is the precedence for __init__ functions?

9.6
9.6 Private Variables
  -- Shouldn't this be Private Identifiers?  Is a method a variable?
and even variables stored in instances. private to this class on instances of other classes.
  -- rewrite: instances private to this class or instances of other classes.

9.8
raise Class, instance
  -- What's the point?  In the example thereafter, these three have the same effect:
       raise c()
       raise B, c()
       raise instance
  -- wasn't this "new form" covered in 8.5 (raise MyClass(2*2))?
class B:
    pass
class C(B):
  -- this won't execute if pasted into a Python console, due to the lack of a blank line between classes.  This is true of other examples in the tutorial.
  -- Why classes B, C and D, not A, B and C?
When an error message is printed for an unhandled exception which is a class
  -- ?  An exception IS a class:
     >>> KeyboardInterrupt
  -- What is an example of an exception that is NOT a class?
When an error message is printed for an unhandled exception which is a class, the class name is printed, then a colon and a space, and finally the instance converted to a string using the built-in function str().
  -- Unless the class is derived from Exception!

9.9
By now, you've probably noticed that most container objects can looped over using a for statement:
  -- Style: By now you have probably
  -- Actually, the only cases previously mentioned in this way have been sequences.  Dictionaries used .keys(), and files used readline() and readlines().
for key in {'one':1, 'two':2}:
    print key
  -- This is not intuitive to guess! :-)
for line in open("myfile.txt"):
    print line
  -- Why wasn't this mentioned in 7.2?
This example shows how it all works:
  -- Traceback reveals non-default shell; replace with console traceback?

9.11
sum(i*i for i in range(10))
  -- Why does this work?!  The only parentheses are those required by sum() and range(); sum[i*i for i in range(10)] fails!
unique_words = set(word for line in page for word in line.split())
  -- does not work in isolation, requires support code
>>> valedictorian = max((student.gpa, student.name) for student in graduates)
  -- does not work in isolation, requires support code

10.1
the builtin open() function which operates much differently.
  -- grammar: very differently

10.7
the internet
  -- the Internet
import urllib2
  -- What's wrong with urllib?
server.sendmail('soothsayer...
  -- Caesar misspelt in parameter 2
  -- Does not show empty dictionary {} returned by server.sendmail command

10.10
the relative performance between different approaches
  -- grammar: relative performance of different approaches
the profile and pstats modules
  -- pstats hyperlink missing

10.11
self.assertRaises(TypeError, average, 20, 30, 70)
  -- Braces missing: self.assertRaises(TypeError, average, (20, 30, 70) )

11.1
>>> locale.format("%d", x, grouping=True)
  -- Does not group digits with commas as shown in the example (locale is set the same!)
>>> locale.format("%s%.*f", (conv['currency_symbol'],
...     conv['int_frac_digits'], x), grouping=True)
  -- mixing regular currency and international currency locale conventions
  -- use currency_symbol and frac_digits
  -- or use int_currency_symbol and int_frac_digits

11.2
The makes it possible to substitute custom templates for XML files, plain text reports, and HMTL web reports.
  -- Grammar: This makes it...
  -- HTML is misspelt.

11.3
crc32, comp_size, uncomp_size, filenamesize, extra_size =  fields
  -- extra space after equals sign

11.4
Threading is a technique for decoupling tasks which are not sequentially dependent.
  -- Not sure about this sentence.  If tasks are not sequentially dependent, then they already *are* decoupled -- just not executing in parallel.
  -- Suggest replacing "decoupling" with "simultaneously performing"
So, the preferred approach to task coordination
  -- Grammar: delete "So"
and then using the Queue module
  -- consistency: and then use the Queue module

11.8
the user expects the results to match calculations done by hand
  -- What is the difference between this and control of precision?
the two digit multiplicands
  -- 1.05 is a *three* digit multiplicand

A
available in the Unix and CygWin versions of the interpreter
  -- Correctly spelt "Cygwin"
or the Tk-based environment, IDLE, distributed with Python
  -- Is the name IDLE considered parenthetical?  If not, remove commas.
The command line history recall which operates within DOS boxes on NT and some other DOS and Windows flavors
  -- Suggest "within console windows" rather than DOS boxes, as the command console is more than just a DOS box.

A.2
Any line in the history buffer can be edited; an asterisk appears in front of the prompt to mark a line as modified.
  -- When I press Ctrl-P and type to modify the previous line, no asterisk appears (Solaris and FreeBSD).  Is this not editing a line in the history buffer?

A.3
# Note that PYTHONSTARTUP does *not* expand "~", so you have to put in the
# full path to your home directory.
  -- Bash *will* expand the ~ to the full path.
  -- Why is the atexit statement separated from the save_history definition by an if statement?  Surely the if statement should precede the definition?

A.4
The completion mechanism might use the interpreter's symbol table.
  -- Isn't this what the current mechanism does (readline.parse_and_bind('tab: complete'))?

B
The Python prompt (implicitly) uses the builtin
  -- Why "implicitly"?  How could it be different "explicitly"?
this is not a bug in Python, it is not a bug in your code either
  -- insert "and" after comma [grammar]
>>> print str(0.1)
  -- Strictly speaking, this is redundant: print already uses str()!
Another consequence is that since 0.1 is not exactly 1/10, adding 0.1 to itself 10 times may not yield exactly 1.0, either:
  -- In fact, this is even stranger: 0.1 is represented as a number > 0.1, but adding it 10 times gives a total < 1!
  -- Adding "0.1 to itself" 10 times gives 0.2 ten times; suggest "summing 10 values of 0.1" instead.

B.1
Representation error refers to that some (most, actually) decimal fractions
  -- refers to the fact that [grammar]
Python floats to IEEE-754 "double precision". 754 doubles
  -- Change to "IEEE-754 doubles".
>>> 2L**53
>>> 2L**56/10
>>> .1 * 2L**56
>>> 7205759403792794L * 10L**30 / 2L**56
  -- The "L" extension does not appear to be necessary for the input -- why is it used?
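
The Appendix B remark about 0.1 being stored as slightly more than 0.1, yet summing to less than 1, can be checked with a short sketch. It assumes a current Python interpreter, where Decimal accepts a float directly; older releases such as 2.5 may not support that conversion.

```python
from decimal import Decimal

# Decimal(float) exposes the exact binary value stored for the literal 0.1.
stored = Decimal(0.1)
is_above = stored > Decimal("0.1")

# Summing ten values of 0.1, one addition at a time.
total = 0.0
for _ in range(10):
    total += 0.1

print(is_above)
print(total < 1.0)
print(total == 1.0)
```

The stored value is above one tenth, but rounding at each intermediate addition pulls the running total just below 1.0.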

C
The socket module uses the functions, getaddrinfo, and getnameinfo
  -- remove comma after "functions" [grammar]

D
byte code is also cached in the .pyc and .pyo files
  -- Delete "the"
(compilation from source to byte code can be saved)
  -- rewrite: (recompilation from source to byte code can be avoided)
coercion
  -- extra carriage return before definition
complex number
  -- extra carriage return before definition
By actually importing the __future__ module
  -- rewrite: By importing the __future__ module itself
LBYL
  -- wrong position (should appear before "list comprehension")
as well asnested namespaces
  -- missing space
which modules implement a function
  -- rewrite: which module implements a function
not required be backward compatible
  -- "not required to be"