Author Ben.Wolfson
Recipients Ben.Wolfson, eric.araujo, eric.smith, mark.dickinson, petri.lehtinen, r.david.murray
Date 2011-06-03.17:08:59
Hm. As I interpret this:

    The str.format() function will have
    a minimalist parser which only attempts to figure out when it is
    "done" with an identifier (by finding a '.' or a ']', or '}',
    etc.).

The present implementation is at variance with both the documentation *and* the PEP, since the present implementation does not in fact figure out when it's "done" with an identifier that way. However, this statement is actually a very thin reed on which to make any decisions: a real authority shouldn't say "etc." like that! And, of course, we have to add an implicit "depending on what it's currently looking at" to the parenthetical, because the two strings "{0[a.b]}" and "{0[a].b}" are, and should be, treated differently. In particular, although one could "find" a '.' in the element_index in the former string, the "minimalist parser" should not (and does not) conclude that it's done with the identifier *there*:

>>> "{0[a.b]}".format({"a.b":1})
'1'

Instead it treats the '.' as just another character with no particular syntactic significance, the same way it does 'a' and 'b'. It's a shame that the PEP doesn't go into more detail than it does about this sort of thing.
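To make the contrast between the two strings concrete, here is a small sketch. The SimpleNamespace object and the values 1 and 2 are assumptions chosen purely for illustration:

```python
from types import SimpleNamespace

# '.' inside the brackets is part of the key: the lookup is mapping["a.b"].
inside = "{0[a.b]}".format({"a.b": 1})

# '.' after the closing bracket starts an attribute lookup: mapping["a"].b
outside = "{0[a].b}".format({"a": SimpleNamespace(b=2)})

print(inside, outside)
```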

The same should go for '}', when we're looking at an element_index field. It should be treated as just another character with no particular syntactic significance. At present that is not the case:

>>> "{0[a}b]}".format({"a}b":1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Missing ']' in format string

If the attached patch were applied, the above expression would evaluate to '1', just as the first one did. Now, the PEP actually says quite little about how this sort of thing is to be handled, and (as demonstrated above with the '.' character) we can't take the little list it gives as indicating when the parser is done with an identifier regardless of context. Given that, I don't think this change would constitute a change *to the specification*. It does, admittedly, constitute an interpretation of the specification, but then so does the present implementation, and the present implementation is at variance with the PEP *anyway* as regards the characters ':' and '!'.
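The rule being argued for can be sketched in a few lines. This is only an illustration of the interpretation, not the attached patch and not CPython's actual code; the function name field_name_end is hypothetical:

```python
def field_name_end(text):
    """Return the index in `text` where the field name stops.

    `text` is everything after the opening '{'. Characters between '[' and
    the next ']' are treated as opaque, so '.', ':', '!' and '}' have no
    syntactic significance there.
    """
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "[":
            close = text.find("]", i + 1)
            if close == -1:
                raise ValueError("Missing ']' in format string")
            i = close + 1  # skip the whole element_index
            continue
        if ch in "}!:":
            return i  # the identifier is "done" only outside brackets
        i += 1
    raise ValueError("expected '}' before end of string")


for text in ("0[a}b]}", "0[a:b]!r}", "0[a.b]:>5}"):
    print(repr(text[:field_name_end(text)]))
```

Under this sketch the field names come out as '0[a}b]', '0[a:b]' and '0[a.b]' respectively, and a ']' still cannot appear inside an index, since the first ']' always closes it.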

The paragraph prior to the one quoted by R. David Murray reads:

    Because keys are not quote-delimited, it is not possible to
    specify arbitrary dictionary keys (e.g., the strings "10" or
    ":-]") from within a format string.

I take it that this means (in the first place) that, because a sequence of digits is interpreted as a number, the following will fail:
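A minimal illustration, assuming a dictionary whose only key is the string "10" (the key is borrowed from the PEP's example; the variable names are hypothetical):

```python
mapping = {"10": 1}
failed_key = None

try:
    "{0[10]}".format(mapping)
except KeyError as exc:
    # The index is looked up as the integer 10, which is not a key here.
    failed_key = exc.args[0]
    print("KeyError:", repr(failed_key))

# With an integer key the same format string succeeds.
print("{0[10]}".format({10: 1}))
```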


And indeed it does. The second example is rather unfortunate, though: is the reason one can't use that key that it contains a colon? Or that it contains a right square bracket? Even if the present patch is accepted one couldn't use a right square bracket, since a parser that could figure out where to draw the lines in something like this:

'{0[foo ] bar]}'

would not be very minimalist. However, as I have noted previously, there is no reason to rule out colons and exclamation points in the element_index field. The PEP doesn't actually take up this question in detail. (It hardly does so at all.) On what I think is the most reasonable interpretation of the PEP, though, the present implementation is at variance with it. The present implementation is certainly at variance with the documentation, which to some extent represents an interpretation and specification of the PEP.
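Since the outcome depends on which implementation is running, a small probe can report what a given interpreter does with each candidate character. This is a sketch only; the helper name probe and the sample keys are assumptions, and no particular output is claimed for the ':', '!' and '}' cases:

```python
def probe(template, mapping):
    """Return the formatted string, or a short description of the failure."""
    try:
        return template.format(mapping)
    except (ValueError, LookupError, AttributeError) as exc:
        return "{}: {}".format(type(exc).__name__, exc)


for key in ("a.b", "a:b", "a!b", "a}b"):
    template = "{0[" + key + "]}"
    print(repr(template), "->", probe(template, {key: 1}))
```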

Consequently, to the extent that changing a specification requires discussion on python-dev, it seems to me that the present implementation is already a de facto change to the specification, while accepting the attached patch would bring the implementation into *greater* accord with the specification, so that (to conclude cheekily) *not* accepting the patch is what should require discussion on python-dev. However, if it is thought necessary, I'll be happy to start the discussion.