Author Ben.Wolfson
Recipients Ben.Wolfson, eric.araujo, eric.smith, mark.dickinson, petri.lehtinen, r.david.murray
Date 2011-06-03.22:17:15
str.format doesn't intermingle character data and markup. The PEP is quite clear about the terms in this case, at least: the *argument* to str.format consists of character data (passed through unchanged) and markup (processed). That's what it means to say that "Character data is data which is transferred unchanged from the format string to the output string". In "My name is {0}", "My name is " is transferred unchanged from the format string to the output string when the string is formatted. We're talking about how the *markup* is defined.
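As a rough illustration of that split, the standard library's string.Formatter.parse() reports the literal text and the replacement fields of a format string separately. The helper name split_format_string and the labels "character data" and "markup" are made up for this sketch; they are not part of the PEP or of any API.

```python
from string import Formatter


def split_format_string(template):
    """Return (kind, text) pairs in the order they appear in template.

    Hypothetical helper: the labels are illustrative only.
    """
    parts = []
    for literal, field_name, format_spec, conversion in Formatter().parse(template):
        if literal:
            # Text copied unchanged into the output string.
            parts.append(("character data", literal))
        if field_name is not None:
            # Contents of a replacement field, which get processed.
            parts.append(("markup", field_name))
    return parts


print(split_format_string("My name is {0}"))
# prints [('character data', 'My name is '), ('markup', '0')]
```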

The current implementation of str.format() finds matched pairs of braces and calls what's inside "markup", then parses that markup.

This is false, as I demonstrated.

>>> d = {"{0}": "spam"}
>>> # a matched pair of braces. What's inside is considered markup.
>>> "{0}".format(d)
"{'{0}': 'spam'}"
>>> # a matched pair of braces. Inside is a matched pair of braces, and what's inside of that is not considered markup.
>>> "{0[{0}]}".format(d)
'spam'

It's also true that other interpretations of the PEP are possible. I'm just not sure the benefit to be gained justifies changing all of the extant str.format() implementations, in addition to explaining the different behavior.

Well, the beauty of it is, you wouldn't have to explain the different behavior, because the patch makes the explanation already in the documentation correct. It is currently not correct. That's how I found out about the current state of affairs: I read the documentation's explanation and believed it, and only after digging into the code did I understand the actual behavior.

It is also not a difficult change to make, and it would be backwards-compatible (I rather doubt anyone was relying on "{0[:]}".format(whatever) raising an exception [1]). It relaxes a restriction that is not well motivated by the text of the PEP, is not consistently applied in the implementation (see above), is confusing, and limits the usefulness of the format method. It is true that I don't know where else, beyond the implementation in string_format.h, modifications would need to be made, but I'd be willing to undertake the task.

[1] And given that the present implementation does raise in that case, it's already noncompliant with the PEP, regardless of what one makes of curly braces.
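For anyone who wants to check the two cases on their own interpreter, here is a small sketch. The helper try_format is hypothetical, and no expected output is given for it because the result depends on which str.format() implementation is running.

```python
def try_format(template, *args):
    """Format template and report the outcome instead of raising.

    Hypothetical helper, not part of any library.
    """
    try:
        return ("ok", template.format(*args))
    except (ValueError, KeyError, IndexError, TypeError) as exc:
        return ("error", type(exc).__name__)


# Braces inside an index, as in the session above.
print(try_format("{0[{0}]}", {"{0}": "spam"}))

# A colon inside an index, the case the proposed change would permit.
print(try_format("{0[:]}", {":": "eggs"}))
```

If the second call reports an error while the first succeeds, that is the inconsistency described above.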
Date                 User         Action  Args
2011-06-03 22:17:16  Ben.Wolfson  set     recipients: + Ben.Wolfson, mark.dickinson, eric.smith, eric.araujo, r.david.murray, petri.lehtinen
2011-06-03 22:17:16  Ben.Wolfson  set     messageid: <>
2011-06-03 22:17:15  Ben.Wolfson  link    issue12014 messages
2011-06-03 22:17:15  Ben.Wolfson  create