Title: format_spec for sequence joining
Type: enhancement
Stage: resolved
Components:
Versions:
Status: closed
Resolution: later
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.smith, r.david.murray, tobia
Priority: normal
Keywords:

Created on 2013-02-19 12:40 by tobia, last changed 2013-02-19 13:30 by r.david.murray. This issue is now closed.

Messages (2)
msg182376 - Author: Tobia Conforto (tobia) Date: 2013-02-19 12:40
The format specification mini-language (format_spec) supported by format() and str.format() is a feature that allows passing short options to the classes of the values being formatted, to drive their string representation (via their __format__ method).

The most common operation done to sequences (lists, tuples, sets...) during conversion to string is arguably the string join operation, possibly coupled with a "nested" string formatting of the sequence items.

I propose the addition of a custom format_spec for sequences that makes it easy to specify a string for the join operation and, optionally, a nested format_spec to be passed along to format the sequence items.

Here is the proposed addition:

  seq_format_spec  ::= join_string [":" item_format_spec] | format_spec
  join_string      ::= '"' join_string_char* '"' | "'" join_string_char* "'"
  join_string_char ::= <any character except "{", "}", newline, or the quote>
  item_format_spec ::= format_spec

In words, if the format_spec for a sequence starts with a single or double quote, it will be interpreted as a join operation, optionally followed by another colon and the format_spec for the sequence items.

If the format_spec does not start with ' or ", or if the quote is not balanced (does not appear again in the format_spec), then it's assumed to be a generic format string and the implementation would call super(). This ensures backwards compatibility with existing code that may be using object's __format__ implementation on various sequence objects.

Please note I'm NOT proposing a change in the language or in the implementation of format() and str.format(). This is just the addition of a __format__ method to lists, tuples, sets and other sequence classes. The choice of whether to do that in all those sequence classes or as an addition to object's __format__ is an implementation detail.
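
To make the parsing rule concrete, here is a minimal sketch of how such a __format__ method might behave. It is not a patch: the built-in sequence types cannot be modified from Python code, so the sketch uses a hypothetical list subclass named JoinList, and the handling of text that follows the closing quote without a colon is an assumption (it is treated as a generic format_spec).

```python
class JoinList(list):
    """Hypothetical list subclass illustrating the proposed join format_spec."""

    def __format__(self, format_spec):
        # A join spec must start with a single or double quote.
        if format_spec[:1] in ('"', "'"):
            quote = format_spec[0]
            end = format_spec.find(quote, 1)  # position of the closing quote
            if end != -1:
                sep = format_spec[1:end]
                rest = format_spec[end + 1:]
                if rest == "":
                    # No nested spec: format each item with its default format.
                    return sep.join(format(item) for item in self)
                if rest.startswith(":"):
                    # Everything after the colon is passed to each item.
                    return sep.join(format(item, rest[1:]) for item in self)
        # Not a join spec (or unbalanced quote): defer to the inherited
        # __format__, which ends up in object.__format__ for a list.
        return super().__format__(format_spec)


print("{0:', '}".format(JoinList([1, 2, 3])))
print('{0:", ":.1f}'.format(JoinList([1.0, 2.5])))
```

Note that in current Python versions object.__format__ raises TypeError for any non-empty format_spec, so in this sketch the fallback only produces output for an empty spec. Nested joins work only when the inner items also understand the join spec, for example when they are JoinList instances themselves.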


Basic usage: either {0:", "} or {0:', '} when used in a format operation will do this: ", ".join(str(x) for x in argument_0) in a more compact, possibly more efficient, and arguably easier to read syntax.

Nested (regular) format_spec: {0:", ":.1f} will join a list of floats using ", " as the separator and .1f as the format_spec for each float.

Nested join format_spec: {0:"\n":", "} will join a list of lists, using "\n" as the outer separator and ", " as the inner separator. The nesting could go on indefinitely, though it will rarely need to.
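
For comparison, the three usage examples above correspond roughly to the following expressions, which already work in Python today. The variable names and sample data are made up for illustration.

```python
numbers = [1, 2, 3]
floats = [1.0, 2.5, 3.14159]
rows = [[1, 2], [3, 4]]

# Basic usage, the equivalent of {0:", "}
basic = ", ".join(str(x) for x in numbers)

# Nested regular format_spec, the equivalent of {0:", ":.1f}
nested_fmt = ", ".join(format(x, ".1f") for x in floats)

# Nested join format_spec, the equivalent of {0:"\n":", "}
nested_join = "\n".join(", ".join(str(x) for x in row) for row in rows)

print(basic)
print(nested_fmt)
print(nested_join)
```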

I do not have a patch ready, but I can work on it and submit it for evaluation, if this enhancement is accepted.
msg182379 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2013-02-19 13:30
IMO, this is a python-ideas level suggestion.  Please propose it on that mailing list.  You can reopen the issue if you get a positive response there.
Date                 User            Action  Args
2013-02-19 13:30:21  r.david.murray  set     status: open -> closed
                                             nosy: + eric.smith, r.david.murray
                                             messages: + msg182379
                                             resolution: later
                                             stage: resolved
2013-02-19 12:40:02  tobia           create