Issue 45704: string.Formatter.parse does not handle auto-numbered positional fields


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/89867

classification

Title:	string.Formatter.parse does not handle auto-numbered positional fields
Type:	behavior	Stage:
Components:		Versions:	Python 3.11, Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	SDesch, eric.smith
Priority:	normal	Keywords:

Created on 2021-11-03 11:55 by SDesch, last changed 2022-04-11 14:59 by admin.

Messages (9)
msg405610 - (view)	Author: Sascha Desch (SDesch)	Date: 2021-11-03 11:55
It appears that when auto-numbered positional fields were added in Python 3.1, `Formatter.parse` was not updated to handle them. It currently returns an empty string as the field name.

```
from string import Formatter

list(Formatter().parse('hello {}'))  # [('hello ', '', '', None)]
```

This does not align with `Formatter.get_field`, which according to the docs does the following: "Given field_name as returned by parse() (see above), convert it to an object to be formatted." When supplying an empty string to `.get_field()` you get a KeyError:

```
Formatter().get_field("", [1, 2, 3], {})  # raises KeyError
```
msg405619 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-11-03 14:00
For reference, the documentation is at https://docs.python.org/3/library/string.html#custom-string-formatting

I guess in your example it should return: [('hello ', '0', '', None)]
msg405624 - (view)	Author: Sascha Desch (SDesch)	Date: 2021-11-03 15:39
Yes, it should return a string containing the index of the positional argument, i.e. `"0"`, so that it is compatible with `.get_field()`. Side note: It's somewhat weird that `.get_field` expects a string while `.get_value` expects an int for positional arguments.
msg405627 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-11-03 16:12
> Side note: It's somewhat weird that `.get_field` expects a string while `.get_value` expects an int for positional arguments.

.parse is just concerned with parsing, so it works on and returns strings. .get_field takes strings because it is the thing that's trying to determine whether or not a field name looks like an integer. At least that's how I remember it.
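To make the string/int split concrete, here is a small sketch of the two calls side by side. The argument values are made up for illustration; the calls themselves are the documented `string.Formatter` methods.

```python
from string import Formatter

fmt = Formatter()
args, kwargs = ("a", "b"), {"name": "c"}  # illustrative values only

# get_field() takes the field name as a string and works out for itself
# whether the leading component is a positional index.
obj, used_key = fmt.get_field("1", args, kwargs)
print(obj, repr(used_key))

# get_value() receives the already-resolved key:
# an int for positional arguments, a str for keyword arguments.
positional = fmt.get_value(1, args, kwargs)
keyword = fmt.get_value("name", args, kwargs)
print(positional, keyword)
```

In other words, the conversion from `"1"` to `1` happens inside `.get_field()`, before it delegates to `.get_value()`.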
msg405636 - (view)	Author: Sascha Desch (SDesch)	Date: 2021-11-03 18:00
Another thing that occurred to me is the question of what `.parse()` should do when a mix of auto-numbered and manually numbered fields is supplied, e.g. `{}{1}`. As of now, `.parse()` happily processes such inputs, and some other piece of code deals with this and ultimately raises an exception saying that mixing manual with automatic numbering is not allowed. If `.parse()` supported automatic numbering, it would have to be aware of this too, I guess?
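The behaviour described here can be reproduced with a short sketch. It assumes a Python version in which `string.Formatter.format` itself supports automatic numbering; the argument values are placeholders.

```python
from string import Formatter

fmt = Formatter()

# parse() reports both fields exactly as written, without complaint.
fields = [field_name for _, field_name, _, _ in fmt.parse("{}{1}")]
print(fields)

# The error only surfaces once the string is actually formatted.
error = None
try:
    fmt.format("{}{1}", "a", "b")
except ValueError as exc:
    error = str(exc)
print(error)
```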
msg405757 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-11-04 22:32
The more I think about this, the more I think it's not .parse's job to fill in the field numbers, it's the job of whoever is calling it. Just as it's not .parse's job to give you an error if you switch back and forth between numbered and un-numbered fields. It's literally just telling you what's in the string as it breaks it apart, not assigning any further meaning to the parts. I guess I should have called it .lex, not .parse.
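As an illustration of the "lexer" view, the sketch below prints what `.parse()` yields for a template that mixes an empty field with a numbered one. The template string is invented for the example.

```python
from string import Formatter

# Hypothetical template mixing an auto-numbered and a manually numbered field.
template = "a {} b {0.name!r:>10} c"

# Each item is (literal_text, field_name, format_spec, conversion).
tokens = list(Formatter().parse(template))
for token in tokens:
    print(token)
```

The empty field comes back with `''` as its name, the numbered field keeps its text as written (`'0.name'`), and the trailing literal has `None` for all three field parts. Nothing is numbered or validated at this stage.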
msg405796 - (view)	Author: Sascha Desch (SDesch)	Date: 2021-11-05 13:31
That definition of `.parse()` definitely makes sense. Do you then think this is out of scope for `Formatter` in general or just for `.parse()`? Just for reference, this is what I currently use to get automatic numbering to work for my use case.

```
from string import Formatter


def parse_command_template(format_string):
    auto_numbering_error = ValueError(
        'cannot switch from automatic field numbering to manual field specification')

    index = 0
    auto_numbering = None  # None until the first numbered or empty field is seen

    for literal_text, field_name, spec, conversion in Formatter().parse(format_string):
        if field_name is not None:
            # Manually numbered field, e.g. '{0}'
            if field_name.isdigit():
                if auto_numbering is True:
                    raise auto_numbering_error
                auto_numbering = False

            # Auto-numbered field, e.g. '{}'
            if field_name == '':
                if auto_numbering is False:
                    raise auto_numbering_error
                auto_numbering = True
                field_name = str(index)
                index += 1

        yield literal_text, field_name, spec, conversion
```
msg405802 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-11-05 15:04
I think your code is reasonable. But since string.Formatter gets so little use, I'm not sure it's worth adding this to the stdlib. On the other hand, it could be used internally by string.Formatter. We'd need to pick a better name, though. And maybe it should return the field_name as an int.
msg405814 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-11-05 18:06
That is, return field_name as an int if it's an int, otherwise as a string.
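One way to read that suggestion is sketched below. The helper name `normalize_field_name` is hypothetical and not part of `string.Formatter`. It also assumes that compound names such as `0.attr` stay strings, which the thread does not settle.

```python
def normalize_field_name(field_name):
    """Hypothetical helper: int for purely numeric names, otherwise unchanged."""
    # isdecimal() is used instead of isdigit() so that int() cannot fail on
    # characters such as superscript digits.
    if field_name is not None and field_name.isdecimal():
        return int(field_name)
    return field_name


print(normalize_field_name("0"))
print(normalize_field_name("name"))
print(normalize_field_name("0.attr"))
```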

History
Date	User	Action	Args
2022-04-11 14:59:52	admin	set	github: 89867
2021-11-05 18:06:11	eric.smith	set	messages: + msg405814
2021-11-05 15:04:35	eric.smith	set	messages: + msg405802
2021-11-05 13:31:48	SDesch	set	messages: + msg405796
2021-11-04 22:32:38	eric.smith	set	messages: + msg405757
2021-11-03 18:00:37	SDesch	set	messages: + msg405636
2021-11-03 16:12:53	eric.smith	set	messages: + msg405627
2021-11-03 15:39:46	SDesch	set	messages: + msg405624
2021-11-03 14:00:28	eric.smith	set	nosy: + eric.smith messages: + msg405619
2021-11-03 11:55:47	SDesch	create