Issue 7951: Should str.format allow negative indexes when used for __getitem__ access?



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/52199

classification

Title:	Should str.format allow negative indexes when used for __getitem__ access?
Type:	enhancement	Stage:	resolved
Components:	Documentation	Versions:	Python 3.11

process

Status:	closed	Resolution:	wont fix
Dependencies:		Superseder:
Assigned To:	terry.reedy	Nosy List:	Ilya Kamenshchikov, Todd.Rovito, docs@python, eric.araujo, eric.smith, flox, gosella, iritkatriel, kisielk, marco.buttu, mark.dickinson, mrabarnett, rhettinger, terry.reedy
Priority:	normal	Keywords:	easy, patch

Created on 2010-02-17 23:54 by eric.smith, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
format_negative_indexes-2.7.diff	gosella, 2010-06-18 19:50	patch against trunk
format_negative_indexes-3.2.diff	gosella, 2010-06-18 19:52	patch against 3.2
format_no_fields_with_negative_indexes-2.7.diff	gosella, 2010-06-25 18:48	Don't allow negative fields
7951NegativeIndexesForStringFormat3dot4.patch	Todd.Rovito, 2013-04-20 03:06		review

Messages (33)
msg99482 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2010-02-17 23:54
It surprised me that this doesn't work:

>>> "{0[-1]}".format('fox')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers

I was expecting it to be equivalent to:

>>> "{0[2]}".format('fox')
'x'

I don't think there's any particular reason this doesn't work. It would, however, break the following code:

>>> "{0[-1]}".format({'-1':'foo'})
'foo'

But note that this doesn't work currently:

>>> "{0[1]}".format({'1':'foo'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 1
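The four cases above can be checked in one short script. This is only a sketch for Python 3; `try_format` is a hypothetical helper introduced here, not something from the issue or the standard library.

```python
def try_format(template, *args):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template.format(*args)
    except (TypeError, KeyError, IndexError) as exc:
        return type(exc).__name__


# An all-digit index is converted to an int before __getitem__ is called.
print(try_format("{0[2]}", "fox"))
# A signed index is passed through as the string '-1', which str rejects.
print(try_format("{0[-1]}", "fox"))
# The same string '-1' works when the argument is a dict with that key.
print(try_format("{0[-1]}", {"-1": "foo"}))
# The int 1 is not the same key as the string '1'.
print(try_format("{0[1]}", {"1": "foo"}))
```

The asymmetry between the last two calls is the backward-compatibility concern raised in this message.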
msg99553 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2010-02-19 01:43
On a related note, this doesn't work either:

>>> "{-1}".format("x", "y", "z")
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    "{-1}".format("x", "y", "z")
KeyError: '-1'

It could return "z". It also rejects a leading '+', but that would be optional anyway.
msg107766 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2010-06-13 23:49
Closed issue 8985 as a duplicate of this; merging nosy lists.
msg107776 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-06-14 10:57
I (reluctantly) agree it's surprising that "{0[-1]}".format(args) fails. And I suppose that if it were allowed then it would also make sense to consider "{-1}".format(args) as well, in order to preserve the equivalence between "{n}".format(args) and "{0[n]}".format(args). And then:

>>> "{-0}".format(['calvin'], {'-0': 'hobbes'})
'hobbes'

would presumably produce 'calvin' instead of 'hobbes'...

On '+': if "{0[-1]}" were allowed, I'm not sure whether the "+1" in "{0[+1]}".format(...) should also be interpreted as a list index. I don't really see the value of doing so apart from syntactic consistency: there are very few other places in Python that I'm aware of that accept -<one-or-more-digits> but not +<one-or-more-digits>.

FWIW, my overall feeling is that the current rules are simple and adequate, and there's no great need to add this complication.

I do wonder, though: How complicated would it be to make "{0[1]}".format({'1':'foo'}) a bit magical? That is, have the format method pass an integer to __getitem__ if the corresponding format argument is a sequence, and a string argument if it's a mapping (not sure what the criterion would be for distinguishing). Is this too much magic? Is it feasible implementation-wise? I don't think it's do-able for simple rather than compound field names: e.g., "{0}".format(*args, **kwargs), since there we've got both a sequence *and* a dict, so it's not clear whether to look at args[0] or kwargs['0']. (Unless either args or kwargs is empty, perhaps.) This is all getting a bit python-ideas'y, though.

BTW, I notice that PEP 3101's "Simple field names are either names or numbers [...] if names, they must be valid Python identifiers" isn't actually true:

>>> "{in-valid #identifier}".format(**{'in-valid #identifier': 42})
'42'

Though I don't have a problem with this; indeed, I think this is preferable to checking for a valid identifier.
msg107781 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2010-06-14 11:23
Addressing just the last part of Mark's message right now: The PEP goes on to say: Implementation note: The implementation of this proposal is not required to enforce the rule about a simple or dotted name being a valid Python identifier. ... I rely on getattr lookup failing for dotted names, but for simple names there's no check at all. I agree it's desirable to leave this behavior.
msg107792 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2010-06-14 15:30
Re: msg107776. If it looks like an integer (ie, can be converted to an integer by 'int') then it's positional, otherwise it's a key. An optimisation is to perform a quick check upfront to see whether it starts like an integer.
msg107793 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-06-14 15:32
Matthew: would that include allowing whitespace, then?

>>> int('\t\n+56')
56
msg107801 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2010-06-14 17:02
That's a good question. :-) Possibly just an optional sign followed by one or more digits. Another possibility that occurs to me is for it to default to positional if it looks like an integer, but allow quoting to force it to be a key:

>>> "{0}".format("foo", {"0": "bar"})
'foo'
>>> "{'0'}".format("foo", {"0": "bar"})
'bar'

Or is that taking it too far?
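The "optional sign followed by one or more digits" rule can be sketched as a small classifier. This is only an illustration of the proposal, not how str.format behaves; `classify_index` and `_SIGNED_INT` are hypothetical names.

```python
import re

# Optional sign, then one or more ASCII digits; no whitespace allowed.
_SIGNED_INT = re.compile(r"[+-]?[0-9]+")


def classify_index(text):
    """Hypothetical rule: signed digit strings become ints, the rest stay str."""
    if _SIGNED_INT.fullmatch(text):
        return int(text)
    return text


for candidate in ("2", "-1", "+56", "\t\n+56", "name"):
    print(repr(candidate), "->", repr(classify_index(candidate)))
```

Unlike a bare call to `int()`, this rule leaves `'\t\n+56'` as a string key, which addresses the whitespace question from the previous message.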
msg107811 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2010-06-14 21:20
I can see the point of allowing negative indices for a consistency point, but is there really any practical problem that's currently causing people hardship that this would solve? As for the rest of it, I think it's just not worth the additional burden on CPython and other implementations.
msg107845 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2010-06-15 01:25
Your original: "{0[-1]}".format('fox') is a worse gotcha than: "{-1}".format('fox') because you're much less likely to want to do the latter. It's one of those things that it would be nice to have fixed, or we could just add a warning to the documentation that it _might_ be fixed in the future, so people shouldn't rely on the current behaviour. :-)
msg108132 - (view)	Author: Germán L. Osella Massa (gosella)	Date: 2010-06-18 19:50
I finally managed to get the time to finish the patch that allows negative indexes inside square brackets, so now they work with the same semantics as in a Python expression:

>>> '{0[-1]}'.format(['abc', 'def'])
'def'
>>> '{0[-2]}'.format(['abc', 'def'])
'abc'
>>> '{0[-1][-1]}'.format(['abc', ['def']])
'def'

They work with auto-numbered fields too:

>>> '{[-1]}'.format(['abc', 'def'])
'def'

Also, a positive sign is now accepted as part of a valid integer:

>>> '{0[+1]}'.format(['abc', 'def'])
'def'

As a bonus, negative indexes are also allowed to refer to positional arguments:

>>> '{-1}'.format('abc', 'def')
'def'
>>> '{-2}'.format('abc', 'def')
'abc'

I'm attaching a patch against trunk. I added some tests for this functionality in test_str.py.

By the way, this code doesn't work anymore:

>>> "{[-1]}".format({'-1': 'X'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: -1L

But now it behaves in the same way as:

>>> "{[1]}".format({'1': 'X'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 1L

I didn't attempt to ignore whitespace when parsing the index as an integer (so that "{ 0 }" would be treated as "{0}" and "{ 0 [ 1 ] }" as "{0[1]}") because I'm not sure if this behavior is desirable.
msg108133 - (view)	Author: Germán L. Osella Massa (gosella)	Date: 2010-06-18 19:55
I forgot to mention that I also made a patch against py3k (was the same code).
msg108143 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-06-18 21:38
Perhaps this ought to be discussed on python-ideas or python-dev for a bit. It is not entirely clear that this is a GoodThingToDo(tm) nor is it clear that we want other Python implementations to have to invest the same effort. The spirit of the language freeze suggests that we shouldn't add this unless we really need it. The goal was to let other implementations catch up, not to add to their list of incompatibilities.
msg108144 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2010-06-18 21:47
I agree with Raymond. I'm not convinced it allows you to write any code that you can't currently write, and I'm fairly sure it violates the moratorium. Implementing this would clearly put a burden on other implementations. Marking as "after moratorium".
msg108472 - (view)	Author: Kamil Kisiel (kisielk)	Date: 2010-06-23 18:40
While I agree this functionality isn't strictly necessary, I think it makes sense from a semantic point of view. I ran into this issue today while writing some code and I simply expected the negative syntax to work, given that the format string syntax otherwise very closely resembles standard array and attribute access. It would be nice to see this make it in eventually for consistency's sake.
msg108617 - (view)	Author: Germán L. Osella Massa (gosella)	Date: 2010-06-25 18:48
Well, using negative indexes for fields can be thought of as a new feature with all the consequences mentioned before, BUT negative indexes for accessing elements from a sequence, IMHO, is something that anyone would expect to work. That's why at first I thought it was a bug and filed an issue about it. The code that parses the fields and the indexes is the same, so when I changed it to accept negative indexes, it worked for both cases. I'm attaching a patch that checks if a negative index is used in a field and reverts to the old behavior in that case, allowing only negative indexes for accessing sequences ("{-1}" will raise KeyError because it will be treated as '-1'). Perhaps in this way this issue could be partially fixed.
msg113447 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2010-08-09 18:42
I believe this is covered by the PEP3003 3.2 change moratorium.
msg113620 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-08-11 19:41
Fixing-up str formatting idiosyncrasies does not fall under the moratorium and is helpful in getting 3.x to be usable. That being said, I'm not convinced that this is actually a helpful feature. Not all objects supporting __getitem__ offer support for negative indexing. Also, there's a case to be made that using negative indices in a formatting string is an anti-pattern, causing more harm than good.
msg113624 - (view)	Author: Matthew Barnett (mrabarnett) *	Date: 2010-08-11 20:01
I agree with Kamil and Germán. I would've expected negative indexes for sequences to work. Negative indexes for fields is a different matter.
msg115981 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2010-09-09 23:55
After more thought, I'm -1 on this. "Consistency" is a weak argument in favor of this. We need to be more use case driven, and there is no evidence that this is needed. Also, there is a reasonable concern that using negative indices in a format string would be a bad design pattern that should not be encouraged by the language. And, there is a maintenance burden (just getting it right in the first place; having other implementations try to get it right; and a burden to custom formatters to have to support negative indices).

I do think we have a documentation issue. This thread shows a number of experienced Python programmers who get "surprised" or perceive "consistency issues" perhaps because there isn't a clear mental picture of Python's layer structure (separation of concerns) and where the responsibility lies for supporting negative indices.

For the record, here are a few notes on where negative index handling fits into the hierarchy:

Negative index support is not guaranteed by the collections.Sequence ABC nor by the grammar (see the "subscript" rule in Grammar/Grammar). It does not appear in opcode handling (see BINARY_SUBSCR in Python/ceval.c) nor in the top abstract layer (see PyObject_GetItem() in abstract.c). Instead, the support for slicing and negative index handling appears at the concrete layer (see listindex() in Objects/listobject.c for example).

We do guarantee negative index handling for builtin sequences and their subclasses (as long as they don't override __getitem__), and we provide a fast path for their execution (via an intermediate abstract layer function, PySequence_GetItem() in Objects/abstract.c), but other sequence-like objects are free to make their own decisions about slices and negative indices at the concrete layer.

Knowing this, a person should not be "surprised" when one sequence has support for negative indices or slicing and another does not. The choice belongs to the implementer of the concrete class, not to the caller of "a[x]". There is no "consistency" issue here.

IOW, we're not required to implement negative slice handling and are free to decide whether it is a good idea or not for the use-case of string formatting. There is some question about whether it is a bad practice for people to use negative indices for string formatting. If so, that would be a reason not to do it. And if available, it would only work for builtin sequences, but not sequence-like items in general. There is also a concern about placing a burden on other implementations of Python (to match what we do in CPython) and on placing a burden on people writing their own custom formatters (to mimic builtin formatters as closely as possible). If so, those would be reasons not to do it.

my-two-cents,
Raymond
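To illustrate the point that negative index handling lives in the concrete class, here is a minimal sketch of a sequence that satisfies the Sequence ABC yet rejects negative indices. `ForwardOnly` is a hypothetical class written for this example; it is not taken from the issue or from CPython.

```python
from collections.abc import Sequence


class ForwardOnly(Sequence):
    """Hypothetical sequence that only accepts non-negative int indices."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("indices must be integers")
        if index < 0:
            raise IndexError("negative indices are not supported")
        # Out-of-range positive indices raise IndexError from the list.
        return self._items[index]


seq = ForwardOnly("abc")
print(isinstance(seq, Sequence))   # the ABC is satisfied
print(list(seq), "b" in seq)       # mixin methods work with forward indexing
print("{0[1]}".format(seq))        # str.format passes the int 1

try:
    seq[-1]
except IndexError as exc:
    print("seq[-1] failed:", exc)
```

The mixin methods inherited from the ABC (iteration, containment, `index`, `count`) only ever use non-negative indices, so the class is fully usable without negative index support.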
msg116244 - (view)	Author: Éric Araujo (eric.araujo) *	Date: 2010-09-12 23:02
Thank you for the detailed argument, Raymond. I’m +1 on turning this into a doc bug.
msg187404 - (view)	Author: Todd Rovito (Todd.Rovito) *	Date: 2013-04-20 03:06
Here is a simple patch that simply explains that negative indexes and negative slices are not supported, for the string format documentation. Perhaps more documentation needs to be created elsewhere to help explain why all collections do not need to support negative indexes and negative slices? If so, please let me know and I will create it. But I think this patch at least clarifies things for the use case of string formatting.
msg190102 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2013-05-26 17:34
Todd's patch strikes me as fine. If something more detailed is needed I think it would be better to raise a separate issue.
msg215958 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2014-04-12 02:40
Either leading sign, '+' or '-', causes string interpretation, so I think 'unsigned integer' should be the term in the doc.

>>> '{0[-1]}'.format({'-1': 'neg int key'})
'neg int key'
>>> '{0[+1]}'.format({'+1': 'neg int key'})
'neg int key'
>>> '{0[+1]}'.format([1,2,3])
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    '{0[+1]}'.format([1,2,3])
TypeError: list indices must be integers, not str
msg216038 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2014-04-13 23:43
The doc bug is that the grammar block uses 'integer' (linked to https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-integer) in

arg_name ::= [identifier | integer]
element_index ::= integer | index_string

when it should use 'decimalinteger' or even more exactly 'digit+'. The int() builtin uses the same relaxed rule when no base is given.

>>> 011
SyntaxError: invalid token
>>> int('011')
11
>>> '{[011]}'.format('abcdefghijlmn')
'm'

One possibility is to replace 'integer' in the grammar block with 'digit+' and perhaps leave the text alone. Another is to replace 'integer' with 'index_number', to go with 'index_string', and add the production "index_number ::= digit+". My thought for the latter is that 'index_number' would connect better with 'number' as used in the text. A further option would be to actually replace 'number' in the text with 'index_number'.

PS to Todd. As much as possible, doc content changes should be separated from re-formatting. I believe the first block of your patch is purely a re-format.
msg225505 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2014-08-18 19:50
msg216038 suggests three options for the doc patch, does anybody have any preference or a better alternative?
msg266481 - (view)	Author: Marco Buttu (marco.buttu) *	Date: 2016-05-27 06:45
The error message is misleading:

>>> s = '{names[-1]} loves {0[1]}'
>>> s.format(('C', 'Python'), names=('Dennis', 'Guido'))
Traceback (most recent call last):
  ...
TypeError: tuple indices must be integers or slices, not str
msg340877 - (view)	Author: Éric Araujo (eric.araujo) *	Date: 2019-04-26 02:52
A side question: where is it defined that in `{thing[0]}`, 0 will be parsed as an integer? The PEP shows `{thing[name]}` and mentions that this is not Python but a smaller mini-language, with `name` always a string, no quotes needed or permitted.
msg340884 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2019-04-26 06:56
I'm not sure where (or if) it's defined in the Python docs, but in PEP 3101 it's in https://www.python.org/dev/peps/pep-3101/#simple-and-compound-field-names: "It should be noted that the use of 'getitem' within a format string is much more limited than its conventional usage. In the above example, the string 'name' really is the literal string 'name', not a variable named 'name'. The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.".
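One way to observe what str.format actually hands to __getitem__ is a probe object that reports the type of each key. This is a sketch of observed CPython 3 behaviour, not a statement of the documented grammar; `KeyProbe` is a hypothetical class introduced for the example.

```python
class KeyProbe:
    """Hypothetical probe: reports the type and value of each key it receives."""

    def __getitem__(self, key):
        return f"{type(key).__name__}:{key!r}"


probe = KeyProbe()
for template in ("{0[1]}", "{0[011]}", "{0[-1]}", "{0[+1]}", "{0[name]}"):
    print(template, "->", template.format(probe))
```

Keys made up only of digits arrive as ints (including `011`, which arrives as 11), while signed values and names arrive as strings, which matches the examples given earlier in this thread.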
msg347795 - (view)	Author: Ilya Kamenshchikov (Ilya Kamenshchikov) *	Date: 2019-07-13 10:17
Py3.6+ f-strings support any indexing, as they actually evaluate Python expressions.

>>> a = ['Java', 'Python']
>>> var = f"Hello {a[-1]}"
>>> print(var)
Hello Python
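When the template has to stay a str.format string (for example, because it is loaded at runtime), two workarounds are possible without changing str.format. The sketch below shows both; `SignedIndexProxy` is a hypothetical wrapper written for this example, not an existing library class.

```python
import re


class SignedIndexProxy:
    """Hypothetical wrapper: turns keys like '-1' into ints before indexing."""

    _SIGNED_INT = re.compile(r"[+-]?[0-9]+")

    def __init__(self, sequence):
        self._sequence = sequence

    def __getitem__(self, key):
        # str.format passes signed indexes as strings, so convert them here.
        if isinstance(key, str) and self._SIGNED_INT.fullmatch(key):
            key = int(key)
        return self._sequence[key]


languages = ["Java", "Python"]

# Workaround 1: wrap the argument so that '-1' is converted to -1.
wrapped = "Hello {0[-1]}".format(SignedIndexProxy(languages))
# Workaround 2: do the indexing in Python and pass the result by name.
named = "Hello {last}".format(last=languages[-1])
print(wrapped)
print(named)
```

The second form is usually easier to read, which is in line with the readability concerns raised in this thread.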
msg407301 - (view)	Author: Irit Katriel (iritkatriel) *	Date: 2021-11-29 17:18
Reproduced on 3.11, and the error message is a little weirder now:

>>> "{0[-1]}".format('fox')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string indices must be integers, not 'str'
msg407328 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2021-11-29 22:01
I recommend not adding support for negative indexing to format() for accessing positional arguments. There is almost no reason to do this because it almost always makes the format string less readable, because the number of arguments is always known in advance, and because the arguments are almost always used entirely rather than selectively. Negative index support isn't a feature of the language. Instead, it is a feature provided on a class by class basis, if it makes sense for that class and for its use cases. We are not obliged to provide negative index support in places where it doesn't make sense or where it makes code less readable. For example, the islice() function doesn't support negative indices because it doesn't make sense there. Likewise, the Sequence ABC doesn't require negative index support or slice support.
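The islice() example can be seen directly. This is a small sketch; the variable names are illustrative only.

```python
from itertools import islice

letters = "abcdef"

# A non-negative stop works on any iterable.
first_two = list(islice(letters, 2))
print(first_two)

# A negative stop is rejected, because a general iterator has no known end.
try:
    islice(letters, -2)
    islice_error = None
except ValueError as exc:
    islice_error = exc
print("islice rejected the negative stop:", islice_error is not None)

# The str class, by contrast, chooses to support negative slices itself.
print(letters[-2:])
```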
msg407422 - (view)	Author: Eric V. Smith (eric.smith) *	Date: 2021-12-01 01:47
I'm closing this as "won't fix" for the negative indexing functionality. If someone wants to open a new documentation issue (and ideally provide a PR), that would be welcome.

History
Date	User	Action	Args
2022-04-11 14:56:57	admin	set	github: 52199
2021-12-01 01:47:31	eric.smith	set	status: open -> closed resolution: wont fix messages: + msg407422 stage: needs patch -> resolved
2021-11-29 22:01:36	rhettinger	set	messages: + msg407328
2021-11-29 17:18:19	iritkatriel	set	nosy: + iritkatriel messages: + msg407301 versions: + Python 3.11, - Python 3.4
2019-07-13 10:17:18	Ilya Kamenshchikov	set	nosy: + Ilya Kamenshchikov messages: + msg347795
2019-04-26 06:56:03	eric.smith	set	messages: + msg340884
2019-04-26 02:52:28	eric.araujo	set	messages: + msg340877
2018-03-27 21:43:24	serhiy.storchaka	link	issue33160 superseder
2016-05-28 21:14:38	BreamoreBoy	set	nosy: - BreamoreBoy
2016-05-27 06:45:13	marco.buttu	set	nosy: + marco.buttu messages: + msg266481
2014-08-18 19:50:11	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg225505
2014-04-13 23:43:30	terry.reedy	set	messages: + msg216038
2014-04-13 20:51:45	terry.reedy	set	assignee: docs@python -> terry.reedy
2014-04-12 14:06:01	eric.smith	set	assignee: eric.smith -> docs@python nosy: + docs@python
2014-04-12 02:40:49	terry.reedy	set	messages: + msg215958
2014-02-03 17:11:11	BreamoreBoy	set	nosy: - BreamoreBoy
2013-05-26 17:34:50	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg190102
2013-04-20 03:06:41	Todd.Rovito	set	files: + 7951NegativeIndexesForStringFormat3dot4.patch keywords: + patch messages: + msg187404 versions: + Python 3.4, - Python 3.2
2013-04-19 16:47:09	Todd.Rovito	set	nosy: + Todd.Rovito
2010-09-12 23:53:16	rhettinger	set	nosy: rhettinger, terry.reedy, mark.dickinson, eric.smith, kisielk, eric.araujo, mrabarnett, flox, gosella components: + Documentation, - Interpreter Core
2010-09-12 23:02:54	eric.araujo	set	messages: + msg116244
2010-09-09 23:55:05	rhettinger	set	messages: + msg115981
2010-09-09 19:30:26	flox	set	nosy: + flox
2010-08-11 20:01:35	mrabarnett	set	messages: + msg113624
2010-08-11 19:41:20	rhettinger	set	keywords: - patch, after moratorium messages: + msg113620 versions: + Python 3.2, - Python 3.3
2010-08-09 18:42:49	terry.reedy	set	nosy: + terry.reedy messages: + msg113447 versions: + Python 3.3, - Python 3.2
2010-06-25 18:48:22	gosella	set	files: + format_no_fields_with_negative_indexes-2.7.diff keywords: + patch messages: + msg108617
2010-06-23 18:40:52	kisielk	set	nosy: + kisielk messages: + msg108472
2010-06-18 21:47:18	eric.smith	set	keywords: + after moratorium, - patch messages: + msg108144
2010-06-18 21:38:22	rhettinger	set	nosy: + rhettinger messages: + msg108143
2010-06-18 19:55:15	gosella	set	messages: + msg108133
2010-06-18 19:52:53	gosella	set	files: + format_negative_indexes-3.2.diff
2010-06-18 19:50:39	gosella	set	files: + format_negative_indexes-2.7.diff keywords: + patch messages: + msg108132
2010-06-15 01:25:44	mrabarnett	set	messages: + msg107845
2010-06-14 21:20:28	eric.smith	set	messages: + msg107811
2010-06-14 17:02:58	mrabarnett	set	messages: + msg107801
2010-06-14 15:32:48	mark.dickinson	set	messages: + msg107793
2010-06-14 15:30:59	mrabarnett	set	messages: + msg107792
2010-06-14 11:23:34	eric.smith	set	messages: + msg107781
2010-06-14 10:57:40	mark.dickinson	set	messages: + msg107776
2010-06-13 23:50:42	eric.smith	set	stage: needs patch
2010-06-13 23:49:48	eric.smith	set	nosy: + mark.dickinson, eric.araujo, gosella messages: + msg107766
2010-06-13 23:48:57	eric.smith	link	issue8985 superseder
2010-06-13 23:47:16	eric.smith	set	versions: - Python 2.7
2010-02-19 01:43:30	mrabarnett	set	nosy: + mrabarnett messages: + msg99553
2010-02-17 23:54:17	eric.smith	create