msg99482 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2010-02-17 23:54 |
It surprised me that this doesn't work:
>>> "{0[-1]}".format('fox')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string indices must be integers
I was expecting it to be equivalent to:
>>> "{0[2]}".format('fox')
'x'
I don't think there's any particular reason this doesn't work. It would, however, break the following code:
>>> "{0[-1]}".format({'-1':'foo'})
'foo'
But note that this doesn't work currently:
>>> "{0[1]}".format({'1':'foo'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 1
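The four cases above can be collected into one small check. This is only a sketch of the behaviour reported in this message, run against Python 3; the helper name `raises` is made up for the illustration and is not part of any library.

```python
# Behaviour of str.format as reported above (Python 3).
assert "{0[2]}".format("fox") == "x"
assert "{0[-1]}".format({"-1": "foo"}) == "foo"


def raises(exc_type, template, *args):
    """Hypothetical helper: True if template.format(*args) raises exc_type."""
    try:
        template.format(*args)
    except exc_type:
        return True
    return False


# "-1" is passed to str.__getitem__ as the string "-1", hence the TypeError.
assert raises(TypeError, "{0[-1]}", "fox")
# "1" is passed to dict.__getitem__ as the integer 1, hence the KeyError.
assert raises(KeyError, "{0[1]}", {"1": "foo"})
```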
|
msg99553 - (view) |
Author: Matthew Barnett (mrabarnett) * |
Date: 2010-02-19 01:43 |
On a related note, this doesn't work either:
>>> "{-1}".format("x", "y", "z")
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
"{-1}".format("x", "y", "z")
KeyError: '-1'
It could return "z".
It also rejects a leading '+', but that would be optional anyway.
|
msg107766 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2010-06-13 23:49 |
Closed issue 8985 as a duplicate of this; merging nosy lists.
|
msg107776 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-06-14 10:57 |
I (reluctantly) agree it's surprising that "{0[-1]}".format(args) fails. And I suppose that if it were allowed then it would also make sense to consider "{-1}".format(*args) as well, in order to preserve the equivalence between "{n}".format(*args) and "{0[n]}".format(args). And then:
>>> "{-0}".format(*['calvin'], **{'-0': 'hobbes'})
'hobbes'
would presumably produce 'calvin' instead of 'hobbes'...
On '+': if "{0[-1]}" were allowed, I'm not sure whether the "+1" in "{0[+1]}".format(...) should also be interpreted as a list index. I don't really see the value of doing so apart from syntactic consistency: there are very few other places in Python that I'm aware of that accept -<one-or-more-digits> but not +<one-or-more-digits>.
FWIW, my overall feeling is that the current rules are simple and adequate, and there's no great need to add this complication.
I do wonder, though:
How complicated would it be to make "{0[1]}".format({'1':'foo'}) a bit magical? That is, have the format method pass an integer to __getitem__ if the corresponding format argument is a sequence, and a string argument if it's a mapping (not sure what the criterion would be for distinguishing). Is this too much magic? Is it feasible implementation-wise?
I don't think it's do-able for simple rather than compound field names: e.g., "{0}".format(*args, **kwargs), since there we've got both a sequence *and* a dict, so it's not clear whether to look at args[0] or kwargs['0']. (Unless either args or kwargs is empty, perhaps.) This is all getting a bit python-ideas'y, though.
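One way to picture the "magical" lookup for compound field names is the sketch below. It is purely illustrative: the function name `magic_getitem` is hypothetical, and using the `collections.abc.Sequence` ABC as the criterion for "is a sequence" is an assumption, since no criterion has been settled on here.

```python
import re
from collections.abc import Sequence

# Optional sign followed by one or more ASCII digits.
_SIGNED_INT = re.compile(r"[+-]?[0-9]+")


def magic_getitem(container, index_text):
    """Hypothetical lookup: integer key for sequences, string key otherwise."""
    if isinstance(container, Sequence) and _SIGNED_INT.fullmatch(index_text):
        return container[int(index_text)]
    return container[index_text]
```

Under these assumptions `magic_getitem({'1': 'foo'}, '1')` finds the string key, while `magic_getitem('fox', '-1')` indexes from the end. A dict with integer keys would stop working, which is part of the cost of the idea.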
BTW, I notice that PEP 3101's "Simple field names are either names or numbers [...] if names, they must be valid Python identifiers" isn't actually true:
>>> "{in-valid #identifier}".format(**{'in-valid #identifier': 42})
'42'
Though I don't have a problem with this; indeed, I think this is preferable to checking for a valid identifier.
|
msg107781 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2010-06-14 11:23 |
Addressing just the last part of Mark's message right now:
The PEP goes on to say:
Implementation note: The implementation of this proposal is
not required to enforce the rule about a simple or dotted name
being a valid Python identifier. ...
I rely on getattr lookup failing for dotted names, but for simple names there's no check at all. I agree it's desirable to leave this behavior.
|
msg107792 - (view) |
Author: Matthew Barnett (mrabarnett) * |
Date: 2010-06-14 15:30 |
Re: msg107776.
If it looks like an integer (i.e., can be converted to an integer by 'int') then it's positional, otherwise it's a key. An optimisation is to perform a quick check upfront to see whether it starts like an integer.
|
msg107793 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2010-06-14 15:32 |
Matthew:
would that include allowing whitespace, then?
>>> int('\t\n+56')
56
|
msg107801 - (view) |
Author: Matthew Barnett (mrabarnett) * |
Date: 2010-06-14 17:02 |
That's a good question. :-)
Possibly just an optional sign followed by one or more digits.
Another possibility that occurs to me is for it to default to positional if it looks like an integer, but allow quoting to force it to be a key:
>>> "{0}".format("foo", **{"0": "bar"})
'foo'
>>> "{'0'}".format("foo", **{"0": "bar"})
'bar'
Or is that taking it too far?
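The "optional sign followed by one or more digits" rule is stricter than what int() accepts. A minimal sketch of the difference, where the name `looks_like_index` is made up for the illustration:

```python
import re

# Optional sign followed by one or more ASCII digits, nothing else.
_INDEX = re.compile(r"[+-]?[0-9]+")


def looks_like_index(text):
    """Hypothetical check for the 'sign plus digits' rule."""
    return _INDEX.fullmatch(text) is not None


# int() is more permissive: it strips surrounding whitespace.
assert int("\t\n+56") == 56
assert not looks_like_index("\t\n+56")
assert looks_like_index("+56")
assert looks_like_index("-1")
```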
|
msg107811 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2010-06-14 21:20 |
I can see the point of allowing negative indices for a consistency point, but is there really any practical problem that's currently causing people hardship that this would solve?
As for the rest of it, I think it's just not worth the additional burden on CPython and other implementations.
|
msg107845 - (view) |
Author: Matthew Barnett (mrabarnett) * |
Date: 2010-06-15 01:25 |
Your original:
"{0[-1]}".format('fox')
is a worse gotcha than:
"{-1}".format('fox')
because you're much less likely to want to do the latter.
It's one of those things that it would be nice to have fixed, or we could just add a warning to the documentation that it _might_ be fixed in the future, so people shouldn't rely on the current behaviour. :-)
|
msg108132 - (view) |
Author: Germán L. Osella Massa (gosella) |
Date: 2010-06-18 19:50 |
I finally managed to get the time to finish the patch that allows negative indexes inside square brackets, so now they work with the same semantics as in a Python expression:
>>> '{0[-1]}'.format(['abc', 'def'])
'def'
>>> '{0[-2]}'.format(['abc', 'def'])
'abc'
>>> '{0[-1][-1]}'.format(['abc', ['def']])
'def'
They work with auto-numbered fields too:
>>> '{[-1]}'.format(['abc', 'def'])
'def'
Also, a positive sign is now accepted as part of a valid integer:
>>> '{0[+1]}'.format(['abc', 'def'])
'def'
As a bonus, negative indexes are also allowed to refer to positional arguments:
>>> '{-1}'.format('abc', 'def')
'def'
>>> '{-2}'.format('abc', 'def')
'abc'
I'm attaching a patch against trunk. I added some tests for this functionality in test_str.py.
By the way, this code doesn't work anymore:
>>> "{[-1]}".format({'-1': 'X'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: -1L
But now it behaves in the same way as:
>>> "{[1]}".format({'1': 'X'})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 1L
I didn't attempt to ignore whitespace when parsing the index as an integer (so that "{ 0 }" would be treated as "{0}" and "{ 0 [ 1 ] }" as "{0[1]}") because I'm not sure whether this behavior is desirable.
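For readers without the patch applied, the bracket semantics described above can be approximated in pure Python by subclassing string.Formatter. This is not the attached patch (which changes the C implementation), only a rough sketch: the class name `SignedIndexFormatter` and its parsing regexes are assumptions made for the illustration, and it handles only explicitly numbered or named fields.

```python
import re
from string import Formatter

# ".attr" or "[index]" accessors following the first part of a field name.
_ACCESSOR = re.compile(r"\.([^.\[]+)|\[([^\]]+)\]")
# Optional sign followed by one or more ASCII digits.
_SIGNED_INT = re.compile(r"[+-]?[0-9]+")


class SignedIndexFormatter(Formatter):
    """Hypothetical formatter that treats [-1] and [+1] as integer indexes."""

    def get_field(self, field_name, args, kwargs):
        # Everything before the first "." or "[" names the argument.
        head = re.match(r"[^.\[]*", field_name).group()
        key = int(head) if head.isdecimal() else head
        obj = self.get_value(key, args, kwargs)
        for attr, index in _ACCESSOR.findall(field_name[len(head):]):
            if attr:
                obj = getattr(obj, attr)
            elif _SIGNED_INT.fullmatch(index):
                obj = obj[int(index)]  # signed indexes become integers
            else:
                obj = obj[index]  # anything else stays a string key
        return obj, key


fmt = SignedIndexFormatter()
assert fmt.format("{0[-1]}", ["abc", "def"]) == "def"
assert fmt.format("{0[-1][-1]}", ["abc", ["def"]]) == "def"
assert fmt.format("{0[+1]}", ["abc", "def"]) == "def"
```

As with the patch, a dict keyed by the string '-1' is no longer reachable through "{0[-1]}" with this formatter, because the index is converted to the integer -1 first.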
|
msg108133 - (view) |
Author: Germán L. Osella Massa (gosella) |
Date: 2010-06-18 19:55 |
I forgot to mention that I also made a patch against py3k (was the same code).
|
msg108143 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-06-18 21:38 |
Perhaps this ought to be discussed on python-ideas or python-dev for a bit. It is not entirely clear that this is a GoodThingToDo(tm) nor is it clear that we want other Python implementations to have to invest the same effort.
The spirit of the language freeze suggests that we shouldn't add this unless we really need it. The goal was to let other implementations catch up, not to add to their list of incompatibilities.
|
msg108144 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2010-06-18 21:47 |
I agree with Raymond. I'm not convinced it allows you to write any code that you can't currently write, and I'm fairly sure it violates the moratorium. Implementing this would clearly put a burden on other implementations.
Marking as "after moratorium".
|
msg108472 - (view) |
Author: Kamil Kisiel (kisielk) |
Date: 2010-06-23 18:40 |
While I agree this functionality isn't strictly necessary, I think it makes sense from a semantic point of view. I ran into this issue today while writing some code and I simply expected the negative syntax to work, given that the format string syntax otherwise very closely resembles standard array and attribute access.
It would be nice to see this make it in eventually for consistency's sake.
|
msg108617 - (view) |
Author: Germán L. Osella Massa (gosella) |
Date: 2010-06-25 18:48 |
Well, using negative indexes for fields can be thought of as a new feature, with all the consequences mentioned before, BUT negative indexes for accessing elements of a sequence are, IMHO, something that anyone would expect to work. That's why at first I thought it was a bug and filed an issue about it.
The code that parses the fields and the indexes is the same, so when I changed it to accept negative indexes, it worked for both cases. I'm attaching a patch that checks whether a negative index is used in a field and reverts to the old behavior in that case, allowing negative indexes only for accessing sequences ("{-1}" will raise KeyError because it will be treated as '-1').
Perhaps in this way this issue could be partially fixed.
|
msg113447 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2010-08-09 18:42 |
I believe this is covered by the PEP3003 3.2 change moratorium.
|
msg113620 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-08-11 19:41 |
Fixing up str formatting idiosyncrasies does not fall under the moratorium and is helpful in getting 3.x to be usable.
That being said, I'm not convinced that this is actually a helpful feature. Not all objects supporting __getitem__ offer support for negative indexing. Also, there's a case to be made that using negative indices in a formatting string is an anti-pattern, causing more harm than good.
|
msg113624 - (view) |
Author: Matthew Barnett (mrabarnett) * |
Date: 2010-08-11 20:01 |
I agree with Kamil and Germán. I would've expected negative indexes for sequences to work. Negative indexes for fields is a different matter.
|
msg115981 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2010-09-09 23:55 |
After more thought, I'm -1 on this. "Consistency" is a weak argument in favor of this. We need to be more use-case driven, and there is no evidence that this is needed. Also, there is a reasonable concern that using negative indices in a format string would be a bad design pattern that should not be encouraged by the language. And, there is a maintenance burden (just getting it right in the first place; having other implementations try to get it right; and a burden to custom formatters to have to support negative indices).
I do think we have a documentation issue. This thread shows a number of experienced Python programmers who get "surprised" or perceive "consistency issues" perhaps because there isn't a clear mental picture of Python's layer structure (separation of concerns) and where the responsibility lies for supporting negative indices.
For the record, here are a few notes on where negative index handling fits into the hierarchy:
Negative index support is not guaranteed by the collections.Sequence ABC nor by the grammar (see the "subscript" rule in Grammar/Grammar). It does not appear in opcode handling (see BINARY_SUBSCR in Python/ceval.c) nor in the top abstract layer (see PyObject_GetItem() in abstract.c). Instead, the support for slicing and negative index handling appears at the concrete layer (see listindex() in Objects/listobject.c for example).
We do guarantee negative index handling for builtin sequences and their subclasses (as long as they don't override __getitem__), and we provide a fast path for their execution (via an intermediate abstract layer function, PySequence_GetItem() in Objects/abstract.c), but other sequence-like objects are free to make their own decisions about slices and negative indices at the concrete layer.
Knowing this, a person should not be "surprised" when one sequence has support for negative indices or slicing and another does not. The choice belongs to the implementer of the concrete class, not to the caller of "a[x]". There is no "consistency" issue here.
IOW, we're not required to implement negative slice handling and are free to decide whether it is a good idea or not for the use-case of string formatting. There is some question about whether it is a bad practice for people to use negative indices for string formatting. If so, that would be a reason not to do it. And if available, it would only work for builtin sequences, but not for sequence-like items in general. There is also a concern about placing a burden on other implementations of Python (to match what we do in CPython) and on people writing their own custom formatters (to mimic builtin formatters as closely as possible). If so, those would be reasons not to do it.
my-two-cents,
Raymond
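The point that negative index handling lives in the concrete class can be illustrated with a small sketch. The class `ForwardOnly` below is hypothetical, written only for this illustration: it is sequence-like, yet it chooses not to support negative indices, while the builtin list does.

```python
class ForwardOnly:
    """Hypothetical sequence-like class that rejects negative indexes."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # The concrete class decides what an index means.
        if not isinstance(index, int):
            raise TypeError("index must be an integer")
        if not 0 <= index < len(self._items):
            raise IndexError("index out of range")
        return self._items[index]


letters = ForwardOnly("abc")
assert letters[2] == "c"
assert list("abc")[-1] == "c"  # the builtin list opts in to negative indexes

try:
    letters[-1]
except IndexError:
    pass  # ForwardOnly opts out, and a[x] simply reports that
```

Both objects go through the same `a[x]` syntax; only the `__getitem__` implementation differs.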
|
msg116244 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2010-09-12 23:02 |
Thank you for the detailed argument, Raymond. I’m +1 on turning this into a doc bug.
|
msg187404 - (view) |
Author: Todd Rovito (Todd.Rovito) * |
Date: 2013-04-20 03:06 |
Here is a simple patch that explains, in the string format documentation, that negative indexes and negative slices are not supported. Perhaps more documentation needs to be created elsewhere to help explain why not all collections need to support negative indexes and negative slices? If so, please let me know and I will create it. But I think this patch at least clarifies the use case of string format.
|
msg190102 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2013-05-26 17:34 |
Todd's patch strikes me as fine. If something more detailed is needed I think it would be better to raise a separate issue.
|
msg215958 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2014-04-12 02:40 |
Either leading sign, '+' or '-', causes string interpretation, so I think 'unsigned integer' should be the term in the doc.
>>> '{0[-1]}'.format({'-1': 'neg int key'})
'neg int key'
>>> '{0[+1]}'.format({'+1': 'neg int key'})
'neg int key'
>>> '{0[+1]}'.format([1,2,3])
Traceback (most recent call last):
File "<pyshell#16>", line 1, in <module>
'{0[+1]}'.format([1,2,3])
TypeError: list indices must be integers, not str
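A quick way to see which kind of key str.format passes to `__getitem__` is a probe object that reports the key it receives. This is only a sketch; the names `Probe` and `key_seen` are made up for the illustration.

```python
class Probe:
    """Hypothetical helper: reports the key that str.format passes in."""

    def __getitem__(self, key):
        return f"{type(key).__name__}:{key!r}"


def key_seen(index_text):
    """Format a Probe through '{0[<index_text>]}' and return its report."""
    return ("{0[" + index_text + "]}").format(Probe())


assert key_seen("1") == "int:1"
assert key_seen("011") == "int:11"   # leading zeros are accepted
assert key_seen("-1") == "str:'-1'"  # a sign forces string interpretation
assert key_seen("+1") == "str:'+1'"
```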
|
msg216038 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2014-04-13 23:43 |
The doc bug is that the grammar block uses 'integer' (linked to https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-integer) in
arg_name ::= [identifier | integer]
element_index ::= integer | index_string
when it should use 'decimalinteger' or even more exactly 'digit+'. The int() builtin uses the same relaxed rule when no base is given.
>>> 011
SyntaxError: invalid token
>>> int('011')
11
>>> '{[011]}'.format('abcdefghijlmn')
'm'
One possibility is to replace 'integer' in the grammar block with 'digit+' and perhaps leave the text alone. Another is to replace 'integer' with 'index_number', to go with 'index_string', and add the production "index_number ::= digit+". My thought for the latter is that 'index_number' would connect better with 'number' as used in the text. A further option would be to actually replace 'number' in the text with 'index_number'.
PS to Todd. As much as possible, doc content changes should be separated from re-formatting. I believe the first block of your patch is purely a re-format.
|
msg225505 - (view) |
Author: Mark Lawrence (BreamoreBoy) * |
Date: 2014-08-18 19:50 |
msg216038 suggests three options for the doc patch. Does anybody have a preference or a better alternative?
|
msg266481 - (view) |
Author: Marco Buttu (marco.buttu) * |
Date: 2016-05-27 06:45 |
The error message is misleading:
>>> s = '{names[-1]} loves {0[1]}'
>>> s.format(('C', 'Python'), names=('Dennis', 'Guido'))
Traceback (most recent call last):
...
TypeError: tuple indices must be integers or slices, not str
|
msg340877 - (view) |
Author: Éric Araujo (eric.araujo) * |
Date: 2019-04-26 02:52 |
A side question: where is it defined that in `{thing[0]}`, 0 will be parsed as an integer?
The PEP shows `{thing[name]}` and mentions that this is not Python but a smaller mini-language, with `name` always a string, no quotes needed or permitted.
|
msg340884 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2019-04-26 06:56 |
I'm not sure where (or if) it's defined in the Python docs, but in PEP 3101 it's in https://www.python.org/dev/peps/pep-3101/#simple-and-compound-field-names: "It should be noted that the use of 'getitem' within a format string is much more limited than its conventional usage. In the above example, the string 'name' really is the literal string 'name', not a variable named 'name'. The rules for parsing an item key are very simple. If it starts with a digit, then it is treated as a number, otherwise it is used as a string.".
|
msg347795 - (view) |
Author: Ilya Kamenshchikov (Ilya Kamenshchikov) * |
Date: 2019-07-13 10:17 |
Py3.6+ f-strings support any indexing as they actually evaluate python expressions.
>>> a = ['Java', 'Python']
>>> f"Hello {a[-1]}"
'Hello Python'
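For code that has to stay with str.format, the same result is available by doing the indexing in ordinary Python before formatting. A minimal sketch (the variable names are only for illustration):

```python
a = ["Java", "Python"]

# f-strings evaluate a real Python expression inside the braces.
from_fstring = f"Hello {a[-1]}"

# With str.format, index first and pass the element itself.
from_format = "Hello {}".format(a[-1])
from_keyword = "Hello {last}".format(last=a[-1])

assert from_fstring == from_format == from_keyword == "Hello Python"
```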
|
msg407301 - (view) |
Author: Irit Katriel (iritkatriel) * |
Date: 2021-11-29 17:18 |
Reproduced on 3.11, and the error message is a little weirder now:
>>> "{0[-1]}".format('fox')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string indices must be integers, not 'str'
|
msg407328 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2021-11-29 22:01 |
I recommend not adding support for negative indexing to format() for accessing positional arguments. There is almost no reason to do this because it almost always makes the format string less readable, because the number of arguments is always known in advance, and because the arguments are almost always used entirely rather than selectively.
Negative index support isn't a feature of the language. Instead, it is a feature provided on a class by class basis, if it makes sense for that class and for its use cases.
We are not obliged to provide negative index support in places where it doesn't make sense or where it makes code less readable. For example, the islice() function doesn't support negative indices because it doesn't make sense there. Likewise, the Sequence ABC doesn't require negative index support or slice support.
|
msg407422 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2021-12-01 01:47 |
I'm closing this as "won't fix" for the negative indexing functionality. If someone wants to open a new documentation issue (and ideally provide a PR), that would be welcome.
|
|
Date | User | Action | Args |
2022-04-11 14:56:57 | admin | set | github: 52199 |
2021-12-01 01:47:31 | eric.smith | set | status: open -> closed resolution: wont fix messages:
+ msg407422
stage: needs patch -> resolved |
2021-11-29 22:01:36 | rhettinger | set | messages:
+ msg407328 |
2021-11-29 17:18:19 | iritkatriel | set | nosy:
+ iritkatriel
messages:
+ msg407301 versions:
+ Python 3.11, - Python 3.4 |
2019-07-13 10:17:18 | Ilya Kamenshchikov | set | nosy:
+ Ilya Kamenshchikov messages:
+ msg347795
|
2019-04-26 06:56:03 | eric.smith | set | messages:
+ msg340884 |
2019-04-26 02:52:28 | eric.araujo | set | messages:
+ msg340877 |
2018-03-27 21:43:24 | serhiy.storchaka | link | issue33160 superseder |
2016-05-28 21:14:38 | BreamoreBoy | set | nosy:
- BreamoreBoy
|
2016-05-27 06:45:13 | marco.buttu | set | nosy:
+ marco.buttu messages:
+ msg266481
|
2014-08-18 19:50:11 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages:
+ msg225505
|
2014-04-13 23:43:30 | terry.reedy | set | messages:
+ msg216038 |
2014-04-13 20:51:45 | terry.reedy | set | assignee: docs@python -> terry.reedy |
2014-04-12 14:06:01 | eric.smith | set | assignee: eric.smith -> docs@python
nosy:
+ docs@python |
2014-04-12 02:40:49 | terry.reedy | set | messages:
+ msg215958 |
2014-02-03 17:11:11 | BreamoreBoy | set | nosy:
- BreamoreBoy
|
2013-05-26 17:34:50 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages:
+ msg190102
|
2013-04-20 03:06:41 | Todd.Rovito | set | files:
+ 7951NegativeIndexesForStringFormat3dot4.patch keywords:
+ patch messages:
+ msg187404
versions:
+ Python 3.4, - Python 3.2 |
2013-04-19 16:47:09 | Todd.Rovito | set | nosy:
+ Todd.Rovito
|
2010-09-12 23:53:16 | rhettinger | set | nosy:
rhettinger, terry.reedy, mark.dickinson, eric.smith, kisielk, eric.araujo, mrabarnett, flox, gosella components:
+ Documentation, - Interpreter Core |
2010-09-12 23:02:54 | eric.araujo | set | messages:
+ msg116244 |
2010-09-09 23:55:05 | rhettinger | set | messages:
+ msg115981 |
2010-09-09 19:30:26 | flox | set | nosy:
+ flox
|
2010-08-11 20:01:35 | mrabarnett | set | messages:
+ msg113624 |
2010-08-11 19:41:20 | rhettinger | set | keywords:
- patch, after moratorium
messages:
+ msg113620 versions:
+ Python 3.2, - Python 3.3 |
2010-08-09 18:42:49 | terry.reedy | set | nosy:
+ terry.reedy
messages:
+ msg113447 versions:
+ Python 3.3, - Python 3.2 |
2010-06-25 18:48:22 | gosella | set | files:
+ format_no_fields_with_negative_indexes-2.7.diff keywords:
+ patch messages:
+ msg108617
|
2010-06-23 18:40:52 | kisielk | set | nosy:
+ kisielk messages:
+ msg108472
|
2010-06-18 21:47:18 | eric.smith | set | keywords:
+ after moratorium, - patch
messages:
+ msg108144 |
2010-06-18 21:38:22 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg108143
|
2010-06-18 19:55:15 | gosella | set | messages:
+ msg108133 |
2010-06-18 19:52:53 | gosella | set | files:
+ format_negative_indexes-3.2.diff |
2010-06-18 19:50:39 | gosella | set | files:
+ format_negative_indexes-2.7.diff keywords:
+ patch messages:
+ msg108132
|
2010-06-15 01:25:44 | mrabarnett | set | messages:
+ msg107845 |
2010-06-14 21:20:28 | eric.smith | set | messages:
+ msg107811 |
2010-06-14 17:02:58 | mrabarnett | set | messages:
+ msg107801 |
2010-06-14 15:32:48 | mark.dickinson | set | messages:
+ msg107793 |
2010-06-14 15:30:59 | mrabarnett | set | messages:
+ msg107792 |
2010-06-14 11:23:34 | eric.smith | set | messages:
+ msg107781 |
2010-06-14 10:57:40 | mark.dickinson | set | messages:
+ msg107776 |
2010-06-13 23:50:42 | eric.smith | set | stage: needs patch |
2010-06-13 23:49:48 | eric.smith | set | nosy:
+ mark.dickinson, eric.araujo, gosella messages:
+ msg107766
|
2010-06-13 23:48:57 | eric.smith | link | issue8985 superseder |
2010-06-13 23:47:16 | eric.smith | set | versions:
- Python 2.7 |
2010-02-19 01:43:30 | mrabarnett | set | nosy:
+ mrabarnett messages:
+ msg99553
|
2010-02-17 23:54:17 | eric.smith | create | |