classification
Title: String format() has problems parsing numeric indexes
Type: behavior
Stage:
Components: Library (Lib), Unicode
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Should str.format allow negative indexes when used for __getitem__ access? (issue 7951)
Assigned To:
Nosy List: eric.araujo, eric.smith, gosella, mark.dickinson
Priority: normal
Keywords:

Created on 2010-06-12 20:13 by gosella, last changed 2010-09-12 23:02 by eric.araujo. This issue is now closed.

Messages (7)
msg107691 - Author: Germán L. Osella Massa (gosella) Date: 2010-06-12 20:13
The str.format() method allows index lookup on an object that supports __getitem__(). However, negative indexes are not supported.

Examples (using Python 2.6.5):

>>> "{0[0]}".format([0, 1, 2])
'0'

>>> "{0[-1]}".format([0, 1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str

>>> u"{0[0]}".format([0, 1, 2])
u'0'
>>> u"{0[-1]}".format([0, 1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not unicode

Also notice that spaces matter:

>>> "{0[ 0 ]}".format([0, 1, 2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str

(The same thing happens on Python 3.1.2)

The problem is that the function get_integer() in Objects/stringlib/string_format.h doesn't expect spaces or a '-' character, only digits. If the index is not a contiguous sequence of digits, it is assumed to be a dict key and is treated as a string, which is the cause of the TypeError exception.

This code is the same from 2.6.5 up to trunk.

get_integer() is not very robust at parsing numbers. I'm not familiar with CPython, but perhaps the code used by int(str) could be reused here to take advantage of the better parsing that int() has.
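The rule described above can be approximated in pure Python. This is only a sketch of the observable behaviour, not a translation of the C source: the names parse_subscript and lookup are hypothetical, and it assumes that only a non-empty run of ASCII digits counts as an integer index.

```python
# Hypothetical helpers that mirror the rule described in this message.
# They are not part of CPython and do not reproduce get_integer() exactly.
ASCII_DIGITS = "0123456789"


def parse_subscript(text):
    """Return int(text) if text is a non-empty run of ASCII digits, else text itself."""
    if text and all(ch in ASCII_DIGITS for ch in text):
        return int(text)
    return text


def lookup(obj, text):
    """Emulate the item access done for the bracketed part of a replacement field."""
    return obj[parse_subscript(text)]


print(lookup([0, 1, 2], "0"))        # "0" is all digits, so it is used as an integer index
print(lookup({"-1": "bar"}, "-1"))   # "-1" is not all digits, so it is used as a string key

try:
    lookup([0, 1, 2], "-1")          # equivalent to [0, 1, 2]["-1"]
except TypeError as exc:
    print("TypeError:", exc)
```

Under this rule, anything containing a sign or whitespace falls through to a string lookup, which works for mappings and fails for sequences.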
msg107693 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-06-12 20:26
The behaviour is by design (I think), though perhaps the error messages could be improved.

Are you asking for negative indices and extra space to be accepted?  In that case this should be a feature request rather than a bug report.
msg107699 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-12 20:46
FYI, the version field is used to note versions where the bug will be fixed, not versions where it’s found. New features and bug fixes go to the dev branch (3.2), security and documentation fixes go to the stable branches (2.6 and 3.1). (2.7 is in release candidate, so it’s frozen.)
msg107705 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-06-12 21:23
get_integer uses the narrowest possible definition for integer indexes, in order to pass all other strings to mappings.

>>> '{0[ 0 ]} {0[-1]}'.format({' 0 ': 'foo', '-1': 'bar'})
'foo bar'

Remember, it has to guess what type of lookup to do based on whether the value inside [] looks like an integer or not.

From the PEP:
    Because keys are not quote-delimited, it is not possible to
    specify arbitrary dictionary keys (e.g., the strings "10" or
    ":-]") from within a format string.

I don't believe this restriction causes any practical problem.

I'm not sure the error could be improved. The code that's being called is essentially:

>>> [0, 1, 2]['-1']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers, not str
msg107739 - Author: Germán L. Osella Massa (gosella) Date: 2010-06-13 17:07
I now see the rationale behind not accepting ' 10 ' == 10. But what about not accepting '-1' == -1?

I think it is odd that negative numbers are not accepted as valid indexes. I'd expect that something like

"First element is {0[0]} and last element is {0[-1]}".format([0,1,2,3])

would work, but it doesn't.
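As a workaround, the indexing can be done in ordinary Python code and the results passed to format() as separate arguments. A minimal sketch (the variable and keyword names are only illustrative):

```python
items = [0, 1, 2, 3]

# Index in Python code, then pass the results as separate positional arguments.
message = "First element is {0} and last element is {1}".format(items[0], items[-1])
print(message)

# Keyword arguments keep the template readable.
message_kw = "First element is {first} and last element is {last}".format(
    first=items[0], last=items[-1]
)
print(message_kw)
```

This avoids the subscript parsing inside the format string entirely, at the cost of moving the lookup out of the template.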

Should I create a new issue requesting this as a new feature? Or could this be reconsidered as a bug?

I could provide a simple patch that changes get_integer() so that it accepts negative integers as valid numbers.
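To make the proposal concrete, here is a rough Python sketch of such a relaxed rule. It is not the patch itself: parse_subscript_signed is a hypothetical name, and the sketch assumes that an optional leading '-' followed by ASCII digits would be read as an integer.

```python
def parse_subscript_signed(text):
    """Hypothetical relaxed rule: optional leading '-' plus ASCII digits is an int."""
    digits = text[1:] if text.startswith("-") else text
    if digits and all(ch in "0123456789" for ch in digits):
        return int(text)
    return text


items = [0, 1, 2, 3]
print(items[parse_subscript_signed("-1")])   # negative index now reaches the last element

# Trade-off: a mapping keyed by the string "-1" would no longer be reachable,
# because the lookup would use the integer -1 instead.
mapping = {"-1": "bar"}
try:
    mapping[parse_subscript_signed("-1")]
except KeyError as exc:
    print("KeyError:", exc)
```

The second half illustrates the compatibility concern raised earlier in this thread: widening what counts as an integer changes which keys can be used with mappings.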

(This is the first issue I have reported and English is not my native language, so I apologize for any mistakes I have made.)
msg107765 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-06-13 23:48
I'll consider this a duplicate. Issue 7951 is the existing feature request for this issue. I'll merge the nosy lists.
msg116243 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-12 23:02
Just for the record, I said something inexact in my previous message in this thread.  New features go to the dev branch (py3k, future 3.2), bug and doc fixes go into py3k and the stable branches (2.7 and 3.1 now), and the previous stable releases (2.5 and 2.6) only get security fixes.
History
Date                 User            Action  Args
2010-09-12 23:02:14  eric.araujo     set     messages: + msg116243
2010-06-13 23:48:57  eric.smith      set     resolution: rejected -> duplicate; superseder: Should str.format allow negative indexes when used for __getitem__ access?; messages: + msg107765
2010-06-13 17:07:15  gosella         set     messages: + msg107739
2010-06-12 21:23:18  eric.smith      set     status: open -> closed; resolution: rejected; messages: + msg107705
2010-06-12 20:46:04  eric.araujo     set     nosy: + eric.araujo; messages: + msg107699
2010-06-12 20:26:30  mark.dickinson  set     nosy: + mark.dickinson, eric.smith; messages: + msg107693
2010-06-12 20:13:59  gosella         create