Non-informative error message in index() and remove() functions #57558

akheron · 2011-11-05T20:49:37Z

BPO	13349
Nosy	@brettcannon, @rhettinger, @terryjreedy, @ncoghlan, @vstinner, @ezio-melotti, @merwok, @Julian, @akheron, @serhiy-storchaka, @DimitrisJim, @eamanu
PRs	bpo-13349: Fix error reporting for index and remove methods #876 bpo-13349: Improve the error message for tuple.index #1309 bpo-13349: repr object in tuple.index ValueError message #1820
Dependencies	bpo-7330: PyUnicode_FromFormat: implement width and precision for %s, %S, %R, %V, %U, %A
Files	issue-13349.patch issue13349.patch.1: First update to issue patch. issue13349.patch.2 issue13349.patch.3 issue13349.patch.4 issue13349.patch.6 issue13349.patch.7

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2017-05-27.21:57:54.212>
created_at = <Date 2011-11-05.20:49:37.320>
labels = ['extension-modules', 'interpreter-core', 'type-bug', '3.7']
title = 'Non-informative error message in index() and remove() functions'
updated_at = <Date 2017-05-28.15:33:09.523>
user = 'https://github.com/akheron'

bugs.python.org fields:

activity = <Date 2017-05-28.15:33:09.523>
actor = 'Jim Fasarakis-Hilliard'
assignee = 'none'
closed = True
closed_date = <Date 2017-05-27.21:57:54.212>
closer = 'rhettinger'
components = ['Extension Modules', 'Interpreter Core']
creation = <Date 2011-11-05.20:49:37.320>
creator = 'petri.lehtinen'
dependencies = ['7330']
files = ['27884', '27907', '27940', '27949', '27975', '28265', '28267']
hgrepos = []
issue_num = 13349
keywords = ['patch']
message_count = 46.0
messages = ['147109', '147215', '147237', '147238', '147431', '147546', '147795', '147824', '147834', '147836', '147837', '147853', '147859', '174666', '174716', '174744', '174759', '174820', '174821', '174958', '175134', '175265', '175266', '175319', '175325', '175328', '175489', '175491', '177198', '177203', '188598', '289856', '290020', '290492', '290742', '290745', '290747', '290748', '294552', '294563', '294603', '294609', '294613', '294627', '294631', '294649']
nosy_count = 13.0
nosy_names = ['brett.cannon', 'rhettinger', 'terry.reedy', 'ncoghlan', 'vstinner', 'ezio.melotti', 'eric.araujo', 'Julian', 'petri.lehtinen', 'serhiy.storchaka', 'Sean.Ochoa', 'Jim Fasarakis-Hilliard', 'eamanu']
pr_nums = ['876', '1309', '1820']
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue13349'
versions = ['Python 3.7']

akheron · 2011-11-05T20:49:37Z

For example:

>>> (1, 2, 3).index(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: tuple.index(x): x not in tuple

The "x not in tuple" error message should be replaced with "4 is not in tuple". list.index() already does this (see bpo-7252):

>>> [1, 2, 3].index(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 4 is not in list

Although in bpo-7252 it was claimed that no other place in stdlib has this error message, I found many occurences:

Modules/_collectionsmodule.c: PyErr_SetString(PyExc_ValueError, "deque.remove(x): x not in deque"
Modules/_elementtree.c: "list.remove(x): x not in list"
Modules/_elementtree.c: "list.remove(x): x not in list"
Modules/arraymodule.c: PyErr_SetString(PyExc_ValueError, "array.index(x): x not in list");
Modules/arraymodule.c: PyErr_SetString(PyExc_ValueError, "array.remove(x): x not in list");
Objects/abstract.c: "sequence.index(x): x not in sequence");
Objects/listobject.c: PyErr_SetString(PyExc_ValueError, "list.remove(x): x not in list");
Objects/tupleobject.c: PyErr_SetString(PyExc_ValueError, "tuple.index(x): x not in tuple");

There's also documentation and tests that depend on this actual error message:

Doc/library/doctest.rst: ValueError: list.remove(x): x not in list
Lib/test/test_xml_etree.py: ValueError: list.remove(x): x not in list

bpo-7252 was fixed in r76058, and it's quite a lot of code. I think it could be done more easily using PyUnicode_FromFormat() and the %R format.

akheron · 2011-11-07T10:08:52Z

The good thing about this is ease of debugging. You can see which is the offending value that was not found.

On the other hand, the repr of a value might be very long:

>>> [].index(list(range(1000)))
ValueError: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ...
(many lines of numbers)
997, 998, 999] is not in list

Also, all values don't have a very informal repr:

>>> class Foo: pass
...
>>> [].index(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: <__main__.Foo object at 0xb736f92c> is not in list

merwok · 2011-11-07T16:43:00Z

The good thing about this is ease of debugging.
Exactly! +1 for the idea.

On the other hand, the repr of a value might be very long:
You can restrict the length with % formats.

Also, all values don't have a very informal repr:
Not your problem. This change will still be much more useful than the current 'x'. Some reprs are very helpful, other ones give the ID so can be used in debugging, other ones are not helpful at all so authors will have to make them more helpful or debug their code otherwise.

merwok · 2011-11-07T16:49:00Z

There's also documentation and tests that depend on this actual error message:
Doc/library/doctest.rst: ValueError: list.remove(x): x not in list
Lib/test>/test_xml_etree.py: ValueError: list.remove(x): x not in list
That’s a well-known doctest problem. Just update the doc. Writing robust doctests is an art:

>>> str(someobject)
'output that can change'
>>> 'something I want' in str(someobject)  # more robust, but less useful if it fails
True

>>> something.index(spam)
traceback blah:
ValueError: output that can change
>>> try: something.index(spam)
... except ValueError: print('spam not in something')  # more robust, but ugly
spam not in something

ncoghlan · 2011-11-11T12:37:35Z

doctests by their very nature tend to overspecify things - that's why actual regression tests should be written with unittest, while doctest is kept for its originally intended purpose of testing examples included in docstrings and other documentation.

Still, there's also a reason why IGNORE_EXCEPTION_DETAIL is an available doctest option.

terryjreedy · 2011-11-13T03:11:10Z

Improving error messages has been a long, slow process as people are irritated enough to make a change. Please go ahead.

Guido has explicitly excluded exception detail from the language spec multiple times. Test that depend on details are broken. Doc examples are updated as such details change. (So if you do upgrade the message, change the tests and examples ;-).

akheron · 2011-11-17T12:11:29Z

Éric Araujo wrote:

> On the other hand, the repr of a value might be very long:
You can restrict the length with % formats.

Seems you can't with %R. I'd also like to somehow indicate that
the output is truncated.

terryjreedy · 2011-11-17T21:53:44Z

The test suite has code (functions) that restrict the length on AssertEqual (and other) failure messages. (I do not know the location though.) Perhaps that can be reused. This almost seems like something that should be more easily available.

akheron · 2011-11-18T07:26:15Z

I found safe_repr() from Lib/unittest/util.py. We would require a similar function, just implemented in C.

What is a good place to define such C helpers that could be used everywhere?

ezio-melotti · 2011-11-18T09:22:12Z

I found safe_repr() from Lib/unittest/util.py.

The functions in Lib/unittest/util.py shouldn't be used outside unittest.

We would require a similar function, just implemented in C.
What is a good place to define such C helpers that could be used everywhere?

You could start by adding it in the file where you need it. If it starts becoming useful elsewhere too, it can then be moved somewhere else. I would expect such function to be private, so we are free to move it whenever we need it.

akheron · 2011-11-18T09:26:41Z

Ezio Melotti wrote:

You could start by adding it in the file where you need it. If it
starts becoming useful elsewhere too, it can then be moved somewhere
else. I would expect such function to be private, so we are free to
move it whenever we need it.

Well, msg147109 already lists 6 files that would use this function.
Some of them are under Objects&, some are under Modules/.

ncoghlan · 2011-11-18T11:41:21Z

Please don't stress too much about providing an indication that the repr has been truncated - it's an error message, not part of the normal program output. Besides, the lack of a closing ')', ']', '}' or '>' will usually indicate something is amiss in long reprs.

More useful would be to raise a separate feature request about the lack of width and precision handling for string formatting in PyUnicode_FromFormatV - the common format code handling means it *accepts* the width and precision information for string codes, but then proceeds to completely ignore it. It should be using them to pad short strings (width) and truncate long ones (precision), just like printf() (only in terms of code points rather than bytes, since those codes work with Unicode text).

akheron · 2011-11-18T12:01:47Z

Nick Coghlan wrote:

Please don't stress too much about providing an indication that the
repr has been truncated - it's an error message, not part of the
normal program output. Besides, the lack of a closing ')', ']', '}'
or '>' will usually indicate something is amiss in long reprs.

Ok. This makes things easier.

More useful would be to raise a separate feature request about the
lack of width and precision handling for string formatting in
PyUnicode_FromFormatV - the common format code handling means it
*accepts* the width and precision information for string codes, but
then proceeds to completely ignore it. It should be using them to
pad short strings (width) and truncate long ones (precision), just
like printf() (only in terms of code points rather than bytes, since
those codes work with Unicode text).

Created bpo-13428.

SeanOchoa · 2012-11-03T18:41:57Z

Working on issue as part of Python Bug Day, Oct 2012.

SeanOchoa · 2012-11-03T22:37:04Z

It seems that this has been fixed. Using simple tests from msg147215:

~/hg/cpython/working $ env-py3.3/bin/python
Python 3.3.0 (default, Nov  3 2012, 15:28:29) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> [].index(list(range(1000)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: [0, 1, 2, 3, 4, 5, ... 995, 996, 997, 998, 999] is not in list
[60649 refs]
>>> class Foo: pass
... 
[60679 refs]
>>> [].index(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: <__main__.Foo object at 0x7f7fad31bc30> is not in list
[60677 refs]

terryjreedy · 2012-11-04T02:52:49Z

The original example was about tuples and displaying 'x' instead of '4'. Have that and *all* the others been fixed?

SeanOchoa · 2012-11-04T06:55:00Z

After discussing with folks on the #python-dev tonight, I learned that I was testing with a list and I should've been using a set. I'm working on a patch now, and I'm almost ready to have it reviewed.

SeanOchoa · 2012-11-04T17:29:22Z

Ready for review.

ezio-melotti · 2012-11-04T17:48:01Z

You should use assertRaises/assertRaisesRegex to check the error message and add more tests to cover all the changes. I also noticed an inconsistence between 'element not in container' and 'element is not in container' (note the extra 'is'). FWIW I prefer the version without 'is'.

SeanOchoa · 2012-11-06T07:45:16Z

Tests updated for coverage and to use assertRaisesRegex.

SeanOchoa · 2012-11-08T05:51:26Z

From Taggnostr on #python-dev:

1.) Use assertRaises+assertIn instead of assertRaisesRegex
2.) Add or change an existing test that you've already updated to use elements with a repr longer than 100 chars.

SeanOchoa · 2012-11-10T06:54:47Z

Truncating repr strings to 100chars will require the patch from bpo-7330. After applying the patch from that issue, my tests work fine.

SeanOchoa · 2012-11-10T07:29:36Z

Updated patch after taking into account Ezio's (aka Taggnostr on #python-dev) latest feedback.

SeanOchoa · 2012-11-11T05:01:55Z

Updates after feedback from Serhiy.

Julian · 2012-11-11T06:19:54Z

test_issuewhatever is a bit indescriptive.

It'd be easier for someone glancing at the test to see what it claims to be testing if there were a more descriptive name given, like perhaps test_exception_message (or something even more verbose).

serhiy-storchaka · 2012-11-11T07:58:31Z

Or test_index()/test_remove(). In this case positive test cases (and may be other exception types) should be added.

SeanOchoa · 2012-11-13T06:52:34Z

Lib/test/test_array.py
-- Moved index test (for this issue) to test_index (existing test method).
-- Added remove test (for this issue) to test_remove (existing test method)

Lib/test/test_deque.py
-- Moved remove test (for this issue) to test_index (existing test method).

Lib/test/test_list.py
-- Added test_remove instead of test_issue13349, and test_index as a new test method; both with positive tests and a negative test to for this issue.

Lib/test/test_tuple.py
-- Added test_remove instead of test_issue13349 as a new test method with a positive test and a negative test case for this issue to cover both items longer than 100chars and shorter items.

Lib/test/test_xml_etree.py
-- Added test_remove as a new test method to TreeBasicOperationTest testcase class instead of test_issue13349, with a positive test case and the negative test to cover this issue.

serhiy-storchaka · 2012-11-13T08:35:10Z

See also my comments to previous patch about repr() and error message checking.

SeanOchoa · 2012-12-09T06:30:46Z

Update based on Taggnostr's (Ezio on IRC, if I recall correctly) feedback from 11/12/2012 around 11:57 PST.

Added check for index result in positive tests.
Added assertIn check for remove result in positive tests.
Removed extra whitespace.
Formatted comments to be more concise.

SeanOchoa · 2012-12-09T07:16:19Z

Fixed issue with test name in Lib/test/test_tuple.py
Fixed issue with test_remove in Lib/test/test_list.py to assertNotIn instead of assertIn for positive case.

Confirmed with Ezio that issue bpo-7330 will need to be fixed/approved before this issue can be closed.

vstinner · 2013-05-06T21:36:30Z

Confirmed with Ezio that issue bpo-7330 will need to be fixed/approved before this issue can be closed.

FYI I just fixed this issue. PyUnicode_FromFormat() now supports width and precision in the format string.

DimitrisJim · 2017-03-19T15:31:33Z

@sean Ochoa, do you want to make this into a PR? The only tweak I would suggest would be to change all error messages to either be:

"object.method(repr(x)): element not in object"

or:

"repr(x) not in object"

also, this probably needs to be changed to version 3.7 now.

eamanu · 2017-03-23T02:44:13Z

I agree with Jim Fasarakis-Hilliard this message may be change in new 3.7 version

DimitrisJim · 2017-03-25T17:40:29Z

Additional instances of this:

function indexOf of operator.py.
function _PySequence_IterSearch of abstract.c

serhiy-storchaka · 2017-03-28T16:54:33Z

>>> [1, 2, 3].index("'")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: ""'"" not in list

Aren't too much quotes?

DimitrisJim · 2017-03-28T17:06:49Z

Yes, that does look like too much. My rationale for adding quotes around the value was in order to make it more clear in cases where the repr exceeds 100 characters.

Instead of:

   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 2 not in list

Have it clearly distinguished by using "":

   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 2" not in list

I'm not sure if there's a trivial way to not display so many quotes in the case you supplied, you're better suited to decide if this can be done somehow.

DimitrisJim · 2017-03-28T17:14:09Z

Could use %.100S instead of %.100R but I'm not sure of the downsides that might entail.

serhiy-storchaka · 2017-03-28T17:18:45Z

Error messages should use repr(). For example str() of bytes object emits a warning or error.

See bpo-26090 for the issue with truncated reprs.

rhettinger · 2017-05-26T15:16:44Z

See http://bugs.python.org/issue30477 for why I think this shouldn't be done. In short, the IndexError may be part of control flow (looping over successive matches) rather than an error (this patch adds a time/space cost with no benefit in those cases). The repr itself may be large resulting in an unusable error message. The repr may be expensive to compute for some objects.

brettcannon · 2017-05-26T18:15:14Z

A potential compromise would be if ValueError gained a 'value' attribute. That would allow for the exception message to be static but still provide the information for introspection later if desired to figure out exactly what object led to the cause of the ValueError. If I remember correctly previous objections to this idea was worry about memory, though, so it seems attaching more information about the trigger of a ValueError will inevitably lead to more cost somehow.

rhettinger · 2017-05-27T17:33:12Z

Thinking back over my 17 years of using Python, I really don't think this is needed at all. I can't think of a single situation where it would have helped.

In some cases, the change would be detrimental to readability, introducing clutter into an otherwise clear error message. The OP's example of "ValueError: 4 is not in list" is atypical. A more typical example would be "ValueError: <main.TensorElementPlaceholder object at 0x106489e48> is not in list".

As a Python instructor, I've become acutely aware that tracebacks already have usability issues due to visual clutter and an unpleasant overall appearance. IMO, this proposal would sometimes make it worse by adding more noise for the reader to filter out.

Brett's idea is an improvement to the original proposal. By attaching the object to the exception rather than squeezing its representation into the string, you avoid making the error message hard to read and avoid the speed/space costs from computing the __repr__ on arbitrary objects. That said, I don't think there is any programmatic benefit to having the object in the exception. When you call "a.remove(b)", you already have "b" and it would be pointless to re-extract it with "except ValueError as e: b = e.args[-1]".

I'm also concerned that the proposed changes sweep across the core affecting code that has been deployed and stable for almost two decades.

Given the dubious benefits and potential detriments, I vote for declining this proposal and favoring the stable deployed code that has worked fine for many years.

brettcannon · 2017-05-27T19:51:00Z

I'm fine with not changing this, but I did want to say that I don't think you're guaranteed to have access to the object that triggered the ValueError. If the value that is searched for is in some extension module where it's calculated using C code then you have no way of accessing that value unless the extension module was nice enough to provide the object in the exception, give the repr in the exception message, or document how exactly the transformation from input to what you searched for is.

rhettinger · 2017-05-27T21:57:54Z

I'm fine with not changing this

Thanks Brett. I'm going to mark this as closed/rejected. After five years, it is time to put it to rest and let it stop consuming our mental clock cycles.

I don't think you're guaranteed to have access to the object
that triggered the ValueError.

I'm not sure I follow what you're saying. To call index() or remove() you already have to have the object to search for. The ValueError is raised when the searched for object is not found, not by some unknown object inside the container.

ncoghlan · 2017-05-28T04:58:55Z

While I agree it's reasonable to keep arbitrary value repr's out of these messages by default, I do think it would be desirable for someone sufficiently motivated to file a separate RFE about adding the value as a structured exception attribute so that custom sys.excepthook implementations and other "unhandled runtime exception" loggers may choose to handle the situation differently.

This wouldn't be about the cases where code is handling the remove() or index() call directly, but rather about those where third party library code isn't handling missing values correctly, so the debugging scenario a developer is faced with looks more like this than it does the examples in the OP here:

>>> some_library_function()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in some_library_function
  File "<stdin>", line 2, in some_other_library_function
ValueError: list.remove(x): x not in list

Getting that kind of traceback in the logs for a production failure in a distributed system is all kinds of frustrating since you *know* the interpreter knew what "x" was when the error happened, it just isn't passing that information along to the developer that's trying to figure out how and why it broke.

serhiy-storchaka · 2017-05-28T08:21:48Z

I concur with Raymond.

But shouldn't we change error messages that contains the repr of not found value? Affected collections are list, deque and range.

>>> [].index('spam')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'spam' is not in list
>>> import collections
>>> collections.deque().index('spam')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'spam' is not in deque

range.index() raises different error messages depending on the type of the value:

>>> range(10).index('spam')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: sequence.index(x): x not in sequence
>>> range(10).index(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 42 is not in range

DimitrisJim · 2017-05-28T15:33:09Z

Thanks, I understand why this isn't the best idea now.

shouldn't we change error messages that contains the repr of not found value?

That is what I was thinking too, apart from removing the repr from other messages it will also unify reporting for all these methods.

akheron added extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Nov 5, 2011

merwok changed the title ~~Uninformal error message in index() and remove() functions~~ Non-informative error message in index() and remove() functions Nov 3, 2012

DimitrisJim mannequin added the 3.7 (EOL) end of life label Mar 28, 2017

rhettinger closed this as completed May 27, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022

serhiy-storchaka mentioned this issue May 5, 2022

Improve error message for list.remove #92305

Closed

gpshead mentioned this issue Feb 16, 2023

object.c inconsistently truncates tp_name in exception error messages, hampering debugging #101964

Open

Non-informative error message in index() and remove() functions #57558

Non-informative error message in index() and remove() functions #57558

Comments

akheron commented Nov 5, 2011

akheron commented Nov 5, 2011

akheron commented Nov 7, 2011

merwok commented Nov 7, 2011

merwok commented Nov 7, 2011

ncoghlan commented Nov 11, 2011

terryjreedy commented Nov 13, 2011

akheron commented Nov 17, 2011

terryjreedy commented Nov 17, 2011

akheron commented Nov 18, 2011

ezio-melotti commented Nov 18, 2011

akheron commented Nov 18, 2011

ncoghlan commented Nov 18, 2011

akheron commented Nov 18, 2011

SeanOchoa mannequin commented Nov 3, 2012

SeanOchoa mannequin commented Nov 3, 2012

terryjreedy commented Nov 4, 2012

SeanOchoa mannequin commented Nov 4, 2012

SeanOchoa mannequin commented Nov 4, 2012

ezio-melotti commented Nov 4, 2012

SeanOchoa mannequin commented Nov 6, 2012

SeanOchoa mannequin commented Nov 8, 2012

SeanOchoa mannequin commented Nov 10, 2012

SeanOchoa mannequin commented Nov 10, 2012

SeanOchoa mannequin commented Nov 11, 2012

Julian mannequin commented Nov 11, 2012

serhiy-storchaka commented Nov 11, 2012

SeanOchoa mannequin commented Nov 13, 2012

serhiy-storchaka commented Nov 13, 2012

SeanOchoa mannequin commented Dec 9, 2012

SeanOchoa mannequin commented Dec 9, 2012

vstinner commented May 6, 2013

DimitrisJim mannequin commented Mar 19, 2017

eamanu mannequin commented Mar 23, 2017

DimitrisJim mannequin commented Mar 25, 2017

serhiy-storchaka commented Mar 28, 2017

DimitrisJim mannequin commented Mar 28, 2017

DimitrisJim mannequin commented Mar 28, 2017

serhiy-storchaka commented Mar 28, 2017

rhettinger commented May 26, 2017

brettcannon commented May 26, 2017

rhettinger commented May 27, 2017

brettcannon commented May 27, 2017

rhettinger commented May 27, 2017

ncoghlan commented May 28, 2017

serhiy-storchaka commented May 28, 2017

DimitrisJim mannequin commented May 28, 2017