
classification
Title: Option for comparing values instead of reprs in doctest
Type: enhancement
Stage: resolved
Components: Library (Lib), Tests
Versions: Python 3.7

process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: Tomáš Petříček, jbakker, r.david.murray, rhettinger, serhiy.storchaka, tim.peters
Priority: normal
Keywords:

Created on 2017-11-15 21:39 by Tomáš Petříček, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (13)
msg306311 - Author: Tomáš Petříček (Tomáš Petříček) Date: 2017-11-15 21:39
Would there be interest in adding an option to the doctest module to compare values instead of reprs? The idea is that, with this option enabled, the expected value would be eval'ed prior to comparison.

Motivation for this option is to enable comparison of values (states) as defined in the respective classes and not by a representation which might vary ({'a', 'b'} vs {'b', 'a'}, 'value' vs "value" etc.).
msg306318 - Author: Tomáš Petříček (Tomáš Petříček) Date: 2017-11-15 23:18
Related to https://bugs.python.org/issue3332
msg306319 - Author: Tomáš Petříček (Tomáš Petříček) Date: 2017-11-15 23:31
This option can be seen as a more general case of the options already available which lift the requirement of exact representation match (True for 1, normalize whitespace etc.). It would enable easier testing of relevant behavior instead of repr's artifacts.

An implementation draft is at https://github.com/tpet/cpython/commit/e59cc2d2c854f5995c36a60410eca0e893a7e269

As the expected value has to be reconstructed from the string representation anyway, it seems reasonable to do that for both values (expected and got). Only minor modifications seem to be required in that case.
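For readers who want to experiment without patching CPython, the idea can be approximated with a custom output checker. This is only a sketch under stated assumptions, not the linked draft: the flag name `COMPARE_VALUES` and the class `ValueComparingChecker` are hypothetical, and `ast.literal_eval` is swapped in for `eval` so that no arbitrary code is executed. As a consequence, only literals are handled, and an expected value written as `dict(foo=1, bar=2)` would still fail.

```python
import ast
import doctest

# Hypothetical flag name, registered locally; it is not part of the stdlib.
COMPARE_VALUES = doctest.register_optionflag("COMPARE_VALUES")


class ValueComparingChecker(doctest.OutputChecker):
    """Fall back to comparing evaluated literals when the text differs."""

    def check_output(self, want, got, optionflags):
        # The standard textual comparison always runs first.
        if super().check_output(want, got, optionflags):
            return True
        if not (optionflags & COMPARE_VALUES):
            return False
        try:
            # literal_eval accepts only literals, unlike the eval() in the proposal.
            return ast.literal_eval(want) == ast.literal_eval(got)
        except (ValueError, SyntaxError, TypeError):
            return False


checker = ValueComparingChecker()
want = "{'bar': 2, 'foo': 1}\n"   # what the docstring shows
got = "{'foo': 1, 'bar': 2}\n"    # what the interpreter printed

same_value = checker.check_output(want, got, COMPARE_VALUES)
without_flag = checker.check_output(want, got, 0)
quotes_differ = checker.check_output('"foo"\n', "'foo'\n", COMPARE_VALUES)
```

Such a checker can be passed to `doctest.DocTestRunner(checker=...)`, which keeps the experiment outside the standard library.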
msg306320 - Author: Tomáš Petříček (Tomáš Petříček) Date: 2017-11-15 23:34
The following tests then succeed:

def str_fun():
    """
    >>> str_fun()
    'foo'
    >>> str_fun()
    "foo"
    >>> str_fun()
    '''foo'''
    """
    return 'foo'


def dict_fun():
    """
    >>> dict_fun()
    {'foo': 1, 'bar': 2}
    >>> dict_fun()
    {'bar': 2, 'foo': 1}
    >>> dict_fun()
    dict(foo=1, bar=2)
    """
    return {'foo': 1, 'bar': 2}


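if __name__ == '__main__':
    import doctest
    # ACCEPT_EQUAL_VALUES comes from the draft patch linked above; it is not in the stock doctest module.
    doctest.testmod(verbose=True, optionflags=doctest.ACCEPT_EQUAL_VALUES)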
msg306322 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-11-16 00:13
I think it is an interesting idea.  Let's see what other people think.
msg306503 - Author: Jesse Bakker (jbakker) * Date: 2017-11-19 16:04
I think this would allow for inconsistency in docs (if implemented as suggested), as when actually running the code in the docs, one would get different results than suggested by the docs.

Maybe there is some other way (with different docs syntax) that would work well. I cannot think of anything off the top of my head, but maybe someone more creative can?
msg306504 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-11-19 17:01
I recommend not going down this path.  The intended purpose of doctest is to test examples in documentation.  In particular, those examples should match what a user would *see* when running the examples.  In essence, the proposal is to allow tests to pass even when the examples *don't* match what the user sees.

ISTM, the str_fun() example *should* fail.  It does not show *any* real interactive prompt session that can be reproduced by a user or anything that a user would ever see.  IMO, that would be a documentation anti-pattern.
msg306506 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-19 18:25
I concur with Raymond. Doctest should test the representation, not value.

But I think it would be nice to support insignificant variations of the representation. Tracebacks are already treated specially, and various doctest options allow ignoring particular details. Of course, ignoring the whole content of the dictionary would not be very useful.

   >>> dict_fun() # doctest: +ELLIPSIS
   {...}

But maybe some option could accept certain permutations in the output. E.g.

   >>> dict_fun() # doctest: +PERMUTATION
   {<'foo': 1>, <'bar': 2>}

should accept both "{'foo': 1, 'bar': 2}" and "{'bar': 2, 'foo': 1}".
msg306510 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-19 18:58
`doctest` is intended to be anal - there are few things more pointlessly confusing for a user than to see docs that don't match what they actually see when they run the doc's examples.  "Is it a bug?  Did I do it wrong?  Why can't they document what it actually does?! ..."

Things like +ELLIPSIS are intended for cases where the output is _known_ to vary across platforms or runs in ways that can't otherwise be easily hidden (like output that embeds the `id()` of an object), or where only a relatively tiny bit of enormous output is actually interesting.
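The `id()` case can be illustrated directly with `doctest.OutputChecker`, the class doctest uses to compare expected and actual output. This is a minimal sketch that assumes CPython's default `object` repr, which embeds a memory address.

```python
import doctest

checker = doctest.OutputChecker()

# The expected text as it might appear in a docstring, with the address elided.
want = "<object object at 0x...>\n"
# The actual repr embeds an address that changes from run to run (CPython behaviour).
got = repr(object()) + "\n"

# check_output(want, got, optionflags) returns True when the example passes.
exact_match = checker.check_output(want, got, 0)
ellipsis_match = checker.check_output(want, got, doctest.ELLIPSIS)
```

Without the flag the literal `...` never equals a real address, so the example can only pass with `+ELLIPSIS` enabled.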

When someone wants unittest's `assertEqual()`, they should use unittest ;-)  

Although that functionality is already easily handled; for example, here's the OP's first example rewritten to be independent of the dict's representation ordering:

>>> dict_fun() == {'foo': 1, 'bar': 2}
True

Now it's testing what you want to test:  that the results of the expressions on both sides of `==` compare equal.  And this is, to me, clearer on the face of it than introducing a new flag.
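A self-contained way to try this approach is sketched below. The `dict_fun` body is taken from the earlier message; the `sorted(...)` variant is an additional, commonly used alternative that is not mentioned in this thread, and the test name `dict_fun_examples` is arbitrary.

```python
import doctest


def dict_fun():
    return {'foo': 1, 'bar': 2}


# Both examples are independent of the dict's display order.
DOCSTRING = """
>>> dict_fun() == {'bar': 2, 'foo': 1}
True
>>> sorted(dict_fun().items())
[('bar', 2), ('foo', 1)]
"""

parser = doctest.DocTestParser()
# get_doctest(string, globs, name, filename, lineno)
test = parser.get_doctest(DOCSTRING, {'dict_fun': dict_fun}, 'dict_fun_examples', None, 0)

runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)  # a TestResults tuple with .failed and .attempted
```

The sorted form keeps the actual values visible in the failure report, at the cost of showing something slightly different from the raw return value.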
msg306515 - Author: Tomáš Petříček (Tomáš Petříček) Date: 2017-11-19 20:27
I find the idea of combining documentation with examples and unit testing appealing.
I see that this was not the original purpose of doctest, but it seems to me a reasonable use case for it.

>>> dict_fun() == {'foo': 1, 'bar': 2}
True
Testing equality with a single expression has the drawback that one cannot see what was wrong, i.e., what the actual value was.
The result of such a test when it fails is very uninformative.

I am not sure that I know any Python developer who would be confused by "string" matching 'string', or {'a': 1, 'b': 2} matching dict(a=1, b=2).
Why is True matching 1 less confusing than "abc" matching 'abc'?

"there are few things more pointlessly confusing for a user than to see docs that don't match what they actually see when they run the doc's examples"
This is a bit tricky because what the user actually sees very much depends on which console is used to run these examples, e.g., it varies between python and ipython, python and python3, etc.
So the users will be confused by these variants anyway.
Is it actually defined for basic types like str, dict or set what the repr should look like (besides that it should be possible to "eval" the expression to get the value)?
msg306520 - Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-11-20 00:50
Tomáš, thank you for the suggestion, but we're going to decline for the reasons mentioned elsewhere in this thread.

That said, it would be perfectly reasonable to post your own variant or extension on PyPI ( http://pypi.python.org ) to test the waters, to let the idea mature, and to see whether there is any uptake by the community.
msg306522 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-20 02:02
Tomáš, of course you can combine testing methods any way you like.  Don't oversell this - there's nothing actually magical about comparing objects instead of strings ;-)

I'm only -0 on this.  It grates a bit against doctest's original intents, but I appreciate it could be quite useful at times.

About the lack of showing the values when "expr1 == expr2" is false, I don't care.  I can't recall any case where, e.g., assertEqual() showing both values was actually helpful.  To the contrary, it more often filled the screen with giant reprs that were worse than useless.  By its very nature, doctest comparing against an explicit string encourages tests with brief output.  When a test fails, no matter how it's reported, non-trivial work to repair it usually follows.  By far the most important part is knowing _what_ failed.

"True matching 1" is a case of practicality beats purity:  a wart for sure, but standing out precisely because it's the only wart of its kind.  I doubt most users are even aware of it, and it's certainly not something most users need to know.
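The wart in question can be observed with `doctest.OutputChecker` and the stock `DONT_ACCEPT_TRUE_FOR_1` flag. A small sketch of the case where the docstring shows `1` but the code prints `True`:

```python
import doctest

checker = doctest.OutputChecker()

# check_output(want, got, optionflags): the docstring says 1, the code printed True.
lenient = checker.check_output("1\n", "True\n", 0)
strict = checker.check_output("1\n", "True\n", doctest.DONT_ACCEPT_TRUE_FOR_1)
```

By default the mismatch is tolerated; the flag switches the special case off so that only an exact textual match passes.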

About different shells, it _is_ jarring to people at first that formatting differs among them.  But since the differences show up on every single line of input and output, the differences quickly stop diluting attention.

About how much of repr() output is defined, not really all that much.  The purpose of doctest was never to accept any conceivable implementation that met the letter of the reference manual, but to capture the output CPython actually produced.  That was intentional.  Over time, I count it as a Good Thing that "but what about doctests out there?" has acted as a pressure against gratuitous changes in repr() outputs, and nudged other implementations to make "who cares?" output decisions that matched CPython's.  Every silly difference incurs various costs, and doctest did aim to make the existence of those costs visible at once.  It's a fact of life that relatively few users read the reference manual, let alone understand it - and I don't hate them for that ;-)
msg306530 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-20 07:50
> Although that functionality is already easily handled; for example, here's the OP's first example rewritten to be independent of the dict's representation ordering:
>
> >>> dict_fun() == {'foo': 1, 'bar': 2}
> True

Oh, right. I remember the headache caused by dict order randomization, but forgot about this option. Currently doctests are rarely used in CPython tests.
History
Date                 User               Action  Args
2022-04-11 14:58:54  admin              set     github: 76223
2017-11-20 07:50:04  serhiy.storchaka   set     messages: + msg306530
2017-11-20 02:02:50  tim.peters         set     messages: + msg306522
2017-11-20 00:50:30  rhettinger         set     status: open -> closed; resolution: rejected; messages: + msg306520; stage: resolved
2017-11-19 20:27:21  Tomáš Petříček     set     messages: + msg306515
2017-11-19 18:58:27  tim.peters         set     messages: + msg306510
2017-11-19 18:25:35  serhiy.storchaka   set     nosy: + serhiy.storchaka; messages: + msg306506
2017-11-19 17:01:01  rhettinger         set     nosy: + rhettinger; messages: + msg306504
2017-11-19 16:46:55  rhettinger         set     nosy: + tim.peters
2017-11-19 16:04:22  jbakker            set     nosy: + jbakker; messages: + msg306503
2017-11-16 00:13:27  r.david.murray     set     nosy: + r.david.murray; messages: + msg306322
2017-11-15 23:34:31  Tomáš Petříček     set     messages: + msg306320
2017-11-15 23:31:59  Tomáš Petříček     set     messages: + msg306319
2017-11-15 23:18:42  Tomáš Petříček     set     messages: + msg306318
2017-11-15 21:39:32  Tomáš Petříček     create