classification
Title: subtests
Type: enhancement Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, Julian, abingham, barry, bfroehle, borja.ruiz, chris.jerdonek, eric.araujo, eric.snow, exarkun, ezio.melotti, flox, fperez, haypo, hpk, michael.foord, nchauvat, ncoghlan, pitrou, python-dev, r.david.murray, santa4nt, serhiy.storchaka, spiv, terry.reedy
Priority: normal Keywords: patch

Created on 2013-01-18 21:07 by pitrou, last changed 2014-02-19 18:20 by barry. This issue is now closed.

Files
File name Uploaded Description Edit
subtests.patch pitrou, 2013-01-18 21:08 review
subtests2.patch pitrou, 2013-01-18 23:47 review
subtests3.patch pitrou, 2013-01-19 20:13 review
subtests4.patch pitrou, 2013-02-03 11:29 review
subtests5.patch pitrou, 2013-02-10 16:13 review
subtests6.patch pitrou, 2013-03-16 20:09 review
Messages (62)
msg180221 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-18 21:07
subtests are a light alternative to parametered tests as in issue7897. They don't generate the tests for you, they simply allow to partition a given test case in several logical units. Meaning, when a subtest fails, the other subtests in the test will still run (and the failures will print their respective parameters).

Concretely, running the follow example:

class MyTest(unittest.TestCase):

    def test_b(self):
        """some test"""
        for i in range(2, 5):
            for j in range(0, 3):
                with self.subTest(i=i, j=j):
                    self.assertNotEqual(i % 3, j)


will give the following output:

======================================================================
FAIL: test_b (__main__.MyTest) (i=2, j=2)
some test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtests.py", line 11, in test_b
    self.assertNotEqual(i % 3, j)
AssertionError: 2 == 2

======================================================================
FAIL: test_b (__main__.MyTest) (i=3, j=0)
some test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtests.py", line 11, in test_b
    self.assertNotEqual(i % 3, j)
AssertionError: 0 == 0

======================================================================
FAIL: test_b (__main__.MyTest) (i=4, j=1)
some test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtests.py", line 11, in test_b
    self.assertNotEqual(i % 3, j)
AssertionError: 1 == 1

----------------------------------------------------------------------
msg180222 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-18 21:08
Attaching patch.
msg180225 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-01-18 21:19
+1. I was going to suggest something similar for displaying the 
I was going to suggest something similar to display the clarification message in case of a fail:

    for arg, expected in [(...),...]:
        with self.somegoodname(msg="arg=%s"%arg):
            self.assertEqual(foo(arg), expected)

But your idea is even better.
msg180226 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-18 21:22
Since I was asked on IRC, an example of converting an existing test. It's quite trivial really:

diff --git a/Lib/test/test_codecs.py b/Lib/test/test_codecs.py
--- a/Lib/test/test_codecs.py
+++ b/Lib/test/test_codecs.py
@@ -630,9 +630,10 @@ class UTF16BETest(ReadTest, unittest.Tes
             (b'\xdc\x00\x00A', '\ufffdA'),
         ]
         for raw, expected in tests:
-            self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
-                              raw, 'strict', True)
-            self.assertEqual(raw.decode('utf-16be', 'replace'), expected)
+            with self.subTest(raw=raw, expected=expected):
+                self.assertRaises(UnicodeDecodeError, codecs.utf_16_be_decode,
+                                  raw, 'strict', True)
+                self.assertEqual(raw.decode('utf-16be', 'replace'), expected)
 
     def test_nonbmp(self):
         self.assertEqual("\U00010203".encode(self.encoding),
msg180227 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-01-18 21:50
I like the idea, and I think this would be a useful addition to unittest.

OTOH while this would be applicable to most of the tests (almost every test has a "for" loop to check valid/invalid values, or a few related "subtests" in the same test method), I'm not sure I would use it too often.
The reason is that often there's a single bug that causes the failure(s), and most likely the error reported by unittest is the same (or anyway it's very similar) for all the subtests, so using subtests would just add more noise (at least for me).
However it might be useful in case of sporadic failures or whenever having the full set of failures would be useful to diagnose the problem.
msg180230 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-01-18 23:18
This looks very nice. For cases where you decide you don't want it, some
kind of "fail fast" mechanism would be helpful (e.g. when you've seen the
full set of failures, and are now just trying to resolve the first one)
msg180232 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-18 23:47
Updated patch make subtests play nice with unittest's failfast flag:
http://docs.python.org/dev/library/unittest.html#cmdoption-unittest-f
msg180233 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-19 00:33
Nice/elegant idea.  A couple comments:

(1) What will be the semantics of TestCase/subtest failures?  Currently, it looks like each subtest failure registers as an additional failure, meaning that the number of test failures can exceed the number of test cases.  For example (with a single test case with 2 subtests):

$ ./python.exe test_subtest.py 
FF
======================================================================
FAIL: test (__main__.MyTests) (i=0)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_subtest.py", line 9, in test
    self.assertEqual(0, 1)
AssertionError: 0 != 1

======================================================================
FAIL: test (__main__.MyTests) (i=1)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_subtest.py", line 9, in test
    self.assertEqual(0, 1)
AssertionError: 0 != 1

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=2)

With the way I understand it, it seems like a subtest failure should register as a failure of the TestCase as a whole, unless the subtests should be enumerated and considered tests in their own right (in which case the total test count should reflect this).

(2) Related to (1), it doesn't seem like decorators like expectedFailure are being handled correctly.  For example:

    @unittest.expectedFailure
    def test(self):
        for i in range(2):
            with self.subTest(i=i):
                self.assertEqual(i, 0)

This results in:

    Ran 1 test in 0.001s

    FAILED (failures=1, unexpected successes=1)

In other words, it seems like the decorator is being applied to each subtest as opposed to the test case as a whole (though actually, I think the first should read "expected failures=1").  It seems like one subtest failing should qualify as an expected failure, or are the semantics such that expectedFailure means that every subtest must fail?
msg180234 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-19 00:52
Le samedi 19 janvier 2013 à 00:33 +0000, Chris Jerdonek a écrit :
> With the way I understand it, it seems like a subtest failure should
> register as a failure of the TestCase as a whole, unless the subtests
> should be enumerated and considered tests in their own right (in which
> case the total test count should reflect this).

It does register as a failure of the TestCase as a whole. Simply, a test
case can encounter several failures: for example, one in the test method
and one in the tearDown method.

Perhaps unittest should be made to show better reporting, e.g. show the
number of successful and failed tests. I suppose there is a reason for
the current behaviour, though.

> This results in:
> 
>     Ran 1 test in 0.001s
> 
>     FAILED (failures=1, unexpected successes=1)
> 
> In other words, it seems like the decorator is being applied to each
> subtest as opposed to the test case as a whole (though actually, I
> think the first should read "expected failures=1").  It seems like one
> subtest failing should qualify as an expected failure, or are the
> semantics such that expectedFailure means that every subtest must
> fail?

I don't know. I never use expectedFailure. There doesn't seem to be any
obviously preferable answer here.
msg180235 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-19 07:31
Either way, something isn't right about how it's integrated now.  With
the patch, this:

    @unittest.expectedFailure
    def test(self):
        with self.subTest():
            self.assertEqual(0, 1)

gives: FAILED (failures=1, unexpected successes=1)

And this:

    @unittest.expectedFailure
    def test(self):
        with self.subTest():
            self.assertEqual(0, 0)

gives: OK (unexpected successes=1)

But without the patch, this:

    @unittest.expectedFailure
    def test(self):
        self.assertEqual(0, 1)

gives: OK (expected failures=1)
msg180247 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-01-19 14:49
I think we're going to have to separate out two counts in the metrics - the total number of tests (the current counts), and the total number of subtests (the executed subtest blocks). (Other parameterisation solutions can then choose whether to treat each pair of parameters as a distinct test case or as a subtest - historical solutions would appear as distinct test cases, while new approaches might choose to use the subtest machinery).

The aggregation of subtest results to test case results would then be that the test case fails if either:
- an assertion directly in the test case fails
- an assertion fails in at least one subtest of that test case

The interpretation of "expected failure" in a world with subtests is then clear: as long as at least one subtest or assertion fails, the decorator is satisfied that the expected test case failure is occurred.
msg180253 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-19 19:20
The way expectedFailure is currently implemented (it's a decorator which knows nothing about test cases and test results, it only expects an exception to be raised by its callee), it's gonna be difficult to make it participate with subtests without breaking compatibility.

(that said, I also think it's a useless feature)
msg180255 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-19 19:28
> I think we're going to have to separate out two counts in the metrics
> - the total number of tests (the current counts), and the total number
> of subtests (the executed subtest blocks).

This is a reasonable proposal. On the other hand, it was already the
case that a single test could count for several failures (for example,
an exception in the test method and another in tearDown). Therefore, I
think it is independent from the subtest proposal and could be tackled
separately (which would also help keep this issue focussed :-)).
msg180259 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-19 20:13
New patch attached:
- makes _SubTest a TestCase subclass
- clarifies test skipping for subtests (skipTest() only skips the subtest)
- makes expected failures work as expected by resorting to a thread-local storage hack
msg180822 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-28 03:06
After thinking about this more, it seems this lets you do two orthogonal things:

1. Easily append data to failure messages coming from a block of asserts
2. Continue running a test case after a failure from a block of asserts

Both of these seem independently useful and more generally applicable, so it seems worth discussing them in isolation before exposing them only together.  Maybe there is a nice API or combination of API's that lets you do one or the other or both.

Also, for (1) above, I'm wondering about the choice to put the extra data in the id/shortDescription of a pseudo-TestCase instead of just adding it to the exception message, for example.  Adding it to the message seems more consistent with unittest's current API.  Including the info in a new name/id seems potentially to be misusing the concept of TestCase because the test names created from this API need not be unique, and the resulting tests are not addressable/runnable.

Incidentally, I noticed that the runnability of TestCases was removed from the documentation in an unreviewed change shortly after the last patch was posted:

-An instance of a :class:`TestCase`\ -derived class is an object that can
-completely run a single test method, together with optional set-up and tidy-up
-code.

(from http://hg.python.org/cpython/rev/d1e6a48dfb0d#l1.111 )

whereas subtest TestCases from the last patch are not runnable:

+class _SubTest(TestCase):
+    ...
+    def runTest(self):
+        raise NotImplementedError("subtests cannot be run directly")

A way around these issues would be to pass the original, runnable TestCase object to TestResult.errors, etc. instead of a pseudo-TestCase.  Alternatively, subtests could be made independently addressable and runnable, but that route seems more challenging and less certain.
msg180835 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-28 07:36
> 1. Easily append data to failure messages coming from a block of asserts
> 2. Continue running a test case after a failure from a block of asserts

> Both of these seem independently useful and more generally applicable,

I don't understand what you mean. 1 is pointless without 2 (if you let the exception bubble up, unittest already deals with it). 2 without 1 doesn't make much sense either (if you only want to silence an exception, a simple try...except will suffice).

> I'm wondering about the choice to put the extra data in the
> id/shortDescription of a pseudo-TestCase instead of just adding it to the
> exception message, for example.  Adding it to the message seems more
> consistent with unittest's current API.

If e.g. someone lists all failed tests without detailing the exception messages, it's better if subtests are disambiguished in that list.

Also, changing the exception message might be confusing as readers expect the displayed message to be exactly str(exc).

> Incidentally, I noticed that the runnability of TestCases was removed from 
> the documentation in an unreviewed change shortly after the last patch was > posted:

runTest() is still mentioned in the TestCase constructor:
http://docs.python.org/dev/library/unittest.html#unittest.TestCase

but indeed it was removed from the module overview, because it doesn't 
correspond to modern unit testing practices. I guess it could still be covered in the API docs.

> Alternatively, subtests could be made independently addressable and
> runnable, but that route seems more challenging and less certain.

I would say impossible, unless you know a way to run a `with` block in isolation ;-)
msg180839 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-01-28 08:47
Right, if you want independently addressable/runnable, then you're back to parameterised tests as discussed in issue7897.

What I like about Antoine's subtest idea is that I think it can be used to split the execution/reporting part of parameterised testing from the addressing/selection part.

That is, while *this* patch doesn't make subtests addressable, a future enhancement or third party test runner could likely do so. (It wouldn't work with arbitrary subtests, it would only be for data driven variants like those described in issue7897)
msg180891 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-29 01:47
>> 1. Easily append data to failure messages coming from a block of asserts
>> 2. Continue running a test case after a failure from a block of asserts
>
>> Both of these seem independently useful and more generally applicable,
>
> I don't understand what you mean. 1 is pointless without 2 (if you let the exception bubble up, unittest already deals with it). 2 without 1 doesn't make much sense either (if you only want to silence an exception, a simple try...except will suffice).

I'll explain.  (1) is useful without (2) because it lets you add information to the failure data for a group of asserts without having to use the "msg" parameter in every call to assert().  This is useful, for example, if you're testing a number of cases in a loop (with the current behavior of ending the test on first failure), and it's not clear from the default exception message which iteration of the loop failed.  Your original example is such a case (minus the part about continuing in case of failure).  This use case is basically the one addressed by Serhiy's suggestion in this message:

http://bugs.python.org/issue16997#msg180225

(2) is useful without (1) if you'd like to get information about more than one assertion failure in a TestCase (just as in your proposal), but the assertions aren't necessarily coming from a "parametrization" or different iterations of a loop.  With the proposed API, you'd do something like:

    with self.subTest():
        # First assertion
        ...
    with self.subTest():
        # Second assertion
        ...

The difference here is that I wouldn't call these "subtests," and you don't need parameter data to know which assertion is at fault.  The meaning here is more like "with self.continueTesting()".

> Also, changing the exception message might be confusing as readers expect the displayed message to be exactly str(exc).

If the API is more like self.assert*()'s "msg" parameter which appends data to the usual exception, then it will be the same as what people are already used to.  Also see the "longMessage" attribute (which defaults to True), which separates the extra message data from the default exception message:

http://docs.python.org/dev/library/unittest.html#unittest.TestCase.longMessage

>> Alternatively, subtests could be made independently addressable and
>> runnable, but that route seems more challenging and less certain.
>
> I would say impossible, unless you know a way to run a `with` block in isolation ;-)

I'm not advocating independent addressability/runnability of subtests or the following approach, but a naive way to do this would be to run the TestCase as usual, but skip over any subTest blocks if the parameter data isn't an exact match.
msg180893 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-29 07:20
> If the API is more like self.assert*()'s "msg" parameter which appends 
> data to the usual exception, then it will be the same as what people
> are already used to.

It might be a good idea to allow both this and the arbitrary parameter kwargs, then.

> I'm not advocating independent addressability/runnability of subtests
> or the following approach, but a naive way to do this would be to run
> the TestCase as usual, but skip over any subTest blocks if the
> parameter data isn't an exact match.

Well, I still don't know how to skip a `with` block (short of raising an exception that will terminate the entire test).
msg180899 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-01-29 10:54
I like the idea of the subTest API being something like:

    def subTest(self, _id, *, **params):

However, I'd still factor that in to the reported test ID, not into the exception message.
msg180977 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-01-30 17:52
I am concerned that this feature changes the TestResult API in a backwards incompatible way. There are (quite a few) custom TestResult  objects that just implement the API and don't inherit from TestResult. I'd like to try this new code with (for example) the test result provided by the junitxml project and see what happens.

I have a broader concern too. I have seen lots of requests for "test parameterization" and none *specifically* for sub tests. They are very similar mechanisms (with key differences), so I don't think we want *both* in unittest. If test parameterization is more powerful / flexible then I would rather see that *instead*. 

On the other hand if subtests are *better* then lets have them instead - but I'd like to see that discussion happen before we just jump into subtests.
msg180978 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-01-30 17:54
Note, some brief discussion on the "testing in python" mailing list:

http://lists.idyll.org/pipermail/testing-in-python/2013-January/005356.html
msg180980 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-30 18:11
> I am concerned that this feature changes the TestResult API in a
> backwards incompatible way. There are (quite a few) custom TestResult
> objects that just implement the API and don't inherit from TestResult.
> I'd like to try this new code with (for example) the test result
> provided by the junitxml project and see what happens.

Feel free to do it, of course :-)
In any case, we can add a hook to the TestCase subclass so that
specialized reporters can get at the parent TestCase.

(and I'm not sure how it's backwards incompatible: as long as you don't
use subtests, nothing should break - so existing software shouldn't be
affected)

> I have a broader concern too. I have seen lots of requests for "test
> parameterization" and none *specifically* for sub tests. They are very
> similar mechanisms (with key differences), so I don't think we want
> *both* in unittest. If test parameterization is more powerful /
> flexible then I would rather see that *instead*. 

The underlying idea of subtests is "if you want to parameterize a test,
here is a useful building block". Generic parameterization APIs are
cumbersome and actually *harder* to write (and read!) than the
corresponding `for` loop. With subtests, you just write your `for` loop
and use a subtest to isolate each iteration's failures.
msg180985 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-30 18:56
> I am concerned that this feature changes the TestResult API in a backwards incompatible way.

My suggestion to add the original TestCase object to TestResult.errors, etc. instead and add the extra failure data to the longDescription would address this concern, which is why I suggested it.  The former is what currently happens with multiple failures per TestCase (e.g. in runTest() and tearDown()).

> The underlying idea of subtests is "if you want to parameterize a test,
here is a useful building block".

The current API doesn't seem like a good building block because it bundles orthogonal features (i.e. to add loop failure data to a block of asserts you have to use the continuance feature).  Why not expose *those* as the building blocks?  The API can be something like--

    with self.addMessage(msg):
        # Add msg to the longDescription of any assertion failure within.
    
    with self.continueTest(msg=''):
        # Keep running the TestCase on any assertion failure within.

(The current subTest() is basically equivalent to continueTest() with a specialized message.  It could be added, too, if desired.)  Accepting a string message is more basic and flexible than allowing only a **kwargs dict, which seems a bit "cute" and specialized to me.
msg180986 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-30 19:04
> The current API doesn't seem like a good building block because it
> bundles orthogonal features (i.e. to add loop failure data to a block
> of asserts you have to use the continuance feature).  Why not expose
> *those* as the building blocks?  The API can be something like--
> 
>     with self.addMessage(msg):
>         # Add msg to the longDescription of any assertion failure
> within.
>     
>     with self.continueTest(msg=''):
>         # Keep running the TestCase on any assertion failure within.
> 
> (The current subTest() is basically equivalent to continueTest() with
> a specialized message.  It could be added, too, if desired.)
> Accepting a string message is more basic and flexible than allowing
> only a **kwargs dict, which seems a bit "cute" and specialized to me.

I've already replied to all this.
msg180989 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-01-30 20:00
> I've already replied to all this.

You didn't respond to the idea of exposing both features separately after saying you didn't understand what I meant and saying that they were pointless and didn't make sense.  So I explained and also proposed a specific API to make the suggestion clearer and more concrete.  These don't seem pointless to me at all.
msg180991 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-01-30 20:22
> You didn't respond to the idea of exposing both features separately
> after saying you didn't understand what I meant and saying that they
> were pointless and didn't make sense.  So I explained and also
> proposed a specific API to make the suggestion clearer and more
> concrete.

Well, suffice to say that I wasn't convinced at all. There are multiple
use cases for subtests in the Python test suite, but I can't think of
any for your proposed API separation. That's why I find it
uninteresting.

I'm making this proposal to solve a concrete issue, not in the interest
of minimalism. "Building block" was to be understood in that sense. Unit
testing is one of those areas where purity is a secondary concern.
msg181003 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-01-31 01:07
Right. I have *heaps* of tests that would be very easy to migrate to
Antoine's subtest API. A separate "addMessage" API could conceivably be
helpful for debugging, but it's not the obvious improvement to my existing
looping tests that subtests would be.
msg181262 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-03 11:29
This new patch adds some documentation.
msg181786 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-10 11:24
Any other comments on this? If not, I would like to commit soon.
msg181787 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 11:26
Please don't commit I think we still need a discussion as to whether subtests or paramaterized tests are a better approach. I certainly don't think we need both and there are a lot of people asking for parameterized tests. I also haven't had a chance to look at how this affects existing tools that extend unittest by implementing the TestResult api (how they will cope with test suites using subtests).
msg181788 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-10 11:41
> Please don't commit I think we still need a discussion as to whether
> subtests or paramaterized tests are a better approach. I certainly
> don't think we need both and there are a lot of people asking for
> parameterized tests.

I think they don't cater to the same crowd. I see parameterized tests as
primarily used by people who like adding formal complexity to their
tests in the name of architectural elegance (decorators, non-intuitive
constructs and other additional boilerplate). Subtests are meant to not
get in the way. IMHO, this makes them more suitable for stdlib
inclusion, while the testing-in-python people can still rely on their
additional frameworks.

Also, subtests would be immediately and trivially usable in our own test
suite.
msg181789 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-02-10 11:43
You can use subtests to build parameterized tests, you can't use
parameterized tests to build subtests. The standard library can also
be converted to using subtests *far* more readily than it could be
converted to parameterized tests. There's also the fact that creating
a decent parameterized tests without subtests support requires PEP
422.

If you're telling us we can only have one, then I choose subtests, and
third party test frameworks can layer parameterized tests on top.
However, I think you're making a mistaking by seeing them as
*competing* APIs, rather than seeing subtests as a superior
implementation strategy for the possible later introduction of a
higher level parameterized tests API.
msg181790 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 11:46
Subtests break the current unittest api of suite.countTests() and I fear they will also break tools that use the existing test result api to generate junit xml for continuous integration.

I would like to add a "parameterized test" mechanism to unittest - but preferably in a way that doesn't break all the existing tools.
msg181791 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 11:50
"However, I think you're making a mistaking by seeing them as
*competing* APIs, rather than seeing subtests as a superior
implementation strategy for the possible later introduction of a
higher level parameterized tests API."

Parameterized tests are done at test collection time and sub-tests at run time. You can't layer parameterization on top of subtests.
msg181792 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-10 11:57
> Subtests break the current unittest api of suite.countTests() and I
> fear they will also break tools that use the existing test result api
> to generate junit xml for continuous integration.

It depends how you define countTests(). sub-tests, as the name implies,
are not independent test cases. As for breaking tools, I don't really
understand what's special about junit xml. Why would a subtest failure
be different from a test failure in that regard?
msg181794 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 12:21
A comment from lifeless on IRC (Robert Collins):

[12:15:46] <lifeless>	 please consider automated analysis. How can someone tell which test actually failed ?
[12:15:55] <lifeless>	 How can they run just that test in future ?
msg181800 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 12:57
My concern is that this re-uses the existing TestResult.add* methods in a different way (including calling addError multiple times). This can break existing tools.

Fix suggested by lifeless on IRC. A sub test failure / success / exception calls the following TestResult method:

   addSubTest(test, sub_id, err_or_None)


If we're using a TestResult object that doesn't have these methods (an "old" result objects) then the test can *stop* (i.e. revert to the old behaviour before sub tests existed).
msg181811 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-02-10 15:01
And on the "superior implementation strategy", both nose and py.test used to have runtime test generation and both have deprecated them and moved to collection time parameterization. (But I guess we know better.)

You don't need PEP 422 for parameterization. The TestLoader creates multiple test cases for the parameter sets.
msg181814 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-02-10 15:28
I don't really have strong feelings about this, but I will just note as a data point that I implemented parameterized tests for the email package, and have no interest myself in subtests.  This is for exactly the collection time vs runtime reason that Michael is talking about (I always want to be able to run single tests by name).
msg181817 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-10 16:13
Here is a patch implementing Michael's and lifeless' proposed strategy.
msg181866 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-02-11 01:12
I'm still opposed to exposing these features only together.  Annotating the failure message with parametrization data is useful in its own right, but what if there are 500 subtests in a loop and you don't want 500 failures to be registered for that test case?  This is related to Ezio's comment near the top about adding too much noise.

addMessage was just one suggestion.  A different, functionally equivalent suggestion would be to add a "failFast" (default: False) keyword parameter to subTest() or alternatively a "maxFailures" (default: None) keyword parameter.
msg181871 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-02-11 03:37
It seems like the last patch (subtests5.patch) dropped the parameter data from the failure output as described in the docs.  For example, the example in the docs yields the following:

    FAIL: test_even (__main__.NumbersTest)

instead of the documented:

    FAIL: test_even (__main__.NumbersTest) (i=1)

subtests4.patch is okay.
msg181875 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-11 07:06
> It seems like the last patch (subtests5.patch) dropped the parameter
> data from the failure output as described in the docs.  For example,
> the example in the docs yields the following:
> 
>     FAIL: test_even (__main__.NumbersTest)

Weird, since there are unit tests for that. I'll take a look.
msg181881 - (view) Author: holger krekel (hpk) Date: 2013-02-11 08:01
On Sun, Feb 10, 2013 at 12:41 PM, Antoine Pitrou <report@bugs.python.org>wrote:

>
> Antoine Pitrou added the comment:
>
> > Please don't commit I think we still need a discussion as to whether
> > subtests or paramaterized tests are a better approach. I certainly
> > don't think we need both and there are a lot of people asking for
> > parameterized tests.
>
> I think they don't cater to the same crowd. I see parameterized tests as
> primarily used by people who like adding formal complexity to their
> tests in the name of architectural elegance (decorators, non-intuitive
> constructs and other additional boilerplate). Subtests are meant to not
> get in the way. IMHO, this makes them more suitable for stdlib
> inclusion, while the testing-in-python people can still rely on their
> additional frameworks.
>

Parametrized testing wasn't introduced because I or others like formal
complexity.  I invented the "yield" syntax that both pytest and nose still
support and it was enhanced for several years until it was deemed not fit
for a general purpose testing approach.  More specifically, if you have
functional parametrized tests, they each run relatively slow.   People
often want to re-run a single failing test or otherwise select tests which
use a certain fixture, or send tests to different CPUs to speed up
execution.   That's all not possible with subtests or am i missing
something?

That being said, I have plans to support some form of "subtests" for pytest
as well, as there are indeed cases where a more lightweight approach fits
well, especially for unit- or integration tests where one doesn't care if a
group of tests need to be re-run when working on fixing a failure to a
single subtest.  And where it's usually more about reporting, getting nice
debuggable output on failures.  I suspect the cases which Antoine refers
satisfy this pattern.

best,
holger
msg181882 - (view) Author: holger krekel (hpk) Date: 2013-02-11 08:09
On Sun, Feb 10, 2013 at 12:43 PM, Nick Coghlan <report@bugs.python.org>wrote:

>
> Nick Coghlan added the comment:
>
> You can use subtests to build parameterized tests, you can't use
> parameterized tests to build subtests.

I doubt you can implement parametrized tests via subtests especially for
functional testing and its fixture needs.

The standard library can also
> be converted to using subtests *far* more readily than it could be
> converted to parameterized tests.

I can see that and for this reason and the stdlib's use cases it might make
sense to support it.  The unittest module is also used in many other
contexts so it shouldn't be the only verification criterium.

best,
holger
msg181888 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-02-11 09:55
> what if there are 500 subtests in a loop and you don't want 500 failures to be
> registered for that test case?

Parametered tests have the same issue. In this case you simply don't use subtests
or test cases. On the other hand, the issue doesn't exist in most cases where you
iterate over three or four different cases.

> addMessage was just one suggestion.

It is quite orthogonal, actually, and could be added independently. Also, it is not clear how you would limit the addMessage to a subtest, rather than the whole test case.
We could re-use testtools' addDetail idea, although their content-type thing
is a bit heavyweight: I'd rather duck-type the API.

http://testtools.readthedocs.org/en/latest/for-test-authors.html#details
msg181930 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2013-02-11 20:04
>> what if there are 500 subtests in a loop and you don't want 500 failures to be
>> registered for that test case?
>
> Parametered tests have the same issue. In this case you simply don't use subtests
> or test cases.

Right, but then you lose out on both of the benefits documented for subtests:

+Without using a subtest, execution would stop after the first failure,
+and the error would be less easy to diagnose because the value of ``i``
+wouldn't be displayed::

Why tie these together?  I'm suggesting that we expose a way to benefit from the "easy to diagnose" portion without the "suspend stoppage" portion.  (The way we do this doesn't have to be one of the ways I'm suggesting, though I've suggested a few.)

>> addMessage was just one suggestion.
>
> It is quite orthogonal, actually, and could be added independently. Also, it is not clear how you would limit the addMessage to a subtest, rather than the whole test case.

It's not orthogonal because the way I suggested it, subTest() would be the composition of addMessage() and continueTest() context managers.  (addMessage limits itself by being a context manager just like subTest.)  So if we added addMessage() later, we would want to refactor subTest() to be using it.  The equivalence would be something like the following:

    with self.subTest(msg=msg, i=i):
        self.assertEqual(i % 2, 0)

    with self.continueTest():
        with self.addMessage(msg, i=i):
            self.assertEqual(i % 2, 0)

However, since it looks like we're going with changing the test case ID instead of putting the extra data only in the exception message (TestCase.longMessage) like I was suggesting before, I think adding a failFast=True or maxFailures=n parameter to subTest() has higher importance, e.g.

    with self.subTest(msg=msg, maxFailures=1, i=i):
        self.assertEqual(i % 2, 0)
msg181938 - (view) Author: Andrew Bennetts (spiv) Date: 2013-02-12 01:08
googletest (an xUnit style C++ test framework) has an interesting feature: in addition to assertions like ASSERT_EQ(x, y) that stop the test, it has EXPECT_EQ(x, y) etc that will cause the test to fail without causing it to stop.  I think this decoupling of “test failed” and “test execution stopped” is very useful.  (Note this also implies a single test can have multiple failures, or if you prefer that a single test can have multiple messages attached to explain why its state is 'FAILED'.)

I wouldn't like to see a duplication of all assert* methods as expect* methods, but there are alternatives.  A single self.expectThat method that takes a value and a matcher, for instance.

Or you could have a context manager:

with self.continueOnFailure():
    self.assertEqual(x, y)

In fact, I suppose that's more-or-less what the subtests patch offers?  Except the subtests feature seems to want to get involved in knowing about parameters and the like too, which feels weird to me.

Basically, I really don't like the “subtests” name, but if instead it's named something that directly says its only effect is that failures don't abort the test, then I'd be happy.
msg182924 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-02-25 06:43
My day job these days is to work on the Beaker test system (http://beaker-project.org).

I realised today that it actually includes a direct parallel to Antoine's proposed subtest concept: whereas in unittest, each test currently has exactly one result, in Beaker a given task may have *multiple* results. The overall result of the task is then based on the individual results (so if any result is a failure, the overall test is a failure).

"tasks" are the smallest addressable unit in deciding what to run, but each task may report multiple results, allowing fine-grained reporting of what succeeded and what failed.

That means there's a part of Antoine's patch I disagree with: the change to eliminate the derived "overall" result attached to the aggregate test. I think Beaker's model, where there's a single result for the overall task (so you can ignore the individual results if you don't care), and the individual results are reported separately (if you do care), will make it easier to provide something that integrates cleanly with existing test runners.

The complexity involved in attempting to get expectedFailure() to behave as expected is also a strong indication that there are still problems with the way these results are being aggregated.
msg183289 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-01 20:39
> That means there's a part of Antoine's patch I disagree with: the
> change to eliminate the derived "overall" result attached to the
> aggregate test.

The patch doesn't eliminate it, there are even tests for it.
(see the various call order tests)

> The complexity involved in attempting to get expectedFailure() to
> behave as expected is also a strong indication that there are still
> problems with the way these results are being aggregated.

No, the complexity stems from the fact that the expectedFailure decorator knows nothing about the test running machinery and instead blindly raises an exception.
msg183450 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-03-04 13:39
I think I have figured out what bothers me about the expectedfailure changes, and they actually relate to how expectedfailure was implemented in the first place: I had previously assumed that decorator was an *annotating* decorator - that it set an attribute on the function to indicate it was expected to fail, and the test execution machinery then took that into account when deciding how to interpret the test result.

The fact that it is instead a *wrapping* decorator is what made the addition of subtest support far more complicated than I expected.

However, I'm wondering if it might still be possible to avoid the need for a thread local context to handle the combination of expected failures and subtests when we have access to the test caseby adding the annotation that I expected to be there in the first place. Can't we:

1. Update the expectedfailure decorator to also set "_expecting_failure = True" on the wrapped test case function
2. Update unittest.TestCase.__init__ to do: "self._expecting_failure = hasattr(testMethod, '__func__') and getattr(testMethod.__func__, '_expecting_failure', False)"
3. Update unittest._SubTest.__init__ to do: "self._expecting_failure = test_case._expecting_failure"
4. Update testPartExecutor to check "if test_case._expecting_failure:" instead of the current check of the thread local context
msg183452 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-03-04 13:41
Getting rid of the thread local would be an improvement, and the change to how expected failures is done sounds good too.
msg183454 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-04 14:01
> However, I'm wondering if it might still be possible to avoid the
> need for a thread local context to handle the combination of
> expected failures and subtests when we have access to the test
> caseby adding the annotation that I expected to be there in the
> first place.

But that would break use cases where you use @expectedFailure on a
function called by the test method, not directly on the test method
itself. I don't really care about those use cases myself, but not
breaking them is the reason I chose not to change the @expectedFailure
implementation. I'll let Michael decide :-)
msg183468 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-03-04 16:12
That's a use case that I'm not very *interested* in supporting personally - however it may well be a use case that was "designed in" and that others have a need for (I didn't implement expectedFailure support). 

I think expectedFailure should be used sparingly and for a utility function to be *able* to turn a failure into a expectedFailure sounds actively dangerous.
msg183469 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-03-04 16:19
The docs are fairly explicit about the intended use case: "Mark the test as an expected failure. If the test fails when run, the test is not counted as a failure." (from http://docs.python.org/3/library/unittest#unittest.expectedFailure)

Nothing there about being able to call some other function and have doing so incidentally mark your test as an expected failure (which I actually consider highly *un*desirable behaviour)
msg183470 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2013-03-04 16:20
So, if it's not documented behaviour I think it's fine to lose it.
msg184349 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-16 20:09
Updated patch simplifying the expectedFailure implementation, as suggested by Michael and Nick and Michael.

(admire how test_socket was importing a private exception class!)
msg184781 - (view) Author: Roundup Robot (python-dev) Date: 2013-03-20 19:21
New changeset 5c09e1c57200 by Antoine Pitrou in branch 'default':
Issue #16997: unittest.TestCase now provides a subTest() context manager to procedurally generate, in an easy way, small test instances.
http://hg.python.org/cpython/rev/5c09e1c57200
msg184782 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-20 19:24
Finally committed :) Thanks everyone for the reviews and suggestions.
msg211634 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-02-19 18:13
We just discovered that this change breaks testtools.  I will file a new bug on that.
msg211637 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-02-19 18:20
See issue #20687
History
Date User Action Args
2014-02-19 18:20:52barrysetmessages: + msg211637
2014-02-19 18:13:34barrysetmessages: + msg211634
2013-03-20 19:24:01pitrousetstatus: open -> closed
resolution: fixed
messages: + msg184782

stage: patch review -> committed/rejected
2013-03-20 19:21:02python-devsetnosy: + python-dev
messages: + msg184781
2013-03-17 14:48:36hayposetnosy: + haypo
2013-03-16 20:09:05pitrousetfiles: + subtests6.patch

messages: + msg184349
2013-03-04 16:30:24barrysetnosy: + barry
2013-03-04 16:20:34michael.foordsetmessages: + msg183470
2013-03-04 16:19:47ncoghlansetmessages: + msg183469
2013-03-04 16:12:59michael.foordsetmessages: + msg183468
2013-03-04 14:01:44pitrousetmessages: + msg183454
2013-03-04 13:41:28michael.foordsetmessages: + msg183452
2013-03-04 13:39:22ncoghlansetmessages: + msg183450
2013-03-04 07:06:50terry.reedysetnosy: + terry.reedy
2013-03-01 20:39:45pitrousetmessages: + msg183289
2013-02-25 06:43:13ncoghlansetmessages: + msg182924
2013-02-12 01:08:05spivsetmessages: + msg181938
2013-02-11 20:40:19Yaroslav.Halchenkosetnosy: - Yaroslav.Halchenko
2013-02-11 20:34:17brett.cannonsetnosy: - brett.cannon
2013-02-11 20:04:57chris.jerdoneksetmessages: + msg181930
2013-02-11 09:55:10pitrousetmessages: + msg181888
2013-02-11 08:09:05hpksetmessages: + msg181882
2013-02-11 08:01:12hpksetmessages: + msg181881
2013-02-11 07:06:45pitrousetmessages: + msg181875
2013-02-11 03:53:18brian.curtinsetnosy: - brian.curtin
2013-02-11 03:37:34chris.jerdoneksetmessages: + msg181871
2013-02-11 01:12:57chris.jerdoneksetmessages: + msg181866
2013-02-10 16:13:17pitrousetfiles: + subtests5.patch

messages: + msg181817
2013-02-10 15:28:10r.david.murraysetmessages: + msg181814
2013-02-10 15:01:57michael.foordsetmessages: + msg181811
2013-02-10 13:13:52floxsetnosy: + flox
2013-02-10 12:57:59michael.foordsetmessages: + msg181800
2013-02-10 12:21:14michael.foordsetmessages: + msg181794
2013-02-10 11:57:54pitrousetmessages: + msg181792
2013-02-10 11:50:11michael.foordsetmessages: + msg181791
2013-02-10 11:46:25michael.foordsetmessages: + msg181790
2013-02-10 11:43:19ncoghlansetmessages: + msg181789
2013-02-10 11:41:37pitrousetmessages: + msg181788
2013-02-10 11:26:45michael.foordsetmessages: + msg181787
2013-02-10 11:24:26pitrousetmessages: + msg181786
2013-02-03 11:29:39pitrousetfiles: + subtests4.patch

messages: + msg181262
2013-01-31 01:07:56ncoghlansetmessages: + msg181003
2013-01-30 20:22:29pitrousetmessages: + msg180991
2013-01-30 20:00:39chris.jerdoneksetmessages: + msg180989
2013-01-30 19:04:46pitrousetmessages: + msg180986
2013-01-30 18:56:42chris.jerdoneksetmessages: + msg180985
2013-01-30 18:11:17pitrousetmessages: + msg180980
2013-01-30 17:54:36michael.foordsetmessages: + msg180978
2013-01-30 17:52:21michael.foordsetmessages: + msg180977
2013-01-29 10:54:02ncoghlansetmessages: + msg180899
2013-01-29 07:20:56pitrousetmessages: + msg180893
2013-01-29 01:47:58chris.jerdoneksetmessages: + msg180891
2013-01-28 08:47:07ncoghlansetmessages: + msg180839
2013-01-28 07:36:10pitrousetmessages: + msg180835
2013-01-28 03:06:08chris.jerdoneksetmessages: + msg180822
2013-01-20 12:17:29Arfreversetnosy: + Arfrever
2013-01-19 21:40:22florian-rathgebersetnosy: - florian-rathgeber
2013-01-19 20:13:34pitrousetfiles: + subtests3.patch

messages: + msg180259
2013-01-19 19:28:05pitrousetmessages: + msg180255
2013-01-19 19:20:43brett.cannonsetnosy: + brett.cannon
2013-01-19 19:20:28pitrousetmessages: + msg180253
2013-01-19 14:49:31ncoghlansetmessages: + msg180247
2013-01-19 07:31:32chris.jerdoneksetmessages: + msg180235
2013-01-19 00:52:04pitrousetmessages: + msg180234
2013-01-19 00:33:30chris.jerdoneksetmessages: + msg180233
2013-01-18 23:47:48pitrousetfiles: + subtests2.patch

messages: + msg180232
2013-01-18 23:18:11ncoghlansetmessages: + msg180230
2013-01-18 21:50:18ezio.melottisetmessages: + msg180227
2013-01-18 21:22:53pitrousetmessages: + msg180226
2013-01-18 21:19:19serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg180225
2013-01-18 21:08:20pitrousetstage: patch review
2013-01-18 21:08:12pitrousetfiles: + subtests.patch
keywords: + patch
messages: + msg180222
2013-01-18 21:07:11pitroucreate