
classification
Title: unittest subTest failure causes result to be omitted from listing
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, lukasz.langa, martin.panter, michael.foord, pitrou, r.david.murray, rbcollins, serhiy.storchaka, zach.ware
Priority: normal Keywords: patch

Created on 2015-12-17 05:49 by zach.ware, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL        Status  Linked
PR 28082   merged  serhiy.storchaka, 2021-08-31 07:55
Messages (12)
msg256580 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2015-12-17 05:49
The title can barely be called accurate; the description of the problem isn't easy to condense to title length.  Here's the issue:

$ cat subtest_test.py 
import os
import unittest

class TestClass(unittest.TestCase):

    def test_subTest(self):
        for t in map(int, os.environ.get('tests', '1')):
            with self.subTest(t):
                if t > 1:
                    raise unittest.SkipTest('skipped')
                self.assertTrue(t)

if __name__ == '__main__':
    unittest.main()
$ ./python.exe subtest_test.py 
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
$ tests=01 ./python.exe subtest_test.py 

======================================================================
FAIL: test_subTest (__main__.TestClass) (<subtest>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtest_test.py", line 12, in test_subTest
    self.assertTrue(t)
AssertionError: 0 is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)
$ tests=012 ./python.exe subtest_test.py 
s
======================================================================
FAIL: test_subTest (__main__.TestClass) (<subtest>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtest_test.py", line 12, in test_subTest
    self.assertTrue(t)
AssertionError: 0 is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1, skipped=1)



Note that on the first run the short summary is ".", as expected.  On the second run it is "" when one of the subtests fails, and on the third it is "s" when one subtest fails but another is skipped.  This also extends to verbose mode:

$ ./python.exe subtest_test.py -v
test_subTest (__main__.TestClass) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.001s

OK
$ tests=01 ./python.exe subtest_test.py -v
test_subTest (__main__.TestClass) ... 
======================================================================
FAIL: test_subTest (__main__.TestClass) (<subtest>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtest_test.py", line 12, in test_subTest
    self.assertTrue(t)
AssertionError: 0 is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)
$ tests=012 ./python.exe subtest_test.py -v
test_subTest (__main__.TestClass) ... skipped 'skipped'

======================================================================
FAIL: test_subTest (__main__.TestClass) (<subtest>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtest_test.py", line 12, in test_subTest
    self.assertTrue(t)
AssertionError: 0 is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1, skipped=1)


Note that the first run shows "... ok", the second "... ", and the third "... skipped 'skipped'".


I'm unsure what the solution should be.  There should at least be some indication that the test finished, but should mixed results be reported as 'm' ("mixed results" in verbose mode), or should failure/error take precedence, or should every different result be represented?
msg256588 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-12-17 07:45
Okay, so you have a test with subtests. You have presented three cases:

1. Single subtest which passes. No problem I assume.

2. Two subtests: 1st fails, 2nd passes. This is how subtests are normally used, so I guess there is no problem. Is that right?

3. After two subtests have already run (one of which failed), SkipTest is raised. I guess you want the test results to be reported better in this case.

What is the use case? Why not skip the test before any subtests are started?
msg256590 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2015-12-17 08:05
Martin Panter added the comment:
> Okay, so you have a test with subtests. You have presented three cases:
>
> 1. Single subtest which passes. No problem I assume.

Or several subtests which pass.  No problems.

> 2. Two subtests: 1st fails, 2nd passes. This is how subtests are normally used, so I guess there is no problem. Is that right?

Any of multiple subtests fail, and there is no indication in the
"summary line" (the line that is usually "..........................",
a dot for each successful test).  When a regular test fails, an F
(or an E, if the raised exception was anything but
self.failureException) is added to the line; when any subtests fail,
nothing is added.  If you have 10 test methods that use subtests, and
any subtest in each method fails, your summary line will be blank.  In
verbose mode, you'd get "test_one ... test_two ... test_three ... ..."
(note the lack of newlines) instead of the expected "test_one ...
FAILURE\ntest_two ... FAILURE\ntest_three ... FAILURE\n..." (note the
newlines).
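
To show where those characters come from, here is a heavily simplified
sketch of the reporting path (paraphrased, not the real Lib/unittest code):

class SketchTextResult:
    """Paraphrase of how the text runner emits the one-character summary."""

    def __init__(self, stream):
        self.stream = stream

    def addSuccess(self, test):
        self.stream.write('.')   # only called if the *whole* test passed

    def addFailure(self, test, err):
        self.stream.write('F')

    def addError(self, test, err):
        self.stream.write('E')

    def addSkip(self, test, reason):
        self.stream.write('s')

    def addSubTest(self, test, subtest, outcome):
        # A failed subtest lands here.  The text runner does not override
        # this hook, so no character is written; and because the test no
        # longer counts as successful, addSuccess() is never called for it
        # either.  The result is a blank spot in the summary line.
        pass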

> 3. After two subtests have already run (one of which failed), SkipTest is raised. I guess you want the test results to be reported better in this case.
>
> What is the use case? Why not skip the test before any subtests are started?

Only the subtest is skipped (which should be valid, or documented as
not valid), and the order of the subtests doesn't matter:

$ tests=210 ./python.exe subtest_test.py -v
test_subTest (__main__.TestClass) ... skipped 'skipped'

======================================================================
FAIL: test_subTest (__main__.TestClass) (<subtest>)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "subtest_test.py", line 14, in test_subTest
    self.assertTrue(t)
AssertionError: 0 is not true

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1, skipped=1)

But, the summary makes it seem as though the entire test was skipped.

Hopefully this makes it a bit clearer :)
msg256604 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-12-17 13:25
I believe this was discussed at the time subTest was added and deemed an acceptable tradeoff for a simpler implementation.  I'm not sure it is, but I'm not prepared to write code to fix it :)  I'm bothered every time I see this, but I have to admit that the tracebacks are the most important feedback and you do get those.
msg256631 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-12-18 00:33
Yes now I understand. If a subtest fails, there is no status update (not even a newline in verbose mode), and each subtest skip triggers a separate status update.

My gut feeling is that any subtest failure should be counted as the whole test failing. I’m not sure how the failure vs error cases should be handled. Maybe error should trump failure.

Judging by <https://bugs.python.org/issue16997#msg180259>, Antoine intended for SkipTest to skip subtests. But I’m not sure that should be reported as the whole test being skipped.
msg257146 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-12-28 22:52
> My gut feeling is that any subtest failure should be counted as the
> whole test failing. I’m not sure how the failure vs error cases should
> be handled. Maybe error should trump failure.

I think the priority should be error > failure > skip > pass.
IOW, pass should never be reported if any of the other 3 happen, skip should be reported only if there are no errors/failures, and errors should trump failures.
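
For illustration, a minimal sketch of that priority rule (a hypothetical
helper, not part of unittest):

# Hypothetical helper, not unittest API: collapse subtest outcomes into the
# single outcome reported for the test, with error > failure > skip > pass.
OUTCOME_PRIORITY = {'error': 0, 'failure': 1, 'skip': 2, 'pass': 3}

def combined_outcome(subtest_outcomes):
    return min(subtest_outcomes, key=OUTCOME_PRIORITY.__getitem__)

assert combined_outcome(['pass', 'skip', 'failure']) == 'failure'
assert combined_outcome(['pass', 'failure', 'error']) == 'error'
assert combined_outcome(['pass', 'skip']) == 'skip'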
msg261834 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2016-03-15 23:04
The basic model is this:
 - a test can have a single outcome [yes, the api is ambiguous, but there it is]
 - subtests let you identify multiple variations of a single test (note the id tweak etc) and *may* be reported differently

We certainly must not report the test as a whole passing if any subtest did not pass.

Long term I want to remove the error/failure partitioning of exceptions; it's not actually useful.

The summary for the test, when subtests are used, should probably enumerate the states.

test_foo (3 passed, 2 skipped, 1 failure)

in much the same way the run as a whole is enumerated.
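
For example, a hypothetical way to build such a per-test summary (sketch
only, not an existing unittest feature):

from collections import Counter

def summarize(test_name, subtest_outcomes):
    # e.g. subtest_outcomes = ['passed', 'passed', 'passed',
    #                          'skipped', 'skipped', 'failure']
    counts = Counter(subtest_outcomes)
    parts = ', '.join('%d %s' % (n, state) for state, n in counts.items())
    return '%s (%s)' % (test_name, parts)

print(summarize('test_foo', ['passed'] * 3 + ['skipped'] * 2 + ['failure']))
# test_foo (3 passed, 2 skipped, 1 failure)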
msg400573 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-08-30 06:18
Weren't subtests proposed as a more flexible replacement for parametrized tests? I think that every subtest should be counted as a separate test case: in verbose mode it should output a separate line, and in non-verbose mode it should output a separate character.
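
A rough sketch of the idea (a hypothetical TextTestResult subclass, not the
code in PR 28082; skipped subtests are left out for brevity):

import unittest

class PerSubTestResult(unittest.TextTestResult):
    def addSubTest(self, test, subtest, outcome):
        super().addSubTest(test, subtest, outcome)
        if outcome is None:       # passing subtests stay silent, as before
            return
        failed = issubclass(outcome[0], test.failureException)
        if self.showAll:          # verbose mode: one line per subtest
            self.stream.writeln('  %s ... %s' % (subtest,
                                                 'FAIL' if failed else 'ERROR'))
        elif self.dots:           # normal mode: one character per subtest
            self.stream.write('F' if failed else 'E')
            self.stream.flush()

Such a class could be tried by passing resultclass=PerSubTestResult to
unittest.TextTestRunner (whose stream provides the writeln() used above).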
msg400700 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-08-31 08:16
PR 28082 is a draft that implements this idea. Skipped and failed (but not successfully passed) subtests are now reported separately, as a character ('s', 'F', or 'E') or a line ("skipped", "FAIL", "ERROR"). The description of the subtest is included in the line. For example:

$ tests=.sFE ./python test_issue25894.py 
sFE
======================================================================
ERROR: test_subTest (__main__.TestClass) [3] (t='E')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/test_issue25894.py", line 15, in test_subTest
    raise Exception('error')
    ^^^^^^^^^^^^^^^^^^^^^^^^
Exception: error

======================================================================
FAIL: test_subTest (__main__.TestClass) [2] (t='F')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/test_issue25894.py", line 13, in test_subTest
    self.fail('failed')
    ^^^^^^^^^^^^^^^^^^^
AssertionError: failed

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1, errors=1, skipped=1)

$ tests=.sFE ./python test_issue25894.py -v
test_subTest (__main__.TestClass) ... 
  test_subTest (__main__.TestClass) [1] (t='s') ... skipped 'skipped'
  test_subTest (__main__.TestClass) [2] (t='F') ... FAIL
  test_subTest (__main__.TestClass) [3] (t='E') ... ERROR

======================================================================
ERROR: test_subTest (__main__.TestClass) [3] (t='E')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/test_issue25894.py", line 15, in test_subTest
    raise Exception('error')
    ^^^^^^^^^^^^^^^^^^^^^^^^
Exception: error

======================================================================
FAIL: test_subTest (__main__.TestClass) [2] (t='F')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/test_issue25894.py", line 13, in test_subTest
    self.fail('failed')
    ^^^^^^^^^^^^^^^^^^^
AssertionError: failed

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1, errors=1, skipped=1)

As a side effect, the test description is also repeated for every error in the test cleanup code (in tearDown() and doCleanup()).

Similar changes should also be made in RegressionTestResult. If issue45057 is applied first, they will be much simpler.

Issue29152 may be related. If addError() and addFailure() are called from addSubTest(), PR 28082 will need to be rewritten.
msg401101 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-09-05 17:56
Any suggestions for the format of the output? Currently PR 28082 formats lines for a subtest skip, failure or error with a 2-space indentation. Lines for a skip, failure or error in tearDown() or in functions registered with addCleanup() do not differ from lines for a skip, failure or error in the test method itself.

I am not sure about backporting this change. On one hand, it fixes an old flaw in the unittest output. On the other hand, the change affects more than just subtests, and it can confuse programs that parse the unittest output because test descriptions can now occur multiple times.
msg401586 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-09-10 15:55
New changeset f0f29f328d8b4568e8c0d4c55c7d120d96f80911 by Serhiy Storchaka in branch 'main':
bpo-25894: Always report skipped and failed subtests separately (GH-28082)
https://github.com/python/cpython/commit/f0f29f328d8b4568e8c0d4c55c7d120d96f80911
msg401587 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-09-10 16:00
> Any suggestions for the format of the output? Currently PR 28082 formats lines for a subtest skip, failure or error with a 2-space indentation. Lines for a skip, failure or error in tearDown() or in functions registered with addCleanup() do not differ from lines for a skip, failure or error in the test method itself.

I'm fine with that. Ultimately I don't think differentiating subtest status from method status is that important.

> I am not sure about backporting this change.

Since we're too late for 3.10.0 and this isn't a trivially small change (and it changes test output), I think we shouldn't backport it. There are tools in editors that parse this output. I'm afraid it's too risky since we haven't given the community enough time to test for output differences.

I'm marking this as resolved. Thanks for your patch, Serhiy! If you feel strongly about backporting, we'd have to reopen and mark this as release blocker.
History
Date User Action Args
2022-04-11 14:58:25  admin             set     github: 70082
2021-09-10 16:00:39  lukasz.langa      set     status: open -> closed
                                               resolution: fixed
                                               messages: + msg401587
                                               stage: patch review -> resolved
2021-09-10 15:55:08  lukasz.langa      set     nosy: + lukasz.langa
                                               messages: + msg401586
2021-09-05 17:56:47  serhiy.storchaka  set     messages: + msg401101
                                               versions: + Python 3.11, - Python 3.5, Python 3.6
2021-08-31 08:16:13  serhiy.storchaka  set     messages: + msg400700
2021-08-31 07:55:56  serhiy.storchaka  set     keywords: + patch
                                               stage: test needed -> patch review
                                               pull_requests: + pull_request26525
2021-08-30 06:18:16  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg400573
2016-03-15 23:04:23  rbcollins         set     messages: + msg261834
2015-12-28 22:52:36  ezio.melotti      set     messages: + msg257146
2015-12-18 00:33:05  martin.panter     set     messages: + msg256631
2015-12-17 13:25:16  r.david.murray    set     nosy: + r.david.murray
                                               messages: + msg256604
2015-12-17 08:05:27  zach.ware         set     messages: + msg256590
2015-12-17 07:45:12  martin.panter     set     nosy: + martin.panter
                                               messages: + msg256588
2015-12-17 05:49:35  zach.ware         create