classification
Title: find by dichotomy the failing test
Type: enhancement
Stage: resolved
Components: Tests
Versions: Python 3.5
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: regrtest refleak: implement bisection feature (issue 29512)
Assigned To:
Nosy List: georg.brandl, michael.foord, r.david.murray, serhiy.storchaka, terry.reedy, vstinner, xdegaye
Priority: normal
Keywords: patch

Created on 2014-10-11 08:31 by xdegaye, last changed 2017-07-11 08:19 by vstinner. This issue is now closed.

Files
File name, uploaded by, date
subtest_in_range.diff, xdegaye, 2014-10-11 08:39
regrest_XY_options.patch, xdegaye, 2014-10-21 10:27
regrest_XY_options_2.patch, xdegaye, 2014-10-22 22:02
Messages (15)
msg229067 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-11 08:31
This issue stems from issue 22588.

See message 228968 for the rationale:
Automatize the dichotomy process used to identify memory leaks, crashes, reference leaks, resource leaks, etc. in a failing test.
msg229068 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-11 08:32
See msg 228968 for the rationale.
msg229069 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-11 08:39
With the attached patch (the patch does reintroduce the bug in 'test_incref_decref_API' of issue 22588 for testing purposes), it is possible to find the failing subtest rapidly:

After identifying the failing test, print the list of subtests in this test and their number (35 subtests):
$ export SUBTEST_RANGE="[]"
$ ./python -m test -m test__testcapi test_capi

Then run:
$ ./python -m test -m test__testcapi -R 23:23 test_capi

after modifying, each time, the range of subtests to execute, with:
$ export SUBTEST_RANGE="range(1,18)"    # tests 1-17   result: fail
$ export SUBTEST_RANGE="range(1,9)"     # tests 1-8    result: pass
$ export SUBTEST_RANGE="range(9,13)"    # tests 9-12   result: fail
$ export SUBTEST_RANGE="range(9,11)"    # tests 9-10   result: fail
$ export SUBTEST_RANGE="[9]"            # so it must be test #9, check it now

The strong limitation with this solution is that the subTest context manager must now be enclosed in a 'try except unittest.SkipTest' clause and that the context manager is used more than 100 times in the test suite.
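
As an illustration of the limitation described above, here is a minimal sketch of a numbered `subTest` that refuses to run unselected subtests. It is not the code from subtest_in_range.diff: the class `SelectableSubtests`, its `selected` attribute, and the `Demo` test case are all hypothetical names, and the sketch assumes the selection is already available as a set of integers. Because a context manager cannot skip the body of its `with` block on its own, it raises `unittest.SkipTest` on entry, and every caller has to catch it.

```python
import contextlib
import io
import unittest


class SelectableSubtests(unittest.TestCase):
    """Hypothetical base class: numbers subtests and runs only the selected ones."""

    selected = None  # assumed: a set of subtest numbers, or None to run them all

    def setUp(self):
        self._subtest_no = 0

    @contextlib.contextmanager
    def subTest(self, *args, **kwargs):
        self._subtest_no += 1
        if self.selected is not None and self._subtest_no not in self.selected:
            # Raised on entry to the 'with' statement, so the caller must catch it.
            raise unittest.SkipTest("subtest %d not selected" % self._subtest_no)
        with super().subTest(*args, **kwargs):
            yield


class Demo(SelectableSubtests):
    ran = []

    def test_values(self):
        for value in "abcde":
            # This try/except is the wrapper every subTest call site would need.
            try:
                with self.subTest(value=value):
                    Demo.ran.append(value)
            except unittest.SkipTest:
                continue


def run_demo(selected):
    """Run Demo with the given selection and return the values that were executed."""
    Demo.selected = selected
    Demo.ran = []
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return list(Demo.ran)
```

With this sketch, `run_demo({2, 4})` executes only the second and fourth subtests, while `run_demo(None)` executes all five.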
msg229070 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-10-11 09:07
I don't think this feature is generally useful enough to be included.  

* Since you need to modify the test code anyway (adding the try-except), it is probably just as much work to do the selection there.

* Why only add selection of subtests, and not of all tests?

In addition, the patch would have to be reworked: eval()ing an environment variable is not acceptable.
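
One way to avoid eval() is to accept only the two spellings that appear in this thread, `range(a,b)` and a bracketed list of integers, and to parse them with regular expressions. This is only a sketch under that assumption; the function name `parse_selection` is hypothetical and does not come from any of the attached patches.

```python
import re

# Only the two forms used in this thread are accepted.
_RANGE_RE = re.compile(r"^range\(\s*(\d+)\s*,\s*(\d+)\s*\)$")
_LIST_RE = re.compile(r"^\[\s*(\d+(?:\s*,\s*\d+)*)?\s*\]$")


def parse_selection(spec):
    """Return the set of selected numbers described by `spec`.

    An empty set corresponds to "[]", which the thread uses to mean
    "only list the tests".  Anything else raises ValueError instead of
    being evaluated.
    """
    spec = spec.strip()
    match = _RANGE_RE.match(spec)
    if match:
        return set(range(int(match.group(1)), int(match.group(2))))
    match = _LIST_RE.match(spec)
    if match:
        body = match.group(1)
        if not body:
            return set()
        return {int(item) for item in body.split(",")}
    raise ValueError("unsupported selection: %r" % spec)
```

For example, `parse_selection("range(9,11)")` returns `{9, 10}`, and a string such as `"__import__('os')"` is rejected with `ValueError` rather than executed.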

BTW, I think the commonly used term for this is "bisection".
msg229072 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-11 09:50
I requested the feature because I regularly need to bisect (once a month, or
maybe every two months).
msg229073 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-10-11 10:03
> I requested the feature because I regularly need to bisect (once a month, or maybe every two months).

Always within subtests?
msg229095 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-11 16:43
> With the attached patch (the patch does reintroduce the bug in 'test_incref_decref_API' of issue 22588 for testing purposes)

Sorry for not being more explicit and for being lazy doing a copy paste from msg 229022:

* this is not a patch: obviously a patch would not attempt to reintroduce a bug (as stated above). It is an experiment that you must 'patch' in order to test it (I did mention 'testing purposes' in this message, see above)
* this was written in msg 229022 in answer to "Does anyone know how to automatize the dichotomy process ?" from Victor
* it only deals with subtests, as a partial attempt to answer this question, mostly because this is where most of the time was spent when investigating issue 22588
msg229096 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-11 16:56
@ Georg Brandl
> I don't think this feature is generally useful enough to be included.
>
> * Since you need to modify the test code anyway (adding the try-except), it is probably just as much work to do the selection there.


You seem to be confusing the feature itself with the implementation.
This feature is generally useful.
The fact that there is an acceptable implementation is another matter (and subtest_in_range.diff is not an implementation).
msg229097 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-10-11 17:08
> You seem to be confusing the feature itself with the implementation.

> The fact that there is an acceptable implementation is another matter
> (and subtest_in_range.diff is not an implementation).

You yourself were calling it a "solution".

A feature proposal should either have a description of the new feature, or a patch.  Since there is no feature description, I reviewed the patch as being an implementation of the desired feature.

> This feature is generally useful.

As described in the issue, I disagree.  If the intended feature is more comprehensive, please supply an adequate description or a patch that implements it.
msg229099 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-10-11 17:11
> Sorry for not being more explicit and for being lazy doing a copy paste from msg 229022:

I see, this was split off another issue (which was already closed).  I agree that a bit more verbosity in the initial description would have prevented confusion :)
msg229627 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-10-18 02:59
I believe the goal, and a better title, is "Automate leak discovery within a group of tests".  Bisection or dichotomy is a means, not a goal, and should not be part of the title.

"Leak discovery" means 'find a test within the group that has a leak (which we know or suspect might be occurring)'. 

'Test' can mean multiple modules, one module, one TestCase class, one test_xxx method, one subtest, or one assert.  What might it mean in 'group of tests'?  Pinning a leak on a particular statement likely has to be done manually; eliminate that.  On the other hand, it is feasible (see below) to automatically blame a particular test method and perhaps a particular subtest.  So interpret 'tests' as 'test methods or subtests'.

Unittest can run all test methods in a module, all test methods in a test class, or an individual test method.  Idle already has a module browser (Class Browser), based on stdlib's pyclbr and Idle's TreeWidget, that could be adapted (perhaps using ttk.Treeview instead) to produce a module-test_class-test tree.  Add buttons to select a particular leak test and others to select manual or auto navigation of the tree and we might have something worthwhile.  It could go in Tools/Scripts or PyPI.

I have not used subtests yet since they are not available on 2.7, but I can imagine semi-automated editing, saving to a temp file, and running.  I would need experience with real subtest bisection to comment on the patch.
msg229761 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-21 10:27
The attached patch adds the '-X' and '-Y' options to the regrtest tool, which select a range of tests and a range of their subtests respectively.  The patch does not include test cases yet.

Limitation:
It does not work very well with nested subtests (nested subtests are currently only used by the unittest test suite itself): subtest numbers are flattened across all the nested subtests.

Here is the sequence of commands that would have found the subtest responsible for the crash in test_capi in issue 22588, without modifying any code:
./python -m test -X "[]" test_capi                  # Get the test count: 18 tests.
./python -m test -X "range(1,10)" -R 23:23 test_capi            # pass
./python -m test -X "range(10,15)" -R 23:23 test_capi           # pass
./python -m test -X "range(15,17)" -R 23:23 test_capi           # pass
./python -m test -X "[17]" -R 23:23 test_capi                   # pass
./python -m test -X "[18]" -R 23:23 test_capi                   # fail
./python -m test -X "[18]" -Y "[]" test_capi        # Test 18 has 35 subtests.
./python -m test -X "[18]" -Y "range(1,19)" -R 23:23 test_capi  # fail
./python -m test -X "[18]" -Y "range(1,10)" -R 23:23 test_capi  # fail
./python -m test -X "[18]" -Y "range(1,6)" -R 23:23 test_capi   # pass
./python -m test -X "[18]" -Y "range(6,8)" -R 23:23 test_capi   # pass
./python -m test -X "[18]" -Y "[8]" -R 23:23 test_capi          # pass
./python -m test -X "[18]" -Y "[9]" -R 23:23 test_capi          # fail

Output of the last command:
Test# 18: test__testcapi
subTest# 9: internal, {'name': 'test_incref_decref_API'}
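
The manual sequence above follows a fixed pattern, which can be sketched as a small driver. This is an illustration only, not part of the attached patches: the function `bisect_failing` and its `fails` callback are hypothetical, and the sketch assumes that exactly one test (or subtest) is responsible for the failure. In practice `fails` would run regrtest on the given subset (for example with `-R 23:23`) and report whether the failure reproduced.

```python
def bisect_failing(candidates, fails):
    """Narrow `candidates` down to the single item that triggers the failure.

    `fails(subset)` must return True when running only `subset` reproduces
    the failure.  Assumes a single culprit: if the first half passes, the
    culprit is taken to be in the second half without re-running it.
    Returns None when the full set does not fail at all.
    """
    candidates = list(candidates)
    if not fails(candidates):
        return None
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        candidates = first if fails(first) else second
    return candidates[0]


# Stand-in for a real test run: subtest 9 of 35 is the culprit.
culprit = bisect_failing(range(1, 36), lambda subset: 9 in subset)
print(culprit)  # 9
```

Each iteration halves the candidate list, so 35 subtests need about six runs, which matches the length of the manual sequence above.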
msg229838 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2014-10-22 22:02
This new version of the patch uses a specific exception to skip tests and fixes a bug when invoking the overridden and wrapped subTest method.
msg298137 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-07-11 07:07
Wasn't a similar feature implemented in issue29512?
msg298144 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-11 08:19
> See message 228968 for the rationale:

http://bugs.python.org/issue22588#msg228968

Ok, this use case is now well supported by the new test.bisect tool. Yes, this issue is a duplicate of issue #29512. I forgot this old issue!
History
Date User Action Args
2017-07-11 08:19:37 vstinner set status: open -> closed; superseder: regrtest refleak: implement bisection feature; messages: + msg298144; resolution: duplicate; stage: test needed -> resolved
2017-07-11 07:07:27 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg298137
2014-10-22 22:02:29 xdegaye set files: + regrest_XY_options_2.patch; messages: + msg229838
2014-10-21 10:27:25 xdegaye set files: + regrest_XY_options.patch; messages: + msg229761
2014-10-18 02:59:59 terry.reedy set nosy: + terry.reedy; messages: + msg229627; stage: test needed
2014-10-11 17:11:30 georg.brandl set messages: + msg229099
2014-10-11 17:08:16 georg.brandl set messages: + msg229097
2014-10-11 16:56:21 xdegaye set messages: + msg229096
2014-10-11 16:43:26 xdegaye set messages: + msg229095
2014-10-11 14:37:44 r.david.murray set nosy: + r.david.murray
2014-10-11 10:03:04 georg.brandl set messages: + msg229073
2014-10-11 09:50:44 vstinner set messages: + msg229072
2014-10-11 09:07:22 georg.brandl set nosy: + georg.brandl, michael.foord; messages: + msg229070
2014-10-11 08:39:26 xdegaye set files: + subtest_in_range.diff; keywords: + patch; messages: + msg229069
2014-10-11 08:32:44 xdegaye set messages: + msg229068
2014-10-11 08:31:29 xdegaye create