classification
Title: list duplicate test names with patchcheck
Type: behavior
Stage: patch review
Components: Library (Lib), Tests
Versions: Python 3.9

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: brett.cannon, chris.jerdonek, eric.araujo, ezio.melotti, gregory.p.smith, lukasz.langa, miss-islington, ned.deily, rbcollins, rhettinger, vstinner, xdegaye
Priority: normal
Keywords: easy, patch

Created on 2012-09-28 10:15 by xdegaye, last changed 2019-09-10 14:18 by ned.deily.

Files
File name                    Uploaded
duplicate_test_names.patch   xdegaye, 2012-09-28 10:15
duplicate_code_names.py      xdegaye, 2012-10-01 16:49
std_lib_duplicates.txt       xdegaye, 2012-10-01 16:50
duplicate_code_names_2.py    xdegaye, 2013-09-28 15:28
ignored_duplicates           xdegaye, 2013-09-28 15:28
duplicate_code_names_3.py    xdegaye, 2019-04-13 16:34
Pull Requests
URL        Status   Linked
PR 12827   merged   gregory.p.smith, 2019-04-14 17:18
PR 12828   merged   miss-islington, 2019-04-14 17:32
PR 12886   open     xdegaye, 2019-04-20 16:25
PR 12940   closed   xdegaye, 2019-04-24 15:49
PR 12950   closed   xdegaye, 2019-04-25 15:04
Messages (31)
msg171428 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-09-28 10:15
See also issue 16056 for the current list of duplicate test names in
the std lib.

The attached patch improves patchcheck.py to list duplicate test
names when running 'make patchcheck'. This patch to the default
branch can also be applied as is to the 2.7 branch.

An example of patchcheck output with the patch applied:

==================
$ make patchcheck
./python ./Tools/scripts/patchcheck.py
Getting the list of files that have been added/changed ... 1 file
Fixing whitespace ... 0 files
Fixing C file whitespace ... 0 files
Fixing docs whitespace ... 0 files
Duplicate test names ... 1 test:
  TestErrorHandling.test_get_only in file Lib/test/test_heapq.py
Docs modified ... NO
Misc/ACKS updated ... NO
Misc/NEWS updated ... NO
configure regenerated ... not needed
pyconfig.h.in regenerated ... not needed

Did you run the test suite?

==================
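
For readers wondering why a duplicate test name matters: when a class body defines the same method name twice, the second definition rebinds the name and the first is silently discarded, so that test never runs. A minimal sketch follows; the test bodies are invented for illustration and are not the real contents of Lib/test/test_heapq.py.

```python
import unittest


class TestErrorHandling(unittest.TestCase):
    def test_get_only(self):
        # First definition: would fail if it ever ran, but it never does.
        self.fail("first definition was executed")

    def test_get_only(self):
        # Second definition rebinds the name; only this one is collected.
        pass


loader = unittest.TestLoader()
names = loader.getTestCaseNames(TestErrorHandling)

result = unittest.TestResult()
loader.loadTestsFromTestCase(TestErrorHandling).run(result)
```

Only one test is collected and it passes, so nothing in a normal test run hints that a test was lost.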
msg171499 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-09-28 16:19
Nice feature to do without adding a dependency on a lint tool!
msg171505 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-28 16:52
I would like to see this written in a way that would let one run it globally or on a single file independent of a patch (e.g. an independent script from which patchcheck could import certain functions).  Or is that what you explicitly didn't want Éric? :)

This would let one do a report or global check as was done for issue 16056.  It would also make it a bit easier to check manually that the script is checking for duplicates correctly.

Also, some suggestions:

+def testmethod_names(code, name=[]):

It might be clearer to use the name=None form.

+    test_files = [fn for fn in python_files if
+                  fn.startswith(os.path.join('Lib', 'test'))]

Are you getting the test files in test/ subdirectories of subpackages?  I think checking that the file name starts with "test_" might be sufficient to get all test files.

+    if name[-1].startswith('test_'):

I believe 'test' is the prefix that unittest uses.  I'm pretty sure we have some tests that don't start with 'test_'.
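
A short illustration of the name=None suggestion above. The functions here are hypothetical and not taken from the patch; whether the patch actually mutates its default is not visible in this thread. The sketch only shows why a mutable default can surprise: the list is created once and shared between calls.

```python
def collect_shared(item, seen=[]):
    # The default list is created once, at definition time,
    # and reused by every call that omits `seen`.
    seen.append(item)
    return seen


def collect_fresh(item, seen=None):
    # The name=None form: a new list is created on each call.
    if seen is None:
        seen = []
    seen.append(item)
    return seen


first = collect_shared("a")
second = collect_shared("b")  # same list object as `first`

fresh_a = collect_fresh("a")
fresh_b = collect_fresh("b")  # independent list
```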
msg171509 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-28 17:31
> I would like to see this written in a way that would let one
> run it globally or on a single file independent of a patch

+1
It can be added to Tools/scripts and imported by patchcheck.

> I'm pretty sure we have some tests that don't start with 'test_'.

IIRC those are just test helpers that are not executed directly.
OTOH I don't see why we should look only for test_* files: every .py file might contain duplicate names, so they should all be checked.
msg171511 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-28 17:38
Here are a couple examples of test method names that don't begin with "test_":

    def testLoadTk(self):
    def testLoadTkFailure(self):

http://hg.python.org/cpython/file/f1094697d7dc/Lib/tkinter/test/test_tkinter/test_loadtk.py#l9
msg171512 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-09-28 17:38
sqlite3 tests use CheckThing style (urgh).
msg171513 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-28 17:45
> Here are a couple examples of test method names that don't begin with "test_":

I thought you were talking about test files.  I still don't see why we should look only for test_* methods: every class might contain duplicate method names, so they should all be checked.
msg171519 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-28 18:53
> I thought you were talking about test files.

Oh, I see why you said that then.  To find the test files themselves, this logic was used in the patch:

+                  fn.startswith(os.path.join('Lib', 'test'))]

Regarding your question for the general case, I'm not sure if there is ever a use case for duplicate method names.  Is there?
msg171536 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-09-28 19:52
Note that using the module code object to find duplicates does not
allow for selecting among the different code types: function, nested
function, method or class.


Duplicates are extensively used within the std lib:

Running find_duplicate_test_names.py, the initial script from issue
16056, on the whole std lib instead of just Lib/test, after updating
the script to list all the duplicates (except <lambda>, <genexp>,
...) with:

    if not name[-1].startswith('<'):
        yield '.'.join(name)

prints 347 (on a total of 1368 std lib .py files) duplicate
functions, methods or classes.


To eliminate module level functions (but not nested functions), the
script is run now with the following change:

    if len(name) > 2 and not name[-1].startswith('<'):
        yield '.'.join(name)

and lists 188 duplicate nested functions, methods or classes.  In
this list there are 131 duplicates in .py files located in the
subdirectory of a "test" directory.
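
The code-object approach described above can be sketched roughly as follows. This is not find_duplicate_test_names.py itself (that script is attached to issue 16056); the function names and the sample source are assumptions made for illustration. It compiles a module, walks the nested code objects found in co_consts, and reports dotted names that occur more than once. As noted in the message, a code object does not say whether it came from a function, a method or a class.

```python
import collections
import types


def qualified_names(code, prefix=()):
    """Yield dotted names of the code objects nested inside `code`."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            name = prefix + (const.co_name,)
            # Skip <lambda>, <genexpr>, <listcomp>, ... but still descend.
            if not const.co_name.startswith("<"):
                yield ".".join(name)
            yield from qualified_names(const, name)


def duplicates(source, filename="<string>"):
    """Return the sorted dotted names defined more than once in `source`."""
    module_code = compile(source, filename, "exec")
    counts = collections.Counter(qualified_names(module_code))
    return sorted(name for name, count in counts.items() if count > 1)


SAMPLE = '''
class A:
    def test_x(self): pass
    def test_x(self): pass
    def test_y(self): pass

def f(): pass
'''

found = duplicates(SAMPLE)
```

A property getter and setter sharing a name would also be reported by this sketch, which is one source of the false positives discussed later in the thread.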
msg171544 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-09-28 20:49
Using the Python class browser (pyclbr.py) in conjunction with the
search for duplicates in the module code object would make it
possible to restrict the listing of duplicates to functions and
methods, or even just to methods (depending on the feature
requirements), without listing the duplicate classes and duplicate
nested functions. This would come with an associated performance
cost.
msg171548 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-28 22:58
It doesn't necessarily have to be limited to methods; anything duplicated might turn out to be a bug.  If the script doesn't mix scopes there shouldn't be too many false positives, and if there are, it shouldn't be a big deal if they are reported on the changed file by `make patchcheck`.

> I'm not sure if there is ever a use case for duplicate
> method names.  Is there?

Nothing that can't be done in a more elegant way afaict.

It might make sense for variables though, where you have e.g.:

foo = do_something(x)
foo = do_something_more(foo)
msg171558 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-09-29 09:31
> I'm not sure if there is ever a use case for duplicate method
> names.  Is there?

property getter, setter, and deleter methods do have the same name.
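
For reference, this is the legitimate pattern in question. The class and attribute names are invented for illustration. Both functions are named celsius, yet the class ends up with a single property object, so a duplicate checker has to treat `@<name>.setter`, `.getter` and `.deleter` specially to avoid a false positive.

```python
class Temperature:
    def __init__(self):
        self._celsius = 0.0

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Same function name as the getter, on purpose.
        self._celsius = float(value)


t = Temperature()
t.celsius = 21
```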
msg171559 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-09-29 10:41
> Here are a couple examples of test method names that don't begin
> with "test_":
>
>     def testLoadTk(self):
>     def testLoadTkFailure(self):

Also Lib/test/test_smtplib.py test method names start with 'test'
instead of 'test_' although the 'Regression tests package for Python'
documentation states: "The test methods in the test module should
start with test_".
msg171576 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-09-29 16:16
For informational purposes, here is where unittest defaults to the prefix "test" for finding test methods:

http://hg.python.org/cpython/file/f11649b21603/Lib/unittest/loader.py#l48

sqlite3 is able to use "Check" because it manages its own test discovery.  For example--

http://hg.python.org/cpython/file/f11649b21603/Lib/sqlite3/test/regression.py#l306
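
The default prefix can be checked from the interpreter through the documented TestLoader.testMethodPrefix attribute. The sample class below is invented, and the "Check" loader only illustrates how a different prefix changes collection; it is not a claim about how the sqlite3 tests build their suites.

```python
import unittest


class Sample(unittest.TestCase):
    def testLoadTk(self):
        pass

    def test_other(self):
        pass

    def CheckThing(self):
        pass


default_loader = unittest.TestLoader()
default_prefix = default_loader.testMethodPrefix
default_names = list(default_loader.getTestCaseNames(Sample))

# A loader configured with a different prefix collects other methods.
check_loader = unittest.TestLoader()
check_loader.testMethodPrefix = "Check"
check_names = list(check_loader.getTestCaseNames(Sample))
```

With the default loader both testLoadTk and test_other are collected, which is why a check restricted to names starting with 'test_' would miss some real tests.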
msg171734 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2012-10-01 16:49
The attached script, named duplicate_code_names.py, takes a file
name list as argument and prints duplicate code names found in these
files ordered by function, class, method and nested class or
function.

The script output on the whole std lib (see the result in the
attached file std_lib_duplicates.txt):

$ time ./python Tools/scripts/duplicate_code_names.py $(find Lib -name "*py") > std_lib_duplicates.txt
Lib/test/badsyntax_future4.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future4.py, line 3)
Lib/test/badsyntax_future6.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future6.py, line 3)
Lib/test/badsyntax_future3.py: compile error: future feature rested_snopes is not defined (badsyntax_future3.py, line 3)
Lib/test/badsyntax_future9.py: compile error: not a chance (badsyntax_future9.py, line 3)
Lib/test/bad_coding.py: compile error: unknown encoding for 'Lib/test/bad_coding.py': uft-8
Lib/test/badsyntax_future8.py: compile error: future feature * is not defined (badsyntax_future8.py, line 3)
Lib/test/badsyntax_3131.py: compile error: invalid character in identifier (badsyntax_3131.py, line 2)
Lib/test/badsyntax_future7.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future7.py, line 3)
Lib/test/bad_coding2.py: compile error: encoding problem for 'Lib/test/bad_coding2.py': utf-8
Lib/test/badsyntax_pep3120.py: compile error: invalid or missing encoding declaration for 'Lib/test/badsyntax_pep3120.py'
Lib/test/badsyntax_future5.py: compile error: from __future__ imports must occur at the beginning of the file (badsyntax_future5.py, line 4)
Lib/lib2to3/tests/data/different_encoding.py: compile error: invalid syntax (different_encoding.py, line 3)
Lib/lib2to3/tests/data/py2_test_grammar.py: compile error: invalid token (py2_test_grammar.py, line 31)
Lib/lib2to3/tests/data/bom.py: compile error: invalid syntax (bom.py, line 2)
Lib/lib2to3/tests/data/crlf.py: compile error: invalid syntax (crlf.py, line 1)
Lib/__phello__.foo.py: __phello__.foo not a valid module name

real    6m14.854s
user    6m14.455s
sys     0m0.392s


FWIW running the same command with python 3.2 takes about 2.5
minutes instead of more than 6 minutes (importlib ?).
msg198525 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2013-09-28 15:28
duplicate_code_names_2.py uses tokenize to print duplicate code names
within the same scope, excluding property setter/getter/deleter
duplicates, excluding duplicates of nested classes or functions, and
ignoring duplicates listed in a file (run with --help for more
details).  With the attached ignored_duplicates file, it prints the
following output on the root of the current default branch (in about
1 minute on an old laptop):

$ ./duplicate_code_names_2.py --ignore ignored_duplicates .
Duplicate function or class names:
./Lib/test/_test_multiprocessing.py:3047 _TestProcess
./Lib/test/test_os.py:1290 Win32ErrorTests

Duplicate method names:
./Lib/ctypes/test/test_functions.py:316 FunctionTestCase.test_errors
./Lib/distutils/tests/test_cmd.py:80 CommandTestCase.test_ensure_string_list
./Lib/lib2to3/tests/test_fixers.py:1467 Test_dict.test_14
./Lib/lib2to3/tests/test_fixers.py:1472 Test_dict.test_15
./Lib/lib2to3/tests/test_fixers.py:1477 Test_dict.test_17
./Lib/lib2to3/tests/test_fixers.py:1482 Test_dict.test_18
./Lib/lib2to3/tests/test_fixers.py:1487 Test_dict.test_19
./Lib/test/test_complex.py:104 ComplexTest.test_truediv
./Lib/test/test_dis.py:250 DisTests.test_big_linenos
./Lib/test/test_dis.py:294 DisTests.test_dis_object
./Lib/test/test_ftplib.py:537 TestFTPClass.test_mkd
./Lib/test/test_heapq.py:366 TestErrorHandling.test_get_only
./Lib/test/test_import.py:255 ImportTests.test_import_name_binding
./Lib/test/test_regrtest.py:210 ParseArgsTestCase.test_findleaks
./Lib/test/test_smtplib.py:249 DebuggingServerTests.testNotImplemented
./Lib/test/test_webbrowser.py:161 OperaCommandTest.test_open_new
./Lib/unittest/test/testmock/testmock.py:1381 MockTest.test_attribute_deletion
./Lib/xml/dom/minidom.py:379 Attr._get_name
./Mac/Tools/Doc/setup.py:123 DocBuild.makeHelpIndex
msg198586 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2013-09-29 11:45
The following issues have been entered for all the above duplicate
names found by duplicate_code_names_2.py:

issue 19112, issue 19113, issue 19114, issue 19115, issue 19116,
issue 19117, issue 19118, issue 19119, issue 19122, issue 19123,
issue 19125, issue 19126, issue 19127, issue 19128

except the following which should be added to ignored_duplicates:

    ./Lib/test/test_os.py:1290 Win32ErrorTests
msg229625 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2014-10-18 00:40
FWIW testtools rejects test suites with duplicate test ids; I'm considering adding that feature into unittest itself. We'd need an option to make it warn rather than error I think, but if we did that we wouldn't need a separate script at all.
msg340168 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-13 16:34
Upgrading the script to account for Python changes. This is now duplicate_code_names_3.py

$ ./python ./duplicate_code_names_3.py --ignore ignored_duplicates Lib/test
Duplicate method names:
Lib/test/test_dataclasses.py:1406 TestCase.test_helper_asdict_builtin_containers
Lib/test/test_dataclasses.py:1579 TestCase.test_helper_astuple_builtin_containers
Lib/test/test_dataclasses.py:700 TestCase.test_not_tuple
Lib/test/test_dataclasses.py:3245 TestReplace.test_recursive_repr_two_attrs
Lib/test/test_genericclass.py:161 TestClassGetitem.test_class_getitem
Lib/test/test_gzip.py:764 TestCommandLine.test_compress_infile_outfile
Lib/test/test_heapq.py:376 TestErrorHandling.test_get_only
Lib/test/test_importlib/test_util.py:755 PEP3147Tests.test_source_from_cache_path_like_arg
Lib/test/test_logging.py:328 BuiltinLevelsTest.test_regression_29220
Lib/test/test_sys_setprofile.py:363 ProfileSimulatorTestCase.test_unbound_method_invalid_args
Lib/test/test_sys_setprofile.py:354 ProfileSimulatorTestCase.test_unbound_method_no_args
Lib/test/test_utf8_mode.py:198 UTF8ModeTests.test_io_encoding

False positives have been removed from the output of the above command, so all the methods listed above are genuine duplicates that must be fixed.
msg340219 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-04-14 17:32
New changeset cd466559c4a312b3c1223a774ad4df19fc4f0407 by Gregory P. Smith in branch 'master':
bpo-16079: fix duplicate test method name in test_gzip. (GH-12827)
https://github.com/python/cpython/commit/cd466559c4a312b3c1223a774ad4df19fc4f0407
msg340221 - (view) Author: miss-islington (miss-islington) Date: 2019-04-14 17:50
New changeset 9f9e029bd2223ecba46eaefecadf0ac252d891f2 by Miss Islington (bot) in branch '3.7':
bpo-16079: fix duplicate test method name in test_gzip. (GH-12827)
https://github.com/python/cpython/commit/9f9e029bd2223ecba46eaefecadf0ac252d891f2
msg340252 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-15 09:52
This script should be part of Python and run in the pre-commit CI like Travis CI!
msg340286 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-15 15:41
False positives must be added to the 'ignored_duplicates' file in order to have duplicate_code_names.py exit with success. Not sure whether committers would consider it an annoyance if Travis failed whenever duplicate_code_names.py fails.

All the function or class names duplicates have been removed as false positives from the list in my previous post and this single false positive has been removed from the duplicate method names:

Lib/test/test_socket.py:4115 InterruptedTimeoutBase.setAlarm

All the duplicates are method names as reported now or five years ago in msg 198586, so this script should get another command line option to only report duplicate method names.
msg340295 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-04-15 16:36
Agreed, making duplicate method definitions a CI failure is the desired end state once our test suite is cleaned up and it doesn't have false positives.

FYI - pylint also implements this check quite reliably as function-redefined via its pylint.checkers.base.BasicErrorChecker._check_redefinition() method.

https://github.com/PyCQA/pylint/blob/2.2/pylint/checkers/base.py#L843
msg340323 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-16 09:20
Thanks for the link Gregory. I will write a script based on ast and check its output against pylint and against the current script based on tokenize.

The travis() function of Tools/scripts/patchcheck.py may be modified to import this script and run it only on files modified by the PR. This may allow the pre-commit duplicate check to be installed without waiting for the Python test suite to be cleaned up, if the existing duplicates are temporarily added to the ignored_duplicates file (assuming an issue has been entered for each one of those existing duplicates with a note saying to remove the entry in ignored_duplicates when the issue is fixed). Indeed, issues #19113 and #19119 are still open more than five years after they were entered.
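
A rough idea of what an ast-based check could look like. This is a sketch, not the script from PR 12886, and the helper names and the sample source are assumptions. It looks only at functions defined directly in a class body, skips property setter/getter/deleter definitions, and reports each later redefinition with its line number.

```python
import ast


def _is_property_accessor(func):
    """True for functions decorated with @<name>.setter/.getter/.deleter."""
    return any(
        isinstance(deco, ast.Attribute)
        and deco.attr in ("setter", "getter", "deleter")
        for deco in func.decorator_list
    )


def duplicate_methods(source, filename="<string>"):
    """Return sorted (lineno, 'Class.method') pairs for redefined methods."""
    found = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.ClassDef):
            continue
        seen = set()
        for item in node.body:
            if not isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            if _is_property_accessor(item):
                continue
            if item.name in seen:
                found.append((item.lineno, f"{node.name}.{item.name}"))
            seen.add(item.name)
    return sorted(found)


SAMPLE = '''\
class T:
    def test_a(self): pass
    @property
    def x(self): return 1
    @x.setter
    def x(self, v): pass
    def test_a(self): pass
'''

report = duplicate_methods(SAMPLE)
```

Definitions nested under `if`/`else` or `try` inside a class body are deliberately not inspected here, which avoids flagging platform-conditional definitions but could also miss real duplicates.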
msg340584 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-20 16:18
List of issues entered for all the current duplicate method definitions:

#19113 #19119 #36678 #36679
#36680 #36681 #36682 #36683
msg340585 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-20 16:41
PR 12886 adds a check on duplicate method definitions to the travis() function of patchcheck.py.

False positives must be entered in Tools/scripts/duplicates_ignored.txt.

The existing duplicates have been entered in this file (with the corresponding bpo issue number in a comment) and no duplicates are currently found by duplicate_meth_defs.py when run with '--ignore duplicates_ignored.txt'. It is expected that these entries will be removed from this file when their issues are closed.
msg340687 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-04-23 00:53
Should the unittest module grow a feature to scan for duplicate methods?  I imagine that duplicate methods are a common problem.

Possibly, inheriting from unittest.TestCase could be accompanied by a metaclass whose __prepare__ returns a special dictionary that detects and warns about duplicates.
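
One way the __prepare__ idea could look, as a sketch only. Nothing like this exists in unittest, and the class names below are invented. The metaclass supplies a dict subclass as the class namespace, and that dict warns whenever a callable is bound to a name already present in the class body.

```python
import warnings


class _WarnOnRedefinition(dict):
    """Class namespace that warns when a callable rebinds an existing name."""

    def __setitem__(self, key, value):
        # Property objects are not callable, so getter/setter pairs
        # sharing a name do not trigger the warning.
        if key in self and callable(value):
            warnings.warn(f"duplicate definition of {key!r}", stacklevel=2)
        super().__setitem__(key, value)


class DuplicateWarningMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return _WarnOnRedefinition()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Hand a plain dict to type.__new__.
        return super().__new__(mcls, name, bases, dict(namespace), **kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    class Demo(metaclass=DuplicateWarningMeta):
        def test_a(self):
            return 1

        def test_a(self):  # triggers the warning
            return 2
```

Unlike a static script, this only catches duplicates in classes that are actually imported and that use the metaclass, so it would not cover the non-unittest duplicates mentioned elsewhere in the thread.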
msg340785 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-24 15:27
List of issues entered for all the current duplicate method definitions in 2.7:

#19113 #36711 #36712 #36713
msg340786 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2019-04-24 15:30
Not sure the unittest module is the right place to implement these checks.
The following issues deal with duplicates that are not unittest methods:

#19127 #19128 #36711
msg351668 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-09-10 14:18
The open proposed PR for this issue has been languishing unreviewed for several months now.  Since the proposal is really a request to change our development process, I'm nosying Brett and Łukasz (3.9 RM).  In any case, if we decide to add this to our CI, I think we should only start with the master branch, so I'm closing the 3.7 and 2.7 backport PRs.
History
Date User Action Args
2019-09-10 14:18:25 ned.deily set nosy: + ned.deily, brett.cannon, lukasz.langa
    messages: + msg351668
    versions: + Python 3.9, - Python 2.7, Python 3.7, Python 3.8
2019-04-25 15:04:13 xdegaye set pull_requests: + pull_request12876
2019-04-24 15:49:12 xdegaye set pull_requests: + pull_request12864
2019-04-24 15:30:39 xdegaye set messages: + msg340786
2019-04-24 15:27:03 xdegaye set messages: + msg340785
2019-04-23 00:53:02 rhettinger set nosy: + rhettinger
    messages: + msg340687
2019-04-20 16:41:45 xdegaye set messages: + msg340585
2019-04-20 16:25:57 xdegaye set pull_requests: + pull_request12812
2019-04-20 16:18:20 xdegaye set messages: + msg340584
2019-04-16 09:20:54 xdegaye set messages: + msg340323
2019-04-15 16:36:06 gregory.p.smith set messages: + msg340295
2019-04-15 15:41:57 xdegaye set messages: + msg340286
2019-04-15 09:52:30 vstinner set messages: + msg340252
2019-04-14 17:50:56 miss-islington set nosy: + miss-islington
    messages: + msg340221
2019-04-14 17:35:03 gregory.p.smith set type: enhancement -> behavior
2019-04-14 17:32:39 miss-islington set pull_requests: + pull_request12753
2019-04-14 17:32:10 gregory.p.smith set nosy: + gregory.p.smith
    messages: + msg340219
2019-04-14 17:23:31 gregory.p.smith set components: + Tests
2019-04-14 17:21:29 gregory.p.smith set keywords: + easy
    versions: + Python 3.7, Python 3.8, - Python 3.3, Python 3.4
2019-04-14 17:18:34 gregory.p.smith set stage: patch review
    pull_requests: + pull_request12752
2019-04-13 16:34:40 xdegaye set files: + duplicate_code_names_3.py
    messages: + msg340168
2014-10-18 00:40:47 rbcollins set nosy: + rbcollins
    messages: + msg229625
2013-09-29 22:22:49 vstinner set nosy: + vstinner
2013-09-29 11:45:16 xdegaye set messages: + msg198586
2013-09-28 15:28:48 xdegaye set files: + ignored_duplicates
2013-09-28 15:28:26 xdegaye set files: + duplicate_code_names_2.py
    messages: + msg198525
2013-07-13 05:41:14 terry.reedy set versions: + Python 3.4, - Python 3.2
2012-10-01 16:50:35 xdegaye set files: + std_lib_duplicates.txt
2012-10-01 16:49:53 xdegaye set files: + duplicate_code_names.py
    messages: + msg171734
2012-09-29 16:16:22 chris.jerdonek set messages: + msg171576
2012-09-29 10:41:11 xdegaye set messages: + msg171559
2012-09-29 09:31:55 xdegaye set messages: + msg171558
2012-09-28 22:58:56 ezio.melotti set messages: + msg171548
2012-09-28 20:49:30 xdegaye set messages: + msg171544
2012-09-28 19:52:51 xdegaye set messages: + msg171536
2012-09-28 18:53:01 chris.jerdonek set messages: + msg171519
2012-09-28 17:45:48 ezio.melotti set messages: + msg171513
2012-09-28 17:38:48 eric.araujo set messages: + msg171512
2012-09-28 17:38:09 chris.jerdonek set messages: + msg171511
2012-09-28 17:31:27 ezio.melotti set messages: + msg171509
2012-09-28 16:52:06 chris.jerdonek set messages: + msg171505
2012-09-28 16:19:08 eric.araujo set nosy: + eric.araujo
    messages: + msg171499
2012-09-28 10:15:37 xdegaye create