Issue 27112: tokenize.__all__ list is incomplete

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/71299

classification

Title:	tokenize.__all__ list is incomplete
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.6

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	Unit03, martin.panter, serhiy.storchaka
Priority:	normal	Keywords:	patch

Created on 2016-05-24 20:56 by Unit03, last changed 2022-04-11 14:58 by admin.

Files
File name	Uploaded	Description	Edit
tokenize_all.patch	Unit03, 2016-05-24 20:56		review
tokenize_all.v2.patch	Unit03, 2016-09-05 18:36		review
tokenize_all.v2.1.patch	Unit03, 2016-09-05 18:36		review

Messages (9)
msg266275 - (view)	Author: Jacek Kołodziej (Unit03) *	Date: 2016-05-24 20:56
That's a child issue of #23883, created to propose a patch fixing tokenize module's __all__ list. Changes in tests go farther: I've changed import from from tokenize import ... to import tokenize and adjusted all its usages accordingly. The module must be imported in order to test its __all__ list through test.support.check__all__ helper and just adding this single import would either force us to either do * import tokenize as tokenize_module * or from tokenize import tokenize as tokenize_function I think going third way: with just "import tokenize" and changing its uses in the rest of tests result in celarer code, but of course I guess this may be too much for a single patch.
msg266304 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-05-25 05:46
The blacklist is too long. I think it would be better to use white list.
msg266309 - (view)	Author: Jacek Kołodziej (Unit03) *	Date: 2016-05-25 06:27
I disagree: * blacklist has 48 entries now, whitelist would have 72 ones * whitelisting requires adding new public names to two places instead of one * test.support.check__all__ currently don't support whitelisting, it would need to be added
msg266312 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-05-25 06:44
* White list consists mostly from token.__all__. * All names in black list are implementation details. Names in white list are stable and already repeated in docs. * test.support.check__all__ shouldn't be used in this case. See other tests for public API that doesn't use it.
msg266340 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-05-25 13:00
I think it might be better to add a separate “import tokenize as tokenize_module”. Or just “import tokenize” inside the test function. IMO changing all the names adds too much churn with minimal benefit. Maybe also should restore the “from unittest import . . .” line. But keep an “import unittest” line somewhere so the __main__ bit at the bottom works. Perhaps it would be good to merge MiscTestCase into the existing TestMisc class.
msg266390 - (view)	Author: Jacek Kołodziej (Unit03) *	Date: 2016-05-25 19:29
> * All names in black list are implementation details. Names in white list are stable and already repeated in docs. Assumption here is that implementation details shouldn't "look" public - they should have names starting with "_"; I think blacklisting names in these tests encourages good practice - if something "looks" public, either: * it should be documented and placed in __all__ * renamed to something that doesn't look public anymore * in some special cases - be explicitely blacklisted in test. But ok, assuming we go with whitelisting and plain self.assertCountEqual: > * White list consists mostly from token.__all__. Should I then do: import token expected = token.__all__ + ["COMMENT", "NL", "ENCODING", "TokenInfo", "TokenError", "detect_encoding", "untokenize", "open", "tokenize"] ? > IMO changing all the names adds too much churn with minimal benefit. I wouldn't call it minimal, it has some positive impact on readability, see last line from The Zen of Python. :) Of course, final call is yours. Single import of unittest is such a small change I would rather keep it. I fully agree existing TestMisc class is a good place for this test, though.
msg266579 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-05-29 00:06
Changing the names to tokenize.<name> does solve the problem of the tokenize module versus the tokenize() fuction, so I can accept this way since you prefer it. So I think that just leaves what to do with the actual test case. I don’t think it matters too much, but I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases. On the other hand, neither the original patch nor Jacek’s proposal for “expected = token.__all__ + ...” would pick up the fact that the tok_name dictionary is another special case copied from the “token” module. (See also Issue 25324.)
msg274418 - (view)	Author: Jacek Kołodziej (Unit03) *	Date: 2016-09-05 18:36
> I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases. Original patch meets these requirements. I've updated it with moving the test__all__ method to TestMisc class as suggested (tokenize_all.v2.patch). I'm also attaching the alternative version (tokenize_all.v2.1.patch) that uses self.assertCountEqual instead of support.check__all__ and whitelisting as Serhiy suggested; this version of test doesn't meet the requirements above. Yes, neither one challenge the tok_name (#25324) problem, I'm not really sure whether is should, though. I'll try to solve it with separate patch if I find some time.
msg275268 - (view)	Author: Martin Panter (martin.panter) *	Date: 2016-09-09 06:29
Thanks Jacek. I would prefer to go with v2.patch.

History
Date	User	Action	Args
2022-04-11 14:58:31	admin	set	github: 71299
2016-09-09 06:29:57	martin.panter	set	messages: + msg275268 stage: patch review
2016-09-05 18:36:54	Unit03	set	files: + tokenize_all.v2.1.patch messages: + msg274418
2016-09-05 18:36:44	Unit03	set	files: + tokenize_all.v2.patch
2016-05-29 00:06:11	martin.panter	set	messages: + msg266579
2016-05-25 19:29:28	Unit03	set	messages: + msg266390
2016-05-25 13:16:53	martin.panter	link	issue23883 dependencies
2016-05-25 13:00:11	martin.panter	set	nosy: + martin.panter messages: + msg266340
2016-05-25 06:44:08	serhiy.storchaka	set	messages: + msg266312
2016-05-25 06:27:54	Unit03	set	messages: + msg266309
2016-05-25 05:46:06	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg266304
2016-05-24 20:56:35	Unit03	create