classification
Title: tokenize.__all__ list is incomplete
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Unit03, martin.panter, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-05-24 20:56 by Unit03, last changed 2016-09-09 06:29 by martin.panter.

Files
File name Uploaded Description Edit
tokenize_all.patch Unit03, 2016-05-24 20:56 review
tokenize_all.v2.patch Unit03, 2016-09-05 18:36 review
tokenize_all.v2.1.patch Unit03, 2016-09-05 18:36 review
Messages (9)
msg266275 - (view) Author: Jacek Kołodziej (Unit03) * Date: 2016-05-24 20:56
That's a child issue of #23883, created to propose a patch fixing tokenize module's __all__ list.

Changes in tests go farther: I've changed import from

    from tokenize import ...

to

    import tokenize

and adjusted all its usages accordingly.

The module must be imported in order to test its __all__ list through test.support.check__all__ helper and just adding this single import would either force us to either do
* import tokenize as tokenize_module
* or from tokenize import tokenize as tokenize_function

I think going third way: with just "import tokenize" and changing its uses in the rest of tests result in celarer code, but of course I guess this may be too much for a single patch.
msg266304 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-25 05:46
The blacklist is too long. I think it would be better to use white list.
msg266309 - (view) Author: Jacek Kołodziej (Unit03) * Date: 2016-05-25 06:27
I disagree:
* blacklist has 48 entries now, whitelist would have 72 ones
* whitelisting requires adding new public names to two places instead of one
* test.support.check__all__ currently don't support whitelisting, it would need to be added
msg266312 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-25 06:44
* White list consists mostly from token.__all__.
* All names in black list are implementation details. Names in white list are stable and already repeated in docs.
* test.support.check__all__ shouldn't be used in this case. See other tests for public API that doesn't use it.
msg266340 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-25 13:00
I think it might be better to add a separate “import tokenize as tokenize_module”. Or just “import tokenize” inside the test function. IMO changing all the names adds too much churn with minimal benefit.

Maybe also should restore the “from unittest import . . .” line. But keep an “import unittest” line somewhere so the __main__ bit at the bottom works.

Perhaps it would be good to merge MiscTestCase into the existing TestMisc class.
msg266390 - (view) Author: Jacek Kołodziej (Unit03) * Date: 2016-05-25 19:29
> * All names in black list are implementation details. Names in white list are stable and already repeated in docs.

Assumption here is that implementation details shouldn't "look" public - they should have names starting with "_"; I think blacklisting names in these tests encourages good practice - if something "looks" public, either:
* it should be documented and placed in __all__
* renamed to something that doesn't look public anymore
* in some special cases - be explicitely blacklisted in test.

But ok, assuming we go with whitelisting and plain self.assertCountEqual:

> * White list consists mostly from token.__all__.

Should I then do:

    import token
    expected = token.__all__ + ["COMMENT", "NL", "ENCODING", "TokenInfo", "TokenError", "detect_encoding", "untokenize", "open", "tokenize"]

?

> IMO changing all the names adds too much churn with minimal benefit.

I wouldn't call it minimal, it has some positive impact on readability, see last line from The Zen of Python. :) Of course, final call is yours.

Single import of unittest is such a small change I would rather keep it.

I fully agree existing TestMisc class is a good place for this test, though.
msg266579 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-05-29 00:06
Changing the names to tokenize.<name> does solve the problem of the tokenize module versus the tokenize() fuction, so I can accept this way since you prefer it.

So I think that just leaves what to do with the actual test case. I don’t think it matters too much, but I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases.

On the other hand, neither the original patch nor Jacek’s proposal for “expected = token.__all__ + ...” would pick up the fact that the tok_name dictionary is another special case copied from the “token” module. (See also Issue 25324.)
msg274418 - (view) Author: Jacek Kołodziej (Unit03) * Date: 2016-09-05 18:36
> I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases.

Original patch meets these requirements. I've updated it with moving the test__all__ method to TestMisc class as suggested (tokenize_all.v2.patch).

I'm also attaching the alternative version (tokenize_all.v2.1.patch) that uses self.assertCountEqual instead of support.check__all__ and whitelisting as Serhiy suggested; this version of test doesn't meet the requirements above.

Yes, neither one challenge the tok_name (#25324) problem, I'm not really sure whether is should, though. I'll try to solve it with separate patch if I find some time.
msg275268 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-09-09 06:29
Thanks Jacek. I would prefer to go with v2.patch.
History
Date User Action Args
2016-09-09 06:29:57martin.pantersetmessages: + msg275268
stage: patch review
2016-09-05 18:36:54Unit03setfiles: + tokenize_all.v2.1.patch

messages: + msg274418
2016-09-05 18:36:44Unit03setfiles: + tokenize_all.v2.patch
2016-05-29 00:06:11martin.pantersetmessages: + msg266579
2016-05-25 19:29:28Unit03setmessages: + msg266390
2016-05-25 13:16:53martin.panterlinkissue23883 dependencies
2016-05-25 13:00:11martin.pantersetnosy: + martin.panter
messages: + msg266340
2016-05-25 06:44:08serhiy.storchakasetmessages: + msg266312
2016-05-25 06:27:54Unit03setmessages: + msg266309
2016-05-25 05:46:06serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg266304
2016-05-24 20:56:35Unit03create