tokenize.all list is incomplete #71299

Unit03 · 2016-05-24T20:56:36Z

BPO	27112
Nosy	@vadmium, @serhiy-storchaka
Files	tokenize_all.patch tokenize_all.v2.patch tokenize_all.v2.1.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-05-24.20:56:35.561>
labels = ['type-feature', 'library']
title = 'tokenize.__all__ list is incomplete'
updated_at = <Date 2016-09-09.06:29:57.326>
user = 'https://bugs.python.org/Unit03'

bugs.python.org fields:

activity = <Date 2016-09-09.06:29:57.326>
actor = 'martin.panter'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2016-05-24.20:56:35.561>
creator = 'Unit03'
dependencies = []
files = ['42976', '44373', '44374']
hgrepos = []
issue_num = 27112
keywords = ['patch']
message_count = 9.0
messages = ['266275', '266304', '266309', '266312', '266340', '266390', '266579', '274418', '275268']
nosy_count = 3.0
nosy_names = ['martin.panter', 'serhiy.storchaka', 'Unit03']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue27112'
versions = ['Python 3.6']

Linked PRs

gh-71299: Fix __all__ in tokenize #105907

Unit03 · 2016-05-24T20:56:35Z

That's a child issue of bpo-23883, created to propose a patch fixing tokenize module's __all__ list.

Changes in tests go farther: I've changed import from

    from tokenize import ...

to

    import tokenize

and adjusted all its usages accordingly.

The module must be imported in order to test its __all__ list through test.support.check__all__ helper and just adding this single import would either force us to either do

import tokenize as tokenize_module
or from tokenize import tokenize as tokenize_function

I think going third way: with just "import tokenize" and changing its uses in the rest of tests result in celarer code, but of course I guess this may be too much for a single patch.

serhiy-storchaka · 2016-05-25T05:46:06Z

The blacklist is too long. I think it would be better to use white list.

Unit03 · 2016-05-25T06:27:54Z

I disagree:

blacklist has 48 entries now, whitelist would have 72 ones
whitelisting requires adding new public names to two places instead of one
test.support.check__all__ currently don't support whitelisting, it would need to be added

serhiy-storchaka · 2016-05-25T06:44:09Z

White list consists mostly from token.__all__.
All names in black list are implementation details. Names in white list are stable and already repeated in docs.
test.support.check__all__ shouldn't be used in this case. See other tests for public API that doesn't use it.

vadmium · 2016-05-25T13:00:11Z

I think it might be better to add a separate “import tokenize as tokenize_module”. Or just “import tokenize” inside the test function. IMO changing all the names adds too much churn with minimal benefit.

Maybe also should restore the “from unittest import . . .” line. But keep an “import unittest” line somewhere so the __main__ bit at the bottom works.

Perhaps it would be good to merge MiscTestCase into the existing TestMisc class.

Unit03 · 2016-05-25T19:29:28Z

All names in black list are implementation details. Names in white list are stable and already repeated in docs.

Assumption here is that implementation details shouldn't "look" public - they should have names starting with "_"; I think blacklisting names in these tests encourages good practice - if something "looks" public, either:

it should be documented and placed in __all__
renamed to something that doesn't look public anymore
in some special cases - be explicitely blacklisted in test.

But ok, assuming we go with whitelisting and plain self.assertCountEqual:

White list consists mostly from token.__all__.

Should I then do:

    import token
    expected = token.__all__ + ["COMMENT", "NL", "ENCODING", "TokenInfo", "TokenError", "detect_encoding", "untokenize", "open", "tokenize"]

?

IMO changing all the names adds too much churn with minimal benefit.

I wouldn't call it minimal, it has some positive impact on readability, see last line from The Zen of Python. :) Of course, final call is yours.

Single import of unittest is such a small change I would rather keep it.

I fully agree existing TestMisc class is a good place for this test, though.

vadmium · 2016-05-29T00:06:11Z

Changing the names to tokenize.<name> does solve the problem of the tokenize module versus the tokenize() fuction, so I can accept this way since you prefer it.

So I think that just leaves what to do with the actual test case. I don’t think it matters too much, but I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases.

On the other hand, neither the original patch nor Jacek’s proposal for “expected = token.__all__ + ...” would pick up the fact that the tok_name dictionary is another special case copied from the “token” module. (See also bpo-25324.)

Unit03 · 2016-09-05T18:36:54Z

I would lean toward ensuring the test fails if someone adds a new implementation detail without an underscore prefix. It is also good to be explicit that the ISTERMINAL() etc functions are special cases.

Original patch meets these requirements. I've updated it with moving the test__all__ method to TestMisc class as suggested (tokenize_all.v2.patch).

I'm also attaching the alternative version (tokenize_all.v2.1.patch) that uses self.assertCountEqual instead of support.check__all__ and whitelisting as Serhiy suggested; this version of test doesn't meet the requirements above.

Yes, neither one challenge the tok_name (bpo-25324) problem, I'm not really sure whether is should, though. I'll try to solve it with separate patch if I find some time.

vadmium · 2016-09-09T06:29:57Z

Thanks Jacek. I would prefer to go with v2.patch.

Co-authored-by: Unit03

hauntsaninja · 2023-09-07T05:52:30Z

Looks fixed by #105907, thanks everyone!

Unit03 mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels May 24, 2016

ezio-melotti transferred this issue from another repository Apr 10, 2022

lysnikolaou added a commit to lysnikolaou/cpython that referenced this issue Jun 19, 2023

pythongh-71299: Fix __all__ in tokenize

65fdd4c

Co-authored-by: Unit03

bedevere-bot mentioned this issue Jun 19, 2023

gh-71299: Fix __all__ in tokenize #105907

Merged

lysnikolaou added a commit that referenced this issue Jun 19, 2023

gh-71299: Fix __all__ in tokenize (#105907)

ab3823a

Co-authored-by: Unit03

hauntsaninja closed this as completed Sep 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tokenize.all list is incomplete #71299

tokenize.all list is incomplete #71299

Unit03 mannequin commented May 24, 2016 •

edited by bedevere-bot

Unit03 mannequin commented May 24, 2016

serhiy-storchaka commented May 25, 2016

Unit03 mannequin commented May 25, 2016

serhiy-storchaka commented May 25, 2016

vadmium commented May 25, 2016

Unit03 mannequin commented May 25, 2016

vadmium commented May 29, 2016

Unit03 mannequin commented Sep 5, 2016

vadmium commented Sep 9, 2016

hauntsaninja commented Sep 7, 2023

tokenize.__all__ list is incomplete #71299

tokenize.__all__ list is incomplete #71299

Comments

Unit03 mannequin commented May 24, 2016 • edited by bedevere-bot

Linked PRs

Unit03 mannequin commented May 24, 2016

serhiy-storchaka commented May 25, 2016

Unit03 mannequin commented May 25, 2016

serhiy-storchaka commented May 25, 2016

vadmium commented May 25, 2016

Unit03 mannequin commented May 25, 2016

vadmium commented May 29, 2016

Unit03 mannequin commented Sep 5, 2016

vadmium commented Sep 9, 2016

hauntsaninja commented Sep 7, 2023

tokenize.all list is incomplete #71299

tokenize.all list is incomplete #71299

Unit03 mannequin commented May 24, 2016 •

edited by bedevere-bot