Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete module-level loop variables when no longer needed #90723

Closed
sobolevn opened this issue Jan 28, 2022 · 6 comments
Closed

Delete module-level loop variables when no longer needed #90723

sobolevn opened this issue Jan 28, 2022 · 6 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@sobolevn
Copy link
Member

BPO 46565
Nosy @terryjreedy, @zware, @serhiy-storchaka, @sobolevn, @AlexWaygood
PRs
  • bpo-46565: del loop vars that are leaking into module namespaces #30993
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-01-28.16:47:31.465>
    labels = ['type-bug', 'library', '3.9', '3.10', '3.11']
    title = 'Delete module-level loop variables when no longer needed'
    updated_at = <Date 2022-02-03.09:20:22.802>
    user = 'https://github.com/sobolevn'

    bugs.python.org fields:

    activity = <Date 2022-02-03.09:20:22.802>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2022-01-28.16:47:31.465>
    creator = 'sobolevn'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46565
    keywords = ['patch']
    message_count = 6.0
    messages = ['412006', '412017', '412056', '412059', '412065', '412429']
    nosy_count = 5.0
    nosy_names = ['terry.reedy', 'zach.ware', 'serhiy.storchaka', 'sobolevn', 'AlexWaygood']
    pr_nums = ['30993']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue46565'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @sobolevn
    Copy link
    Member Author

    Some variables created as for X in ... leak into module's namespace, where the loop is defined.

    I wrote a simple flake8 plugin to find names that are used in ast.Module in ast.For, but not under if __name__ == '__main__' and are not used in del afterwards.

    Here's what I got:

    I think, that we need to remove these names. Why?

    1. They complicate typeshed typing, we have to annotate them in typeshed, or write custom ignore rules for our test suite. Ref: https://github.com/python/typeshed/blob/56aa2088aada530400b6fdddf0f1d17ca3aaa86f/tests/stubtest_allowlists/py3_common.txt#L448

    2. They are in dir(), example:

    >>> import inspect
    >>> 'k' in dir(inspect)
    True
    
    1. They are listed in help(), let's use json.encoder as an example:
    DATA
        ESCAPE = re.compile('[\\x00-\\x1f\\\\"\\b\\f\\n\\r\\t]')
        ESCAPE_ASCII = re.compile('([\\\\"]|[^\\ -~])')
        ESCAPE_DCT = {'\x00': r'\u0000', '\x01': r'\u0001', '\x02': r'\u0002',...
        HAS_UTF8 = re.compile(b'[\x80-\xff]')
        INFINITY = inf
        i = 31
    
    1. We also have to exclude them sometimes in tests, like
      support.check__all__(self, inspect, not_exported=("k", "v", "mod_dict", "modulesbyfile"))

    I think that adding del X somewhere in these modules is a good thing:

    1. Not hard to backport
    2. Fixes multiple issues above
    3. Does not store useless objects in memory
    4. Does not confuse people
    5. Some modules already delete unused intermediate vars, so it is not something new to CPython, for example: multiprocessing.process
      del _MainProcess

    PR is on its way!

    @sobolevn sobolevn added 3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 28, 2022
    @AlexWaygood
    Copy link
    Member

    +1 for the proposed PR. Loop variables leaking into the global namespace creates an extra burden for typeshed when we have to wade through a long list of objects that our tests report are present at runtime but not in the typeshed stubs. It's not the end of the world, but it takes time, and is annoying.

    As Nikita explains, these leaky variables also make the output of help() much less useful.

    The fix isn't hard, and shouldn't, in my opinion, be particularly controversial.

    @terryjreedy
    Copy link
    Member

    I am opposed at this time. Leaving loop variables available is an intended feature of python.

    After reading point 1, I was tempted to say that you are making a fetish of typing or making the tail wag the dog. I mention this because others might have similar reactions. After reading points 2 and 3, I am much more favorable and would allow changes in idlelib if any were needed. A cleaner dir and help listing affects everyone.

    I changed the title to be more 'neutral'. 'Leak' is perjorative. "we need to remove these names" is a bit misleading as it implies total removal, which is not the proposal.

    As it is, the PR applies a style standard on the stdlib that is not in PEP-8. I recommend that you start by proposing an addition to PEP-8.
    "Unless X, global loop variables should be explicitly deleted as soon as not needed. Or use a comprehension." (I checked and 'loop' does not currently appear in PEP-8, and none of the 5 examples I checked could use a comprehension.)

    If you do, I recommend starting with dir and help, with typing third.

    You might post the idea on pydev and ask how much and what sort of discussion is needed.

    @terryjreedy terryjreedy changed the title Multiple modules leak for loop variables into module's namespace Delete module-level loop variables when no longer needed Jan 29, 2022
    @sobolevn
    Copy link
    Member Author

    Thanks for the better wording, Terry!

    Leaving loop variables available is an intended feature of python.

    Just to be clear: it sure is! But, sometimes we don't want to polute a global namespace with this variable. A common practice across CPython's source is to use del or comprehensions. That's exactly what my PR does: it finds places where people forgot to do that and adds a cleanup.

    As it is, the PR applies a style standard on the stdlib that is not in PEP-8. I recommend that you start by proposing an addition to PEP-8.

    Interesting idea! But, I think that this is an orthogonal non-blocking task, which might be much harder compared to this one :)

    I will add this to my backlog.

    You might post the idea on pydev and ask how much and what sort of discussion is needed.

    Good starting point, agreed!

    @AlexWaygood
    Copy link
    Member

    I agree that the typeshed issue is less important than the output of dir() and help(). I also agree that we shouldn't make a fetish of typing.

    However, I see the typeshed issue less as an issue specific to typing, and more as an example that illustrates a general problem third-party tools have. For example, having these temporary variables leak into global namespaces also makes IDE autocompletion less valuable, and makes life harder for tools that auto-generate documentation.

    Perhaps a PEP-8 revision might be warranted in due course — but in the meantime, are there any downsides to this proposed change? I believe we have pointed out several upsides.

    I don't see this as a style issue first and foremost: the PR is attempting to solve a genuine problem for some end-users of Python. It is not simply making cosmetic changes.

    @serhiy-storchaka
    Copy link
    Member

    New changeset 0cbdd21 by Nikita Sobolev in branch 'main':
    bpo-46565: del loop vars that are leaking into module namespaces (GH-30993)
    0cbdd21

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants