This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Delete module-level loop variables when no longer needed
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: AlexWaygood, serhiy.storchaka, sobolevn, terry.reedy, zach.ware
Priority: normal Keywords: patch

Created on 2022-01-28 16:47 by sobolevn, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 30993 merged sobolevn, 2022-01-28 16:49
Messages (6)
msg412006 - (view) Author: Nikita Sobolev (sobolevn) * (Python triager) Date: 2022-01-28 16:47
Some variables created as `for X in ...` leak into module's namespace, where the loop is defined.

I wrote a simple `flake8` plugin to find names that are used in `ast.Module` in `ast.For`, but not under `if __name__ == '__main__'` and are not used in `del` afterwards.

Here's what I got:

- Lib/inspect.py:157
- Lib/locale.py:746
- Lib/sysconfig.py:186
- Lib/tokenize.py:141 - 151
- Lib/multiprocessing/process.py:427
- Lib/multiprocessing/managers.py:55 
- Lib/json/encoder.py:30
- Lib/http/cookiejar.py:93 
- Lib/email/contentmanager.py:73
- Lib/email/contentmanager.py:79
- Lib/email/contentmanager.py:247
- Lib/email/quoprimime.py:60
- Lib/email/quoprimime.py:149
- Lib/_compat_pickle
- Lib/lib2to3/pgen2/grammar.py

I think, that we need to remove these names. Why?
1. They complicate typeshed typing, we have to annotate them in typeshed, or write custom ignore rules for our test suite. Ref: https://github.com/python/typeshed/blob/56aa2088aada530400b6fdddf0f1d17ca3aaa86f/tests/stubtest_allowlists/py3_common.txt#L448

2. They are in `dir()`, example:

```
>>> import inspect
>>> 'k' in dir(inspect)
True
```

3. They are listed in `help()`, let's use `json.encoder` as an example:

```
DATA
    ESCAPE = re.compile('[\\x00-\\x1f\\\\"\\b\\f\\n\\r\\t]')
    ESCAPE_ASCII = re.compile('([\\\\"]|[^\\ -~])')
    ESCAPE_DCT = {'\x00': r'\u0000', '\x01': r'\u0001', '\x02': r'\u0002',...
    HAS_UTF8 = re.compile(b'[\x80-\xff]')
    INFINITY = inf
    i = 31
```

4. We also have to exclude them sometimes in tests, like https://github.com/python/cpython/blob/db77bcd6092f3c174ae855522411ab83854d65a8/Lib/test/test_inspect.py#L111


I think that adding `del X` somewhere in these modules is a good thing:
1. Not hard to backport
2. Fixes multiple issues above
3. Does not store useless objects in memory
4. Does not confuse people
5. Some modules already delete unused intermediate vars, so it is not something new to CPython, for example: `multiprocessing.process` https://github.com/python/cpython/blob/db77bcd6092f3c174ae855522411ab83854d65a8/Lib/multiprocessing/process.py#L419

PR is on its way!
msg412017 - (view) Author: Alex Waygood (AlexWaygood) * (Python triager) Date: 2022-01-28 17:05
+1 for the proposed PR. Loop variables leaking into the global namespace creates an extra burden for typeshed when we have to wade through a long list of objects that our tests report are present at runtime but not in the typeshed stubs. It's not the end of the world, but it takes time, and is annoying.

As Nikita explains, these leaky variables also make the output of `help()` much less useful.

The fix isn't hard, and shouldn't, in my opinion, be particularly controversial.
msg412056 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2022-01-29 06:50
I am opposed at this time.  Leaving loop variables available is an intended feature of python.

After reading point 1, I was tempted to say that you are making a fetish of typing or making the tail wag the dog. I mention this because others might have similar reactions.  After reading points 2 and 3, I am much more favorable and would allow changes in idlelib if any were needed.  A cleaner dir and help listing affects everyone.

I changed the title to be more 'neutral'.  'Leak' is perjorative.  "we need to remove these names" is a bit misleading as it implies total removal, which is not the proposal.

As it is, the PR applies a style standard on the stdlib that is not in PEP 8.  I recommend that you start by proposing an addition to PEP-8. 
 "Unless X, global loop variables should be explicitly deleted as soon as not needed.  Or use a comprehension."  (I checked and 'loop' does not currently appear in PEP-8, and none of the 5 examples I checked could use a comprehension.)

If you do, I recommend starting with dir and help, with typing third.

You might post the idea on pydev and ask how much and what sort of discussion is needed.
msg412059 - (view) Author: Nikita Sobolev (sobolevn) * (Python triager) Date: 2022-01-29 08:37
Thanks for the better wording, Terry!

> Leaving loop variables available is an intended feature of python.

Just to be clear: it sure is! But, sometimes we don't want to polute a global namespace with this variable. A common practice across CPython's source is to use `del` or comprehensions. That's exactly what my PR does: it finds places where people forgot to do that and adds a cleanup.

>  As it is, the PR applies a style standard on the stdlib that is not in PEP 8.  I recommend that you start by proposing an addition to PEP-8.

Interesting idea! But, I think that this is an orthogonal non-blocking task, which might be much harder compared to this one :) 

I will add this to my backlog.

> You might post the idea on pydev and ask how much and what sort of discussion is needed.

Good starting point, agreed!
msg412065 - (view) Author: Alex Waygood (AlexWaygood) * (Python triager) Date: 2022-01-29 09:29
I agree that the typeshed issue is less important than the output of dir() and help(). I also agree that we shouldn't make a fetish of typing.

However, I see the typeshed issue less as an issue specific to typing, and more as an example that illustrates a general problem third-party tools have. For example, having these temporary variables leak into global namespaces also makes IDE autocompletion less valuable, and makes life harder for tools that auto-generate documentation.

Perhaps a PEP8 revision might be warranted in due course — but in the meantime, are there any downsides to this proposed change? I believe we have pointed out several upsides.

I don't see this as a style issue first and foremost: the PR is attempting to solve a genuine problem for some end-users of Python. It is not simply making cosmetic changes.
msg412429 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2022-02-03 09:20
New changeset 0cbdd2131195b0d313762968f604e80a3e65ca9f by Nikita Sobolev in branch 'main':
bpo-46565: `del` loop vars that are leaking into module namespaces (GH-30993)
https://github.com/python/cpython/commit/0cbdd2131195b0d313762968f604e80a3e65ca9f
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90723
2022-02-03 09:20:22serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg412429
2022-01-29 09:29:08AlexWaygoodsetmessages: + msg412065
2022-01-29 08:37:14sobolevnsetmessages: + msg412059
2022-01-29 06:50:25terry.reedysetnosy: + terry.reedy

messages: + msg412056
title: Multiple modules leak `for` loop variables into module's namespace -> Delete module-level loop variables when no longer needed
2022-01-28 17:05:59AlexWaygoodsetmessages: + msg412017
2022-01-28 16:54:02zach.waresetnosy: + zach.ware
2022-01-28 16:49:49sobolevnsetkeywords: + patch
stage: patch review
pull_requests: + pull_request29173
2022-01-28 16:49:25sobolevnsetnosy: + AlexWaygood
2022-01-28 16:47:31sobolevncreate