classification
Title: Improve the __main__ module documentation
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.11, Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: cameron, docs@python, gvanrossum, iritkatriel, jack__d, lukasz.langa, maggyero, miss-islington, ncoghlan, steven.daprano, terry.reedy
Priority: normal
Keywords: patch

Created on 2020-01-25 14:00 by maggyero, last changed 2021-09-09 07:48 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description
less_prescriptive.diff jack__d, 2021-08-31 22:31
Pull Requests
URL Status Linked
PR 14487 closed maggyero, 2020-01-25 14:00
PR 26883 merged jack__d, 2021-06-23 19:25
PR 27932 merged miss-islington, 2021-08-24 17:01
Messages (17)
msg360682 Author: Géry (maggyero) * Date: 2020-01-25 14:00
This PR will apply the following changes on the [`__main__` module documentation](https://docs.python.org/3.7/library/__main__.html):

- replace the phrase "run as script" with "run from the file system" (as used in the [`runpy`](https://docs.python.org/3/library/runpy.html) documentation), since "run as script" does not mean the intended `python foo.py` but `python -m foo` (cf. [PEP 338](https://www.python.org/dev/peps/pep-0338/));
- replace the phrase "run with `-m`" with "run from the module namespace" (as used in the [`runpy`](https://docs.python.org/3/library/runpy.html) documentation), since the module can equivalently be run with `runpy.run_module('foo')` instead of `python -m foo`;
- make the block comment [PEP 8](https://www.python.org/dev/peps/pep-0008/#comments)-compliant (placed before the `if` block, capitalized, and ending with a period);
- add a missing case in which a package's `__main__.py` is executed (when the package is run from the file system: `python foo/`).
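
The last case can be checked with a minimal sketch. It assumes a throwaway package named `foo` (a hypothetical name, created in a temporary directory) and records the `__name__` that `foo/__main__.py` sees when the package is run by path and when it is run with `-m`:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_both_ways():
    """Run a throwaway package by directory path and with -m."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "foo"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text("print(__name__)\n")

        def run(*args):
            # Start a fresh interpreter in the directory that contains foo/.
            result = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return {
            "python foo/": run("foo"),
            "python -m foo": run("-m", "foo"),
        }


results = run_package_both_ways()
print(results)
```

In both invocations the code in `__main__.py` runs with `__name__` set to `"__main__"`.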
msg360695 Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-01-25 16:10
There are some serious problems with the PR.

You state that these two phrases are from the runpy documentation:

* "run from the module namespace"
* "run from the file system"

but neither of those phrases appear in the runpy documentation here:

https://docs.python.org/3/library/runpy.html

You also say:

> "run as script" does not mean the intended `python foo.py` 
> but `python -m foo`

but this is incorrect, and I think based on a misunderstanding of PEP 338. The title of PEP 338, "Executing modules as scripts", is not exclusive: the PEP is about the -m mechanism for *locating the module* in order to run it as a script. It doesn't imply that `python spam.py` should no longer be considered to be running a script.

In common parlance, "run as a script" certainly does include the case where you specify the module by filename `python spam.py` as well as the -m case where you specify it as a module name and let the interpreter locate the file. In other words, both

    python pathname/spam.py
    python -m spam

are correctly described as "running spam.py as a script" (and other variations). They differ in how the script is specified, but both mechanisms treat the spam.py file as a script and run it.
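
A small sketch of that point, assuming a throwaway `spam.py` written to a temporary directory: both invocations run the file with `__name__ == "__main__"`, and only the way the module was located differs, which shows up in `__spec__`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_spam_both_ways():
    """Run the same file by path and by module name."""
    with tempfile.TemporaryDirectory() as tmp:
        # Illustrative script: report __name__ and whether a module spec exists.
        (Path(tmp) / "spam.py").write_text("print(__name__, __spec__ is None)\n")

        def run(*args):
            result = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return {
            "python spam.py": run("spam.py"),
            "python -m spam": run("-m", "spam"),
        }


spam_results = run_spam_both_ways()
print(spam_results)
```

With a file name there is no module spec (`__spec__` is `None`); with `-m` the import system locates the module, so a spec is attached. The script itself is executed as `__main__` either way.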

See for example https://duckduckgo.com/?q=how+to+run+a+python+script for examples of common usage.

Consequently, it is simply wrong to say that the intended usage of "run a script" is the -m mechanism.

The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. `__main__` is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.

The PR introduces the phrase "when the module is run from the file system" to mean the case where a script is run using `python spam.py`, but it equally applies to the case of `python -m spam`. In both cases, spam is located somewhere in the file system.

(It is conceivable that -m could locate and run a built-in module, but I don't know any cases where that actually works. Even if it does, we surely don't need to complicate the docs for this corner case. It's enough to know that -m will locate the module and run it.)

The PR describes three cases: running from the file system, running from stdin, and running "from the module namespace" but that last one is a clumsy phrase which, it seems to me, is not correct. How do you run a module from its own namespace? Modules *are* a namespace, and we say code runs *in* a namespace, not "from" it.

In any case, it doesn't matter whether the script is specified on the command line as a file name, specified as a module name with -m, or double-clicked in a GUI: in all three cases the module's code is executed in the module's namespace.

So it is wrong to distinguish "from the file system" and "from (in) the module namespace" as two distinct cases. They are the same case.

The PR replaces the comment inside the `if` block:

    # execute only if run as a script

with a comment above the `if` statement:

    # Execute only if the module is not imported.

but the new comment is factually incorrect on two counts. Firstly, it is not correct that the `if` statement executes only if the module is not imported. There is no magic to the `if` statement. It always executes, regardless of whether the module is being run as a script or not. We can write code like this:

    if print("Hello, this always runs!") or __name__ == '__main__':
        # execute only if run as a script
        print('running as a script')
    else:
        # execute only if *not* run as a script
        print('not run as a script')

Placing the comment above the `if`, where it will apply to the entire `if` statement, is incorrect.

The second problem is that when running a module with -m it *is* imported. PEP 338 is clear about this:

"if -m is used to execute a module the PEP 302 import mechanisms are used to locate the module and retrieve its compiled code, before executing the module"

(in other words: import the module). We can test this, for example, if you create a package:

    spam/
    +-- __init__.py
    +-- eggs.py

and then run `python -m spam.eggs`, not only `__main__` (the eggs.py module) but also `spam` will be found in sys.modules. So the new comment is simply wrong.
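
That test can be sketched as follows, assuming the `spam` package above is created in a temporary directory and that `eggs.py` (illustrative content) reports which names are in `sys.modules`. Running the same file by path is included for contrast:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

EGGS_SOURCE = (
    "import sys\n"
    "print('__main__' in sys.modules, 'spam' in sys.modules)\n"
)


def check_sys_modules():
    """Compare sys.modules after `-m spam.eggs` and after running eggs.py by path."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "spam"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "eggs.py").write_text(EGGS_SOURCE)

        def run(*args):
            result = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return {
            "python -m spam.eggs": run("-m", "spam.eggs"),
            "python spam/eggs.py": run(str(Path("spam") / "eggs.py")),
        }


modules_seen = check_sys_modules()
print(modules_seen)
```

With `-m`, the parent package `spam` has been imported by the time `eggs.py` runs; when the file is run by path, it has not.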

There may be other issues with the PR.
msg377026 Author: Géry (maggyero) * Date: 2020-09-16 21:15
Thanks for your extended review Steven.

> You state that these two phrases are from the runpy documentation:
>
> * "run from the module namespace"
> * "run from the file system"
>
> but neither of those phrases appear in the runpy documentation here:
>
> https://docs.python.org/3/library/runpy.html

I agree. Actually the first paragraph of the page uses the phrases:

- "located using the module namespace";
- "located using the file system",

so instead of saying:

- "run a module located using the module namespace" to mean "python -m <module>";
- "run a module located using the file system" to mean "python <file>",

I simplified to:

- "run from the module namespace"
- "run from the file system"

But since the terminology is misleading I have used these phrases instead:

- `python`: "module initialized from an interactive prompt";
- `python < <file>`: "module initialized from standard input";
- `python <file>`: "module initialized from a file argument";
- `python -c <code>`: "module initialized from a `-c` argument";
- `python -m <module>`: "module initialized from a `-m` argument";
- `import <module>`: "module initialized from an import statement".

What the documentation tries to explain is that in all of these cases except the last one, code is executed in the __main__ module.
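
A minimal sketch of three of these cases (the interactive prompt is left out because it needs a terminal), assuming a hypothetical module `greeter.py` that is used only for the import case:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def observed_names():
    """Return the __name__ printed for three of the start-up modes."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module, imported rather than run.
        (Path(tmp) / "greeter.py").write_text("print(__name__)\n")

        def run(args, stdin=None):
            result = subprocess.run(
                [sys.executable, *args],
                input=stdin, cwd=tmp,
                capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return {
            "python -c <code>": run(["-c", "print(__name__)"]),
            "python < <file>": run([], stdin="print(__name__)\n"),
            "import <module>": run(["-c", "import greeter"]),
        }


names = observed_names()
print(names)
```

Only the imported module sees its own name; the code passed with `-c` or on standard input runs in `__main__`.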

I have updated the PR.

----

> The PR changes the term "scope" to "environment", but I think that is wrong. An environment is potentially greater than a scope. `__main__` is a module namespace, hence a scope. The environment includes things outside of that scope, such as the builtins, environment variables, the current working directory, the python path, etc. We don't talk about modules being an environment, but as making up a scope.

I disagree. According to Wikipedia (https://en.wikipedia.org/wiki/Scope_(computer_science)), the term "scope" is the part of a program where a name binding is valid, while the term "environment" (synonym of "context") is the set of name bindings that are valid within a part of a program. Therefore "scope" is a property of a name binding (a name binding has a scope), and "environment" is a property of a part of a program (a part of a program has an environment).
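
To make the two terms concrete, here is a small sketch (the module name `demo` and its contents are illustrative only). The module's own namespace holds just the names bound in the module, while code running there can also resolve names such as `len` that live outside it, in the builtins:

```python
import builtins
import types

# Illustrative module object; its __dict__ is the module namespace.
demo = types.ModuleType("demo")
source = "greeting = 'hello'\nlength = len(greeting)\n"
exec(source, vars(demo))

# Names bound by the module's own code (dunder attributes filtered out).
own_names = {name for name in vars(demo) if not name.startswith("__")}

# `len` was usable inside the module even though the module never bound it.
uses_builtin_len = "len" not in vars(demo) and hasattr(builtins, "len")

print(sorted(own_names), demo.length, uses_builtin_len)
```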

And the term "environment" is actually already used in the original title and synopsis of the document (and it is correct):

> :mod:`__main__` --- Top-level script environment

> .. module:: __main__
>     :synopsis: The environment where the top-level script is run.

So my change to the body fixes the inconsistent and incorrect usage of "scope":

- ``'__main__'`` is the name of the scope in which top-level code executes.
+ ``'__main__'`` is the name of the environment where top-level code is run.

- A module can discover whether or not it is running in the main scope
+ A module can discover whether or not it is running in the main environment

----

> Placing the comment above the `if`, where it will apply to the entire `if` statement, is incorrect.

I agree. Sometimes you see comments before if statements but they usually don't start with "execute".

I have updated the PR.

----

> The second problem is that when running a module with -m it *is* imported. PEP 338 is clear about this:

I agree. I should have said "when the module is not initialized from an import statement".

But note that even before my change the original document already used the phrase "not imported":

- executing code in a module when it is run as a script or with ``python
- -m`` but not when it is imported::
+ executing code in a module when it is not imported::

- # execute only if run as a script
+ # Execute only if the module is not imported.

I have updated the PR.
msg377035 Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-09-17 07:18
The main issue I have with the existing doc is its use of 'top-level' to mean the main, initial, startup module that first executes the user code for a python 'program'.  We routinely use 'top-level' instead for the global scope of a module.  Example: https://docs.python.org/3/glossary.html, 'qualified name' entry, line 2: "For top-level functions and classes, ..."  Within '__main__', some code is top-level, but class and function bodies are not.
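
The glossary sense of "top-level" can be seen in `__qualname__`. In this sketch, which uses illustrative names and assumes the code sits at module level, only the function and class defined directly in the module have an undotted qualified name:

```python
def top_level():
    def nested():
        pass
    return nested


class Container:
    def method(self):
        pass


qualnames = [
    top_level.__qualname__,         # defined at the top level of the module
    top_level().__qualname__,       # defined inside a function body
    Container.method.__qualname__,  # defined inside a class body
]
print(qualnames)
```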

But this does not have to be part of this PR.
msg377050 Author: Géry (maggyero) * Date: 2020-09-17 09:43
I agree with you Terry. Another thing that bothers me: in the current document, the __main__ module is reduced to its environment (aka context or dictionary), whereas a module object has other important attributes such as its code.

So how about adding the following changes?

- :mod:`__main__` --- Top-level code environment
- ==============================================
+ :mod:`__main__` --- Startup module
+ ==================================

-    :synopsis: The environment where top-level code is run.
+    :synopsis: The first module from which the code is executed at startup.

- ``'__main__'`` is the name of the environment where top-level code is run.
+ ``'__main__'`` is the name of the startup module.

- A module can discover whether or not it is running in the main environment
+ A module can discover whether or not it is initialized as the :mod:`__main__` module
msg396157 Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-06-20 00:00
See also Issue24632 and Issue17359.
msg396443 Author: Jack DeVries (jack__d) * Date: 2021-06-23 19:28
Hi All,

As I wrote on the PR::

    I am picking up the torch on 39452, continuing where @maggyero left 
    off, and also implementing my discourse proposal, which seemed to be 
    well-liked.

Feel free to leave any feedback for me on the GitHub PR; I'm looking forward to continuing to develop this work based on community feedback.
msg399348 Author: Jack DeVries (jack__d) * Date: 2021-08-10 17:48
Hi All,

I'm pinging everyone here on the bpo because my GitHub PR has been through a lot of revision and review. Maybe it's close to being ready to merge (I hope)!

Feel free to take a look if you are interested: https://github.com/python/cpython/pull/26883
msg400219 Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-08-24 17:01
New changeset 7cba23164cf82f6619db002cd30021b5dfb1f809 by Jack DeVries in branch 'main':
bpo-39452: Rewrite and expand __main__.rst (#26883)
https://github.com/python/cpython/commit/7cba23164cf82f6619db002cd30021b5dfb1f809
msg400238 Author: miss-islington (miss-islington) Date: 2021-08-24 20:54
New changeset ec5a03168f02ef92f98a94796bc6378fc73622e8 by Miss Islington (bot) in branch '3.10':
bpo-39452: Rewrite and expand __main__.rst (GH-26883)
https://github.com/python/cpython/commit/ec5a03168f02ef92f98a94796bc6378fc73622e8
msg400239 Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2021-08-24 20:55
Thanks a lot, Géry and Jack! ✨ 🍰 ✨
msg400671 Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-08-30 21:31
Thanks, the rewrite is great!

I have one nit: did you consider which of these two idioms is better?

    if __name__ == "__main__":
        main()

vs.

    if __name__ == "__main__":
        sys.exit(main())

Your docs seem to promote the second, whereas I've usually preferred the former. Was this a considered choice on your part?
msg400781 Author: Géry (maggyero) * Date: 2021-08-31 21:31
@jack__d

Thanks for the rewrite! This is a great expansion. Unfortunately I didn’t have the time to review it before the merge. If I find something to be improved I will let you know.

@gvanrossum

> Your docs seem to promote the second, whereas I've usually preferred the former.

Are you sure? Yet in your 2003 blog post [*Python main() functions*](https://www.artima.com/weblogs/viewpost.jsp?thread=4829) you promoted the opposite idiom `if __name__ == "__main__": sys.exit(main())` over the idiom `if __name__ == "__main__": main()`:

> Now the `sys.exit()` calls are annoying: when `main()` calls `sys.exit()`, your interactive Python interpreter will exit! The remedy is to let `main()`'s return value specify the exit status.

I am interested in the rationale if you changed your mind.
msg400782 Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-08-31 21:40
You're right, I'm being inconsistent. :-(  I withdraw my objection.

There are cases where sys.exit() is easier than returning an exit code, e.g. when the error is discovered deep inside some other code. But it's probably better to raise a dedicated exception in that case and catch it in main(), rather than just calling sys.exit() deep inside the other code. It's probably too fine a point for a tutorial. Sorry!
msg400791 Author: Jack DeVries (jack__d) * Date: 2021-08-31 22:31
> Your docs seem to promote the second, whereas I've usually preferred the
> former. Was this a considered choice on your part?

First and foremost, stupid GitHub is not letting the permalink load for some
reason, but yes; this was discussed in the conversation with @graingert on
June 29th (it was his suggestion). Later, @pradyunsg from PyPA added some
suggestions about how the document described console script entrypoints,
and the documentation around this issue changed a bit again.

As for my own perspective, I never use the sys.exit idiom
myself. After all, an exception is going to cause a non-zero exit code, and a
traceback is always going to have a lot more value than an exit code.
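
For reference, a small sketch of how the interpreter's exit status behaves in these cases. The one-line programs are illustrative only and are run in fresh interpreters:

```python
import subprocess
import sys


def exit_status(source):
    """Run `source` with `python -c` and return the process exit status."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,  # keep tracebacks and messages out of our output
    )
    return completed.returncode


statuses = {
    "uncaught exception": exit_status("raise RuntimeError('boom')"),
    "sys.exit(None)": exit_status("import sys; sys.exit(None)"),
    "sys.exit(3)": exit_status("import sys; sys.exit(3)"),
    "sys.exit('failed')": exit_status("import sys; sys.exit('failed')"),
}
print(statuses)
```

An uncaught exception and a string argument to `sys.exit` both produce status 1, `None` produces 0, and an integer is passed through unchanged. This is why `sys.exit(main())` behaves sensibly even when `main()` simply returns `None`.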

I was, however, surprised to learn how pip treats console script entry points
in the course of working on this document. Specifically, it generates an
executable script that does wrap the function in sys.exit. I definitely think
that the way the document communicates this fact while teaching the idiom is a
good thing, so I think that whole "Idiomatic Usage" section is good.

I do think we can tweak the document slightly to make it less prescriptive,
though, because in reality a lot of people _don't_ use this idiom, so
presenting it as a de-facto standard is misleading. Plus, it's not
Pythonic to dole out prescriptive boilerplate.

I attached a diff that steers in that direction. What do you all think? It is
a pretty slight change, but I think it better strikes a balance.
msg400793 Author: Géry (maggyero) * Date: 2021-08-31 22:55
No worries, it was almost twenty years ago.

> But it's probably better to raise a dedicated exception in that case and catch it in main(), rather than just calling sys.exit() deep inside the other code.

Yes I agree, and I think you explained very clearly why it is better in the blog post:

> Another refinement is to define a Usage() exception, which we catch in an except clause at the end of main():
> […]
> This gives the main() function a single exit point, which is preferable over multiple return 2 statements.

So I think you made two independent points:

- raising a dedicated exception instead of calling `sys.exit` inside nested functions and catching it inside `main` allows a single exit point;
- calling `sys.exit` outside of `main` instead of inside prevents exiting the Python interpreter in an interactive session.
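
The two points can be combined in one sketch. The names `UsageError` and `parse_count` are hypothetical, chosen for illustration, and the guard is shown as a comment so that the module can be imported or pasted into an interactive session without exiting:

```python
import sys


class UsageError(Exception):
    """Hypothetical dedicated exception for command-line mistakes."""


def parse_count(text):
    # A helper "deep inside" the program: it raises instead of calling sys.exit().
    if not text.isdigit():
        raise UsageError(f"expected a non-negative integer, got {text!r}")
    return int(text)


def main(argv=None):
    """Return an exit status instead of exiting."""
    argv = sys.argv[1:] if argv is None else argv
    try:
        if len(argv) != 1:
            raise UsageError("expected exactly one argument")
        count = parse_count(argv[0])
    except UsageError as err:
        # Single place where usage errors become an exit status.
        print(f"usage error: {err}", file=sys.stderr)
        return 2
    print("ok " * count)
    return 0


# In a script, the guard is then the only place that exits:
#
#     if __name__ == "__main__":
#         sys.exit(main())
```

Because `main()` only returns a status, calling `main(["3"])` from an interactive session prints its output and hands back `0` without terminating the interpreter.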
msg401441 Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2021-09-09 07:48
These changes are excellent - thanks for the patch!

Something even the updated version doesn't cover yet is directory and zipfile execution, so I filed bpo-45149 as a follow up ticket for that (the info does exist elsewhere in the documentation, so it's mostly just a matter of adding it to the newly expanded page, and deciding what new cross-references, if any, would be appropriate)
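
As a rough illustration of the zipfile case (the archive name `app.zip` is an assumption, and the archive is built in a temporary directory), an archive with a `__main__.py` at its root can be passed to the interpreter directly:

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def run_zip_app():
    """Build a one-file zip archive and execute it with the interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "app.zip"  # hypothetical archive name
        with zipfile.ZipFile(archive, "w") as zf:
            # The interpreter looks for __main__.py at the root of the archive.
            zf.writestr("__main__.py", "print(__name__)\n")
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


zip_output = run_zip_app()
print(zip_output)
```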
History
Date User Action Args
2021-09-09 07:48:17 ncoghlan set nosy: + ncoghlan; messages: + msg401441
2021-08-31 22:55:18 maggyero set messages: + msg400793
2021-08-31 22:31:41 jack__d set files: + less_prescriptive.diff; messages: + msg400791
2021-08-31 21:40:03 gvanrossum set messages: + msg400782
2021-08-31 21:31:37 maggyero set messages: + msg400781
2021-08-30 21:31:56 gvanrossum set nosy: + gvanrossum; messages: + msg400671
2021-08-24 20:55:28 lukasz.langa set status: open -> closed; versions: - Python 3.9; messages: + msg400239; resolution: fixed; stage: patch review -> resolved
2021-08-24 20:54:18 miss-islington set messages: + msg400238
2021-08-24 17:01:52 miss-islington set nosy: + miss-islington; pull_requests: + pull_request26381
2021-08-24 17:01:49 lukasz.langa set nosy: + lukasz.langa; messages: + msg400219
2021-08-10 17:48:35 jack__d set messages: + msg399348
2021-06-23 19:33:32 zach.ware set versions: - Python 3.6, Python 3.7, Python 3.8
2021-06-23 19:28:36 jack__d set versions: + Python 3.6, Python 3.7, Python 3.8, Python 3.9, Python 3.10
2021-06-23 19:28:25 jack__d set messages: + msg396443
2021-06-23 19:25:39 jack__d set keywords: + patch; nosy: + jack__d; pull_requests: + pull_request25460; stage: patch review
2021-06-21 04:12:00 cameron set nosy: + cameron
2021-06-20 00:00:48 iritkatriel set versions: + Python 3.11, - Python 3.8
2021-06-20 00:00:25 iritkatriel set nosy: + iritkatriel; messages: + msg396157
2020-09-17 09:43:10 maggyero set messages: + msg377050
2020-09-17 07:18:50 terry.reedy set nosy: + terry.reedy; messages: + msg377035
2020-09-16 21:15:26 maggyero set messages: + msg377026
2020-01-25 16:10:06 steven.daprano set nosy: + steven.daprano; messages: + msg360695
2020-01-25 14:00:07 maggyero create