Message 398698 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	ncoghlan
Recipients	Nils Kattenbeck, bethard, ncoghlan, paul.j3, peter.otten, samuelmarks, tebeka, zertrin
Date	2021-08-01.14:13:07
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1627827187.58.0.478743647851.issue22240@roundup.psfhosted.org>
In-reply-to

Content
(Note: this is an old enough ticket that it started out with patch files on the tracker rather than PRs on GitHub. That's just a historical artifact, so feel free to open a PR as described in the devguide if you would prefer to work that way) There are two common ways of solving import bootstrapping problems that affect the build process: 1. Make the problematic import lazy, so it only affects the specific operation that needs the binary dependency; or 2. Wrap a try/except around the import, and do something reasonable in the failing case However, I think in this case it should be possible to avoid the zipfile dependency entirely, and instead make the determination based on the contents of __main__ and __main__.__spec__ using the suggestions I noted in an earlier comment (see https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec for the info the module spec makes available). This should also solve a problem I believe the current patch has, where I think directory execution will give the wrong result (returning "python -m __main__" as the answer instead of the path to the directory containing the __main__.py file). The three cases to cover come from https://docs.python.org/3/using/cmdline.html#interface-options: * if `__main__.__spec__` is None, the result should continue to be `_os.path.basename(_sys.argv[0])` (as it is in the patch) * if `__main__.__spec__.name` is exactly the string "__main__", `__main__.__spec__.parent` is the empty string, and `__main__.__spec__.has_location` is true, and sys.argv[0] is equal to `__main__.__spec__.origin` with the trailing `__main__.py` removed, then we have a directory or zipfile path execution case, and the result should be a Python path execution command using f'{py} {arg0} * otherwise, we have a `-m` invocation, using f'{py} -m {mod.__spec__.name.removesuffix(".__main__")}' The patch gets the three potential results right, but it gets the check for the second case wrong by looking specifically for zipfiles instead of looking at the contents of __main__.__spec__ and seeing if it refers to a __main__.py file located inside the sys.argv[0] path (which may be a directory, zipfile, or any other valid sys.path entry). For writing robust automated tests, I'd suggest either looking at test.support.script_helper for programmatic creation of executable zipfiles and directories ( https://github.com/python/cpython/blob/208a7e957b812ad3b3733791845447677a704f3e/Lib/test/support/script_helper.py#L209 ) or else factoring the logic out such that there is a helper function that receives "__main__.__spec__" and "sys.argv[0]" as parameters, allowing the test suite to easily control them. The latter approach would require some up front manual testing to ensure the behaviour of the different scenarios was being emulated correctly, but would execute a lot faster than actually running subprocesses would.

(Note: this is an old enough ticket that it started out with patch files on the tracker rather than PRs on GitHub. That's just a historical artifact, so feel free to open a PR as described in the devguide if you would prefer to work that way)

There are two common ways of solving import bootstrapping problems that affect the build process:

1. Make the problematic import lazy, so it only affects the specific operation that needs the binary dependency; or
2. Wrap a try/except around the import, and do something reasonable in the failing case

However, I think in this case it should be possible to avoid the zipfile dependency entirely, and instead make the determination based on the contents of __main__ and __main__.__spec__ using the suggestions I noted in an earlier comment (see https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec for the info the module spec makes available).

This should also solve a problem I believe the current patch has, where I think directory execution will give the wrong result (returning "python -m __main__" as the answer instead of the path to the directory containing the __main__.py file).

The three cases to cover come from https://docs.python.org/3/using/cmdline.html#interface-options:

* if `__main__.__spec__` is None, the result should continue to be `_os.path.basename(_sys.argv[0])` (as it is in the patch)
* if `__main__.__spec__.name` is exactly the string "__main__", `__main__.__spec__.parent` is the empty string, and `__main__.__spec__.has_location` is true, and sys.argv[0] is equal to `__main__.__spec__.origin` with the trailing `__main__.py` removed, then we have a directory or zipfile path execution case, and the result should be a Python path execution command using f'{py} {arg0}
* otherwise, we have a `-m` invocation, using f'{py} -m {mod.__spec__.name.removesuffix(".__main__")}'


The patch gets the three potential results right, but it gets the check for the second case wrong by looking specifically for zipfiles instead of looking at the contents of __main__.__spec__ and seeing if it refers to a __main__.py file located inside the sys.argv[0] path (which may be a directory, zipfile, or any other valid sys.path entry).

For writing robust automated tests, I'd suggest either looking at test.support.script_helper for programmatic creation of executable zipfiles and directories ( https://github.com/python/cpython/blob/208a7e957b812ad3b3733791845447677a704f3e/Lib/test/support/script_helper.py#L209 ) or else factoring the logic out such that there is a helper function that receives "__main__.__spec__" and "sys.argv[0]" as parameters, allowing the test suite to easily control them.

The latter approach would require some up front manual testing to ensure the behaviour of the different scenarios was being emulated correctly, but would execute a lot faster than actually running subprocesses would.

History
Date	User	Action	Args
2021-08-01 14:13:07	ncoghlan	set	recipients: + ncoghlan, tebeka, peter.otten, bethard, paul.j3, zertrin, samuelmarks, Nils Kattenbeck
2021-08-01 14:13:07	ncoghlan	set	messageid: <1627827187.58.0.478743647851.issue22240@roundup.psfhosted.org>
2021-08-01 14:13:07	ncoghlan	link	issue22240 messages
2021-08-01 14:13:07	ncoghlan	create