classification
Title: Useful (expensive) information is discarded in getpath.c.
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 3.11
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: corona10, eric.snow, gvanrossum, vstinner
Priority: normal
Keywords: patch

Created on 2021-09-15 17:19 by eric.snow, last changed 2021-10-14 20:48 by eric.snow. This issue is now closed.

Pull Requests
URL Status Linked
PR 28550 merged eric.snow, 2021-09-24 16:28
PR 28584 merged eric.snow, 2021-09-27 16:24
PR 28585 closed eric.snow, 2021-09-27 17:06
PR 28586 merged eric.snow, 2021-09-27 19:23
PR 28954 merged eric.snow, 2021-10-14 18:23
Messages (10)
msg401860 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-15 17:19
Currently we calculate a number of filesystem paths during runtime initialization in Modules/getpath.c (with the key goal of producing what will end up in sys.path).  Some of those paths are preserved and some are not.  In cases where the discarded data comes from filesystem access, we should preserve as much as possible.

The most notable info is the location of the stdlib source files.  We would store this as PyConfig.stdlib_dir (and _PyPathConfig.stdlib_dir).  We'd expose it with sys.stdlibdir (or sys.get_stdlib_dir() if we might need to calculate lazily), similar to sys.platlibdir, sys.home, and sys.prefix.

sys.stdlibdir would allow us to avoid filesystem access, for example:

* in site.py
* in sysconfig.py
* when detecting whether Python is running out of the source tree (needed for bpo-45020)
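
To illustrate the idea of reusing a value computed at startup instead of recomputing it, here is a minimal sketch. It assumes the interpreter may expose the remembered directory as a private attribute named sys._stdlib_dir, which can be missing or None; the public sys.stdlibdir proposed above is not assumed to exist. The helper name find_stdlib_dir is hypothetical.

```python
import sys
import sysconfig


def find_stdlib_dir():
    """Return the stdlib directory, preferring a value cached at startup."""
    # Assumption: a private attribute sys._stdlib_dir may hold the directory
    # remembered during runtime initialization. It may be absent or None.
    cached = getattr(sys, "_stdlib_dir", None)
    if cached:
        return cached
    # Fallback: ask sysconfig, which derives the path from the install scheme.
    return sysconfig.get_path("stdlib")


if __name__ == "__main__":
    print(find_stdlib_dir())
```

The fallback keeps the sketch usable on interpreters that do not preserve the directory, at the cost of the extra path computation this issue wants to avoid.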

FYI, I have a branch that mostly does what I'm suggesting here.
msg401867 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-09-15 17:55
Honestly I find it debatable whether we're doing anyone a favor by publishing the __file__ of the corresponding stdlib file for frozen modules. There will be situations where this points to the wrong file, and editing the file will not have an effect (unless you rebuild). I'd rather tell people to use -X frozen_modules=off until they can fix their dependency on the __file__ of frozen modules.
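
For code that depends on the __file__ of a stdlib module, a rough way to check whether a module would come from the frozen importer is to inspect its import spec. This is only a sketch; the helper names is_frozen and source_file_or_none are hypothetical, and it relies on the frozen importer reporting the origin string "frozen".

```python
import importlib.util


def is_frozen(name):
    """Report whether the named top-level module resolves to a frozen module."""
    spec = importlib.util.find_spec(name)
    # The frozen importer marks its specs with the origin string "frozen".
    return spec is not None and spec.origin == "frozen"


def source_file_or_none(name):
    """Return the file the module is loaded from, or None if it has no location."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    return spec.origin


if __name__ == "__main__":
    for mod in ("os", "json"):
        print(mod, is_frozen(mod), source_file_or_none(mod))
```

Running the interpreter with -X frozen_modules=off, as suggested above, should make the stdlib modules resolve to their source files instead.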
msg401900 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-15 20:05
Setting __file__ on frozen modules is only one example.  I actually need the stdlib dir to be preserved for other uses.

FWIW, I think you're probably right about __file__ on frozen modules.  That said, further discussion about __file__ (or __path__) is probably more suitable in bpo-21736.
msg401943 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-09-16 13:20
See also https://github.com/python/cpython/pull/23169 of bpo-42260 which rewrites getpath.c in pure Python.
msg402730 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-27 16:00
New changeset ae7839bbe817329dd015f9195da308a0f3fbd3e2 by Eric Snow in branch 'main':
bpo-45211: Move helpers from getpath.c to internal API. (gh-28550)
https://github.com/python/cpython/commit/ae7839bbe817329dd015f9195da308a0f3fbd3e2
msg402798 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-28 18:18
New changeset 0c50b8c0b8274d54d6b71ed7bd21057d3642f138 by Eric Snow in branch 'main':
bpo-45211: Remember the stdlib dir during startup. (gh-28586)
https://github.com/python/cpython/commit/0c50b8c0b8274d54d6b71ed7bd21057d3642f138
msg402814 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-28 22:50
I have what I need for now (stdlib dir).  There may be more info to preserve, but I'll leave it to others to pursue that.
msg402815 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-09-28 23:12
Shouldn't we just close the issue and the unused PR? Otherwise we'll just have yet another vague bpo issue that doesn't have anything particularly actionable -- "there's some code that could be refactored" is not enough of a reason to have a bpo issue open, unless it's either a recurring maintenance chore (doesn't seem to be the case here) or something that a beginner could easily scoop up (definitely not the case here, there are subtleties lurking around every corner).
msg402876 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-09-29 14:14
Yeah, I was thinking along those lines too but hesitated. :)
msg403944 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2021-10-14 20:48
New changeset 0bbea0723ee07f9d7ad9745f0e1875718ef38715 by Eric Snow in branch 'main':
bpo-45471: Do not set PyConfig.stdlib_dir in Py_SetPythonHome(). (gh-28954)
https://github.com/python/cpython/commit/0bbea0723ee07f9d7ad9745f0e1875718ef38715
History
Date                 User        Action  Args
2021-10-14 20:48:44  eric.snow   set     messages: + msg403944
2021-10-14 18:23:22  eric.snow   set     pull_requests: + pull_request27243
2021-09-29 14:14:04  eric.snow   set     status: open -> closed; resolution: fixed; messages: + msg402876; stage: patch review -> resolved
2021-09-28 23:12:15  gvanrossum  set     messages: + msg402815
2021-09-28 22:50:50  eric.snow   set     assignee: eric.snow -> ; messages: + msg402814
2021-09-28 18:18:41  eric.snow   set     messages: + msg402798
2021-09-27 19:23:47  eric.snow   set     pull_requests: + pull_request26968
2021-09-27 17:06:50  eric.snow   set     pull_requests: + pull_request26967
2021-09-27 16:24:06  eric.snow   set     pull_requests: + pull_request26966
2021-09-27 16:00:41  eric.snow   set     messages: + msg402730
2021-09-24 16:28:01  eric.snow   set     keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request26933
2021-09-16 13:20:17  vstinner    set     nosy: + vstinner; messages: + msg401943
2021-09-16 05:02:58  corona10    set     nosy: + corona10
2021-09-15 20:05:49  eric.snow   set     messages: + msg401900
2021-09-15 17:55:32  gvanrossum  set     nosy: + gvanrossum; messages: + msg401867
2021-09-15 17:19:45  eric.snow   create