
Author larry
Recipients BTaskaya, Mark.Shannon, brandtbucher, brett.cannon, eric.snow, gvanrossum, larry, lemburg, nascheme, ronaldoussoren
Date 2021-08-28.05:14:39
Message-id <1630127679.66.0.551195032564.issue45020@roundup.psfhosted.org>
In-reply-to
Content
> What the two approaches have in common is that they require rebuilding the python binary whenever you edit any of the changed modules. I heard somewhere (I'm sorry, I honestly don't recall who said it first, possibly Eric himself) that Jeethu's approach was rejected because of that.

My dim recollection was that Jeethu's approach wasn't explicitly rejected; it was more that the community was "conflicted" rather than "strongly interested", so I lost interest, and nobody else followed up.


> I don't understand entirely why Jeethu's prototype had part written in C.

My theory: it's easier to serialize C objects from C.  It's maybe even slightly helpful?  But it made building a pain.  And, yeah, it just doesn't seem necessary.  The code generator will be tied to the C representation no matter how you do it, so you might as well write it in a nice high-level language.


> I never ran it so I don't know what the generated code looked like, [...]

You can see an example of Jeethu's serialized objects here:

https://raw.githubusercontent.com/python/cpython/267c93d61db9292921229fafd895b5ff9740b759/Python/frozenmodules.c

Yours is generally more readable because you're using the new named structure initializer (designated initializer) syntax.  Though Jeethu's code does use some symbolic constants (e.g. PyUnicode_1BYTE_KIND) where you're just printing the actual value.


> > With that in place, it'd be great to pre-cache all the .py files automatically read in at startup.
>
> *All* the .py files?  I think the binary bloat caused by deep-freezing the entire stdlib would be excessive.

I did say "all the .py files automatically read in at startup".  In current trunk, there are 32 modules in sys.modules at startup (when run non-interactively), and by my count 13 of those are written in Python.

If we go with Eric's approach, that means we'd turn those .pyc files into static data.  My quick experiment suggests that'd be less than 300 KB.  On my 64-bit Linux system, a default build of current trunk (configure && make -j) yields a 23 MB python executable and a 44 MB libpython3.11.a.  If I build without -g, they are 4.3 MB and 7 MB respectively.  So this speedup would add roughly another 2.5% to the size of a stripped build.
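
For what it's worth, a throwaway script along these lines reproduces both numbers on a given build (the module count above, and the rough size of the corresponding .pyc data).  It's only a ballpark measurement, not Eric's tooling, and the exact figures vary by version and platform:

# Ballpark measurement: how many modules sys.modules holds at startup, how many
# of those came from .py files, and how much cached .pyc data they account for.
import sys

startup = dict(sys.modules)  # snapshot before importing anything else

import importlib.util
import os

pure_python = []
total_pyc_bytes = 0
for name, mod in startup.items():
    source = getattr(mod, "__file__", None)
    if source and source.endswith(".py"):
        pure_python.append(name)
        pyc = importlib.util.cache_from_source(source)
        if os.path.exists(pyc):
            total_pyc_bytes += os.path.getsize(pyc)

print(f"{len(startup)} modules at startup, {len(pure_python)} loaded from .py files")
print(f"~{total_pyc_bytes // 1024} KiB of cached .pyc data for those modules")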

If even that 300 KB were a concern, the marshal approach would also permit us to compile all the deep-frozen modules into a separate shared library and unload it after we're done.

I don't know what the runtime impact of "deep-freeze" is, but it seems like it'd be pretty minimal.  You're essentially storing these objects in C static data instead of on the heap, which should perform about the same.  Maybe it'd mean the code objects for the module bodies would stick around longer than they otherwise would?  But that doesn't seem like it'd add much overhead.

It's interesting to think about applying these techniques to the entire standard library, but as you suggest, that would probably be wasteful.  On the other hand: if we made a viable tool that could consume some arbitrary set of .py files and produce a C file, which could then be compiled into a shared library, end users could enjoy this speedup for the subset of the standard library their program used, and perhaps even for their own source tree(s).
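
To make that concrete, here's a minimal sketch of what such a tool could look like, written in Python (per the earlier point about keeping the generator in a high-level language).  The names and output format (emit_frozen_c, the _frozen_<module> arrays) are invented for illustration; a real tool would also need to emit a lookup table, plus an import hook or interpreter support that finds the arrays and turns them back into code objects (e.g. via PyMarshal_ReadObjectFromString), which is where the "unload the shared library afterwards" idea would come in.

#!/usr/bin/env python3
# Illustrative sketch only; not the actual freeze/deep-freeze tooling.
# Compile a set of .py files and emit their marshalled code objects as C byte
# arrays; the resulting C file could then be compiled into a shared library.
import marshal
import sys
from pathlib import Path


def emit_frozen_c(py_files, out=sys.stdout):
    out.write('#include "Python.h"\n\n')
    entries = []
    for path in map(Path, py_files):
        # Turn the filename into a C-friendly identifier.
        name = path.stem.replace("-", "_").replace(".", "_")
        code = compile(path.read_text(encoding="utf-8"), str(path), "exec")
        data = marshal.dumps(code)
        entries.append((name, len(data)))
        out.write(f"static const unsigned char _frozen_{name}[] = {{\n")
        out.write(",".join(str(b) for b in data))
        out.write("\n};\n\n")
    # A real tool would also emit a table here (module name, array, size) that
    # the interpreter or an import hook could consult at startup.
    out.write("/* " + ", ".join(f"{n}: {s} bytes" for n, s in entries) + " */\n")


if __name__ == "__main__":
    emit_frozen_c(sys.argv[1:])

Usage would be something like "python3 tool.py mod1.py mod2.py > frozen_modules.c", with the output then compiled against the usual CPython headers.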