
classification
Title: compile("-"*3000000 + "4", '', mode) causes hard crash
Type: crash
Stage: resolved
Components: Parser
Versions: Python 3.11, Python 3.10, Python 3.9

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Dennis Sweeney, Kojoley, charles.mcmarrow.4, eric.smith, eric.snow, gvanrossum, lukasz.langa, lys.nikolaou, pablogsal, serhiy.storchaka, terry.reedy, xtreak
Priority: normal
Keywords: patch

Created on 2021-12-17 05:22 by charles.mcmarrow.4, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 30177 merged pablogsal, 2021-12-18 04:02
PR 30214 merged pablogsal, 2021-12-20 15:48
PR 30215 merged pablogsal, 2021-12-20 15:50
PR 30363 merged pablogsal, 2022-01-03 17:54
PR 30366 merged pablogsal, 2022-01-03 18:39
Messages (45)
msg408753 - (view) Author: Charles McMarrow (charles.mcmarrow.4) Date: 2021-12-17 05:22
`eval("-"*3000000 + "4")` in cmd causes hard crash
msg408757 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2021-12-17 08:02
In case it helps track this down. On my system I've tested these two setups:

On Windows, on the main branch, python just exits with no message when I run this from the REPL.

Also on Windows, with the Cygwin 3.8.12 version, I get MemoryError:

Python 3.8.12 (default, Nov 23 2021, 20:18:25)
[GCC 11.2.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> eval("-"*3000000 + "4")
s_push: parser stack overflow
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

Those are the only two systems I have available to test with.
msg408758 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) Date: 2021-12-17 08:32
This is the top of the stacktrace I got on Windows in Visual Studio:

 	ucrtbased.dll!00007ff9096d8ea8()	Unknown
 	ucrtbased.dll!00007ff9096d727c()	Unknown
 	ucrtbased.dll!00007ff9096d6f69()	Unknown
 	ucrtbased.dll!00007ff9096d70b3()	Unknown
 	ucrtbased.dll!00007ff9096d72e9()	Unknown
 	ucrtbased.dll!00007ff9096cef0c()	Unknown
 	ucrtbased.dll!00007ff9096cf446()	Unknown
>	python311_d.dll!tok_get(tok_state * tok, const char * * p_start, const char * * p_end) Line 1699	C
 	python311_d.dll!_PyTokenizer_Get(tok_state * tok, const char * * p_start, const char * * p_end) Line 2061	C
 	python311_d.dll!_PyPegen_fill_token(Parser * p) Line 223	C
 	python311_d.dll!_PyPegen_is_memoized(Parser * p, int type, void * pres) Line 309	C
 	python311_d.dll!factor_rule(Parser * p) Line 12622	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
 	python311_d.dll!factor_rule(Parser * p) Line 12682	C
        ...
msg408781 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2021-12-17 14:27
See also https://bugs.python.org/issue32758
msg408783 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2021-12-17 14:31
https://bugs.python.org/issue42609 too
msg408828 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-12-18 00:41
Windows, IDLE, 3.10.1: compile("-"*3000000 + "4", '', mode) crashes execution process for any of 'exec', 'eval', 'single'.

#42609 is also about 'too high' string multiplication with the new compiler, though the exact breaking point in the crash dumps seems different.
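
A minimal sketch for probing the crash described above without taking down the current process: run the compile() call in a child interpreter and look at its exit status. The helper name probe_compile is hypothetical. An exit status of 0 means compile() succeeded, a nonzero status means the child raised an exception or crashed, and on POSIX a negative status means the child was killed by a signal.

```python
import subprocess
import sys


def probe_compile(depth, mode="eval"):
    """Compile '-' * depth + '4' in a child interpreter and return its exit status."""
    code = f"compile('-' * {depth} + '4', '', {mode!r})"
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # The depth and the three modes are the ones reported in this issue.
    for mode in ("exec", "eval", "single"):
        print(mode, probe_compile(3_000_000, mode))
```

The printed statuses depend on the interpreter version and platform, so no expected output is given here.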
msg408829 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 02:04
Unfortunately, this is a stack overflow in the parser, which is very difficult to defend against.
msg408830 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 02:05
This is a known issue for recursive descent parsers. The only thing we can do here is somehow limit the maximum depth of the C stack, but that can:

* Slow down the parser.
* Limit valid expressions that otherwise won't segfault.
* Still not work on certain systems.
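
To illustrate the idea of limiting recursion depth in a recursive descent parser, here is a toy sketch in Python. It is not the CPython parser: the ToyParser class, the parse() helper, and the MAX_DEPTH value are all assumptions made for the example, and the grammar covers only `factor: '-' factor | NUMBER`. The depth counter is incremented on entry to the rule and decremented on exit, and exceeding the limit raises MemoryError instead of overflowing the stack.

```python
MAX_DEPTH = 200  # illustrative limit, kept well below Python's own recursion limit


class ToyParser:
    """Toy recursive descent parser for: factor: '-' factor | NUMBER."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.level = 0  # current nesting depth of factor() calls

    def factor(self):
        self.level += 1
        try:
            if self.level > MAX_DEPTH:
                raise MemoryError("toy parser: expression nested too deeply")
            if self.pos < len(self.text) and self.text[self.pos] == "-":
                self.pos += 1
                return -self.factor()
            start = self.pos
            while self.pos < len(self.text) and self.text[self.pos] in "0123456789":
                self.pos += 1
            if start == self.pos:
                raise SyntaxError(f"expected a number at offset {start}")
            return int(self.text[start:self.pos])
        finally:
            self.level -= 1  # always undo the increment, even on error


def parse(text):
    parser = ToyParser(text)
    value = parser.factor()
    if parser.pos != len(text):
        raise SyntaxError(f"unexpected input at offset {parser.pos}")
    return value
```

The sketch also shows the second drawback listed above: an input such as "-" * 1000 + "4" is rejected by the limit even though it is a valid expression.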
msg408831 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 02:10
In the past I worked on a refactor of the math-based rules (https://github.com/python/cpython/pull/20696/files) that could prevent **this** particular example, but other inputs could still make the parser crash with a stack overflow.
msg408833 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-18 02:38
I made a draft PR here:

https://github.com/python/cpython/pull/30177

to fix the issue. But we should benchmark and evaluate it before deciding anything.
msg408964 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-20 15:43
New changeset e9898bf153d26059261ffef11f7643ae991e2a4c by Pablo Galindo Salgado in branch 'main':
bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177)
https://github.com/python/cpython/commit/e9898bf153d26059261ffef11f7643ae991e2a4c
msg408965 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-20 16:23
New changeset dc73199a212a44553578eb4952631e5ba9e5f292 by Pablo Galindo Salgado in branch '3.10':
[3.10] bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) (GH-30214)
https://github.com/python/cpython/commit/dc73199a212a44553578eb4952631e5ba9e5f292
msg408967 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-20 17:18
New changeset e5cf31d3c2b30d12f238c6ab26d15855eefb2a8a by Pablo Galindo Salgado in branch '3.9':
[3.9] bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) (#30215)
https://github.com/python/cpython/commit/e5cf31d3c2b30d12f238c6ab26d15855eefb2a8a
msg409501 - (view) Author: Nikita Kniazev (Kojoley) * Date: 2022-01-02 15:51
> I made a draft PR here:
> 
> https://github.com/python/cpython/pull/30177
> 
> to fix the issue. But we should benchmark and evaluate it before deciding anything.

It seems that the PR was merged without discussion of the 85% regression in the python_startup benchmark. https://speed.python.org/timeline/?ben=python_startup https://speed.python.org/changes/?rev=e9898bf153
msg409520 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-02 20:27
>It seems that the PR was merged without discussion about 85% regression in python_startup benchmark

Ugh, that's quite bad. We measured the performance impact in general and that was quite acceptable, but it seems that startup is quite sensitive to this :(

Unfortunately, there aren't many other ways to do this that I can think of, so we need to decide what we care about most here, unless someone has a better idea of how to overcome the recursion problem.

Adding Guido and Eric as they have been working on startup quite a lot.
msg409522 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-02 20:30
Guido, Eric, what are your thoughts here?

The fix that I merged works by limiting the maximum recursion depth, but it seems that incrementing the recursion counter on every parser call has quite a large impact on startup.

Unfortunately if we revert the fix, we still have the problem that Python can segfault for certain inputs that overload the stack.
msg409526 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-02 21:18
Let me have a look. May take a day, okay?
msg409527 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-02 21:26
> Let me have a look. May take a day, okay?

Absolutely! There is no rush as the only close release IIRC is another alpha of 3.11.
msg409533 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-02 22:04
So if I understand the speed.python.org results correctly, the time to run `python -c pass` went way up, but the time for `python -S -c pass` did not go up significantly.

Unfortunately the only machine I have access to is a Mac, and I cannot repro this result, using PGO/LTO. Could it be a Linux thing? Or due to something in the venv for pyperformance?

Note that I am using a much simpler test script: Tools/scripts/startuptime.py. I have not yet succeeded in building and running pyperformance, mostly since the Python I build doesn't have SSL configured (I always forget the flag to use on my machine) and pyperformance insists on installing a bunch of stuff (including new versions of pip and setuptools, it seems).

Can anyone repro the perf regression on their box?
msg409535 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-02 22:49
Maybe it's Linux specific? I managed to run pyperformance and got this:

### python_startup ###
Mean +- std dev: 23.2 ms +- 0.8 ms -> 23.4 ms +- 1.2 ms: 1.01x slower
Not significant

Note, I am not dismissing the report -- in fact it looks quite bad. But I am failing to reproduce it, which makes it harder to understand the root cause. :-(

Maybe we can create a microbenchmark for this that just parses a large amount of code?

Anyway, here's a random thought about why this might have such a big impact. Look at this snippet (found all over the parser.c file):

    if (p->level++ == MAXSTACK) {
        p->error_indicator = 1;
        PyErr_NoMemory();
    }
    if (p->error_indicator) {
        p->level--;
        return NULL;
    }

This is two "unlikely" branches in a row, and the first sets the variable tested by the second. Maybe this causes the processor to stall?

Also, maybe it would be wiser to use ++X instead of X++? (Though a good compiler would just change X++ == Y into ++X == Y+1.)

Anyway, without a way to reproduce, there's not much that can be done.
msg409543 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-02 23:31
These are my results from running the pyperformance benchmark script directly (https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_python_startup/run_benchmark.py) with and without the fix (both PGO/LTO):

pablogsal@Obsidian-W:~$ ./cpython/python startup_benchmark.py --compare-to ./cpython_base/python
python_with_fix: ..................... 8.14 ms +- 0.17 ms
python_with_reverted_fix: ..................... 8.05 ms +- 0.16 ms

Mean +- std dev: [python_with_fix] 8.14 ms +- 0.16 ms -> [python_with_reverted_fix] 8.05 ms +- 0.17 ms: 1.01x faster
msg409544 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-02 23:35
I am not able to reproduce on Linux either, with pyperformance or manual testing in the CLI. 

Interestingly, this shows up on both machines:

https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=1&revs=50&equid=off&quarts=on&extr=on

https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=4&revs=50&equid=off&quarts=on&extr=on
msg409547 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2022-01-03 00:09
Does python_startup benchmark start with all modules parsed and __pycache__d, or with no cache, so it includes the normally one-time parse time?
msg409549 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 00:21
> Does python_startup benchmark start with all modules parsed and __pycache__d, or with no cache, so it includes the normally one-time parse time?

I don't know what pyperf does with the cache (adding Victor as maybe he knows).
msg409553 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 01:21
Ran pyperformance with PGO/LTO CPU-isol on my Linux box and I cannot reproduce either:

❯ pyperf compare_to json/* --table --table-format=md -G
| Benchmark      | 2021-12-20_10-23-master-6ca78affc802 | 2021-12-20_15-43-master-e9898bf153d2 |
|----------------|:------------------------------------:|:------------------------------------:|
| python_startup | 19.7 ms                              | 19.0 ms: 1.03x faster                |
msg409554 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-03 02:53
Maybe something unrelated changed on the benchmark machines? (Like installing a new version of pyperformance???) Though it happened across both benchmark machines. What configuration and flags are being used to run the benchmark suite on that machine?
msg409556 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 03:20
> Maybe something unrelated changed on the benchmark machines?

Very unlikely, it happened on two separate machines with different distributions and nothing was updated in the machines that I can see.

Both use a configuration file for pyperformance that looks like this:

-------------

$ cat bench.conf

[config]
json_dir = ~/json_cron

[scm]
repo_dir = ~/cpython_cron
update = True
remote = origin

[compile]
lto = True
pgo = True
bench_dir = ~/bench_tmpdir_cron

[run_benchmark]
system_tune = True
upload = False

[upload]
url = https://speed.python.org/
environment = speed-python
executable = lto-pgo
project = CPython

[compile_all]

[compile_all_revisions]
COMMIT_SHA=master
msg409593 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-03 15:42
I propose a test: revert the PR and see if speed.python.org shows a speedup back to the previous number.
msg409598 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 17:46
> I propose a test: revert the PR and see if speed.python.org shows a speedup back to the previous number.

Ok, let's do that and see what happens
msg409601 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 18:29
New changeset 9d35dedc5e7e940b639230fba93c763bd9f19c09 by Pablo Galindo Salgado in branch 'main':
Revert "bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177)" (GH-30363)
https://github.com/python/cpython/commit/9d35dedc5e7e940b639230fba93c763bd9f19c09
msg409606 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-03 18:54
I wrote a tiny script that calls compile() on raw bytes read from some source file, does this 100 times, and reports the total time. I tested the script with Lib/pydoc_data/topics.py (which happens to be the largest source file in the CPython repo, but mostly string literals) and with Lib/test/test_socket.py (the second-largest file).

I built python.exe on a Mac with PGO/LTO, from "make clean", both before and after (at) PR 30177. For both files, the difference between the results is well in the noise caused by my machine (I don't have a systematic way to stop background jobs). But it's very clear that this PR cannot have been the cause of an 85% jump in the time taken by the python_startup benchmark in PyPerformance.

For topics.py, the time was around 7.2 msec/compile; for test_socket.py, it was around 38. (I am not showing separate before/after numbers because the noise in my data really is embarrassing.)

The compilation speed comes down to ~170,000 lines/sec on my machine (an Intel Mac from 2019; 2.6 GHz 6-Core Intel Core i7 running macOS Big Sur 11.6.1; it has clang 12.0.5).

It must be something weird on the benchmark machines. I suspect that a new version of some package was installed in the venv shared by all the benchmarks (we are still using PyPerformance 1.0.2) and that affected something, perhaps through a .pth file?
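
A minimal sketch of the kind of timing script described above, assuming the simplest possible structure: read the raw bytes of a source file, call compile() on them a fixed number of times, and report the total. The function name time_compiles and the REPEAT constant are hypothetical, and this is not the script that was actually used.

```python
import sys
import time

REPEAT = 100  # number of compile() calls per file, as described above


def time_compiles(path, repeat=REPEAT):
    """Return the total wall-clock seconds spent compiling `path` `repeat` times."""
    with open(path, "rb") as f:
        source = f.read()  # raw bytes; compile() handles the decoding
    start = time.perf_counter()
    for _ in range(repeat):
        compile(source, path, "exec")
    return time.perf_counter() - start


if __name__ == "__main__":
    # Usage: python this_script.py Lib/pydoc_data/topics.py Lib/test/test_socket.py
    for path in sys.argv[1:]:
        total = time_compiles(path)
        print(f"{path}: {total * 1000 / REPEAT:.2f} msec/compile")
```

Running it with interpreters built before and after the change gives a direct comparison of parse and compile time that does not depend on the benchmark environment.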
msg409608 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 19:24
I ran the benchmark machines with the revert, and it seems that the timing doesn't go back to the previous value:

https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=1&revs=50&equid=off&quarts=on&extr=on

(check last data point for https://github.com/python/cpython/commit/9d35dedc5e7e940b639230fba93c763bd9f19c09).

So it seems that the difference in speed remains a mystery but is not due to this fix.
msg409612 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 19:52
https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=4&revs=50&equid=off&quarts=on&extr=on

also doesn't show any difference with the revert
msg409613 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 19:54
New changeset dd6c35761a4cd417e126a2d51dd0b89c8a30e5de by Pablo Galindo Salgado in branch 'main':
bpo-46110: Restore commit e9898bf153d26059261ffef11f7643ae991e2a4c
https://github.com/python/cpython/commit/dd6c35761a4cd417e126a2d51dd0b89c8a30e5de
msg409630 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2022-01-03 22:57
I ran all benchmarks on installed optimized framework builds of 3.9 with the change (-a) and with the revert (-revert). It shows no change:

❯ ./python3.9 -m pyperf compare_to /Volumes/RAMDisk/py39*
2to3: Mean +- std dev: [py39-a] 724 ms +- 6 ms -> [py39-revert] 722 ms +- 2 ms: 1.00x faster
fannkuch: Mean +- std dev: [py39-a] 1.26 sec +- 0.00 sec -> [py39-revert] 1.26 sec +- 0.00 sec: 1.00x faster
float: Mean +- std dev: [py39-a] 320 ms +- 3 ms -> [py39-revert] 319 ms +- 1 ms: 1.00x faster
go: Mean +- std dev: [py39-a] 726 ms +- 6 ms -> [py39-revert] 718 ms +- 4 ms: 1.01x faster
hexiom: Mean +- std dev: [py39-a] 28.3 ms +- 0.3 ms -> [py39-revert] 28.1 ms +- 0.3 ms: 1.00x faster
logging_format: Mean +- std dev: [py39-a] 22.5 us +- 0.3 us -> [py39-revert] 22.4 us +- 0.2 us: 1.00x faster
nqueens: Mean +- std dev: [py39-a] 274 ms +- 2 ms -> [py39-revert] 273 ms +- 2 ms: 1.00x faster
pickle_dict: Mean +- std dev: [py39-a] 57.4 us +- 0.6 us -> [py39-revert] 57.1 us +- 0.7 us: 1.01x faster
pickle_pure_python: Mean +- std dev: [py39-a] 1.32 ms +- 0.02 ms -> [py39-revert] 1.31 ms +- 0.02 ms: 1.01x faster
pidigits: Mean +- std dev: [py39-a] 619 ms +- 0 ms -> [py39-revert] 614 ms +- 0 ms: 1.01x faster
pyflate: Mean +- std dev: [py39-a] 2.02 sec +- 0.02 sec -> [py39-revert] 2.00 sec +- 0.01 sec: 1.01x faster
python_startup: Mean +- std dev: [py39-a] 26.3 ms +- 0.1 ms -> [py39-revert] 26.3 ms +- 0.1 ms: 1.00x slower
regex_dna: Mean +- std dev: [py39-a] 255 ms +- 2 ms -> [py39-revert] 250 ms +- 1 ms: 1.02x faster
regex_effbot: Mean +- std dev: [py39-a] 6.23 ms +- 0.04 ms -> [py39-revert] 6.18 ms +- 0.01 ms: 1.01x faster
regex_v8: Mean +- std dev: [py39-a] 43.5 ms +- 0.4 ms -> [py39-revert] 43.3 ms +- 0.1 ms: 1.01x faster
richards: Mean +- std dev: [py39-a] 228 ms +- 3 ms -> [py39-revert] 226 ms +- 3 ms: 1.01x faster
spectral_norm: Mean +- std dev: [py39-a] 430 ms +- 4 ms -> [py39-revert] 429 ms +- 3 ms: 1.00x faster
sympy_expand: Mean +- std dev: [py39-a] 1.25 sec +- 0.01 sec -> [py39-revert] 1.25 sec +- 0.01 sec: 1.00x slower
sympy_str: Mean +- std dev: [py39-a] 733 ms +- 7 ms -> [py39-revert] 729 ms +- 6 ms: 1.01x faster
telco: Mean +- std dev: [py39-a] 16.6 ms +- 0.2 ms -> [py39-revert] 16.5 ms +- 0.1 ms: 1.01x faster
unpack_sequence: Mean +- std dev: [py39-a] 238 ns +- 3 ns -> [py39-revert] 236 ns +- 2 ns: 1.01x faster
unpickle: Mean +- std dev: [py39-a] 41.3 us +- 0.5 us -> [py39-revert] 41.1 us +- 0.5 us: 1.01x faster
unpickle_list: Mean +- std dev: [py39-a] 12.5 us +- 0.1 us -> [py39-revert] 12.5 us +- 0.1 us: 1.01x slower

Benchmark hidden because not significant (35): chameleon, chaos, crypto_pyaes, deltablue, django_template, dulwich_log, json_dumps, json_loads, logging_silent, logging_simple, mako, meteor_contest, nbody, pathlib, pickle, pickle_list, python_startup_no_site, raytrace, regex_compile, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_integrate, sympy_sum, tornado_http, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_process

Geometric mean: 1.00x faster
msg409631 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2022-01-03 22:58
(that's on M1 Macbook Pro on macOS Monterey)
msg409635 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 23:07
I think at this point we can confidently say that is very very unlike that there is no actual regression.

What's going on with the performance servers is something I still cannot explain. I can at least confirm that the servers' system packages were not updated between these runs, but I cannot think of anything that could have influenced that change.

I propose to close this issue as we are clearly unable to reproduce said slowdown.
msg409638 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-03 23:22
"is very very unlike that there is no actual regression"

I presume you meant

"it is very very *likely* that there is no actual regression"

This shouldn't hold up releases, but (having spent months trying to improve startup speed) I would still like to get to the bottom of the speed.python.org regression.
msg409639 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-03 23:24
> I presume you mean
> "it is very very *likely* that there is no actual regression"

Yes, sorry, that's what I meant :)

> This shouldn't hold up releases

Cool, we will proceed with 3.9.10 and 3.11.0a3 tomorrow.
msg409705 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-04 19:57
I'm still wondering why speed.python.org showed such a slowdown, and how we can revert that. Here's a new theory.

Thanks to an investigation I did together with Eric, I now suspect that the release of setuptools 60.0.0 on Dec 19 is a smoking gun. PyPerformance (re)installs the latest versions of pip and setuptools.

Setuptools 60.0 makes a change in the file distutils-precedence.pth that causes it (by default) to import something called _distutils_hack and to call its add_shim() function. In previous setuptools this was off by default, but in 60.0 it switched to on. See https://github.com/pypa/setuptools/blob/main/CHANGES.rst#v6000

That add_shim() call in turn installs an extra entry in front of sys.meta_path, which does a bunch of extra work. See https://github.com/pypa/setuptools/blob/main/_distutils_hack/__init__.py

Pablo, can we change the PyPerformance configuration or the script that runs it to set and export SETUPTOOLS_USE_DISTUTILS=stdlib, to see whether that affects perf at all?

(Note that the code in distutils-precedence.pth is executed by site.py in addpackage().)
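
One way to check whether the shim is active in a given interpreter is to look for entries on sys.meta_path that come from the _distutils_hack module. This is a sketch under the assumption that the finder installed by add_shim() is defined in that module; the helper name distutils_hack_finders is hypothetical.

```python
import sys


def distutils_hack_finders():
    """Return the sys.meta_path entries whose defining module is _distutils_hack."""
    return [
        finder
        for finder in sys.meta_path
        if str(getattr(finder, "__module__", "")).startswith("_distutils_hack")
    ]


if __name__ == "__main__":
    print("_distutils_hack imported:", "_distutils_hack" in sys.modules)
    print("shim finders on sys.meta_path:", distutils_hack_finders())
```

An empty list together with `_distutils_hack` missing from sys.modules suggests that the .pth file did not run the shim in this environment.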
msg409710 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-04 20:32
More data. On my Mac, with SETUPTOOLS_USE_DISTUTILS=stdlib, using -X importtime I see the following extra modules being imported:

import time:       278 |        278 |         types
import time:       112 |        112 |           _operator
import time:       419 |        531 |         operator
import time:       129 |        129 |             itertools
import time:       325 |        325 |             keyword
import time:       468 |        468 |             reprlib
import time:       258 |        258 |             _collections
import time:       978 |       2156 |           collections
import time:        78 |         78 |           _functools
import time:       835 |       3068 |         functools
import time:      1359 |       5235 |       enum
import time:       138 |        138 |         _sre
import time:       497 |        497 |           sre_constants
import time:       528 |       1025 |         sre_parse
import time:       512 |       1674 |       sre_compile
import time:       109 |        109 |       _locale
import time:       886 |        886 |       copyreg
import time:       671 |       8574 |     re
import time:       471 |        471 |       warnings
import time:       330 |        801 |     importlib
import time:       906 |      10279 |   _distutils_hack

That's around 10 msec, so in the right ballpark.
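
A sketch of how this comparison could be automated, assuming the report format shown above (`import time: self | cumulative | name` on stderr). It runs `python -X importtime -c pass` twice, once with the inherited environment and once with SETUPTOOLS_USE_DISTUTILS=stdlib, and prints the modules that appear in only one of the two runs. The function names are hypothetical.

```python
import os
import subprocess
import sys

PREFIX = "import time:"


def parse_importtime(text):
    """Map module name to self time in microseconds from -X importtime output."""
    times = {}
    for line in text.splitlines():
        if not line.startswith(PREFIX):
            continue
        fields = [field.strip() for field in line[len(PREFIX):].split("|")]
        if len(fields) != 3:
            continue
        self_us, _cumulative, name = fields
        if self_us.isdigit():  # the header row has text here and is skipped
            times[name] = int(self_us)
    return times


def imported_modules(extra_env=None):
    """Run `python -X importtime -c pass` and return the parsed report."""
    env = dict(os.environ)
    env.update(extra_env or {})
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", "pass"],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
    return parse_importtime(proc.stderr)


if __name__ == "__main__":
    default = imported_modules()
    stdlib = imported_modules({"SETUPTOOLS_USE_DISTUTILS": "stdlib"})
    for name in sorted(set(default) ^ set(stdlib)):
        print(name, default.get(name, stdlib.get(name)))
```

Summing the self times of the modules that differ gives a rough estimate of the startup cost attributable to the setting.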
msg409718 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2022-01-04 23:00
I did one run of pyperformance manually forcing setuptools<60.0 and another with setuptools>=60.0, and I can reproduce the timing difference.

I assume we can therefore close this issue and maybe open another one to think about how to deal with the setuptools problem (maybe by reaching out to the package maintainers).
msg409737 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-05 09:02
If the issue is about how pyperformance creates its virtual environment (how setuptools is installed), I suggest continuing the discussion in the pyperformance tracker: https://github.com/python/pyperformance/issues/ ;-)
msg410119 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-08 21:31
To close the loop, setuptools 60.4.0 should fix this, see https://github.com/pypa/setuptools/pull/3009.
msg410132 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2022-01-09 04:47
Yup. All better: https://speed.python.org/timeline/#/?exe=12&ben=python_startup&env=1&revs=50&equid=off&quarts=on&extr=on
History
Date | User | Action | Args
2022-04-11 14:59:53 | admin | set | github: 90268
2022-01-10 11:42:32 | vstinner | set | nosy: - vstinner
2022-01-09 04:47:34 | gvanrossum | set | messages: + msg410132
2022-01-08 21:31:40 | gvanrossum | set | messages: + msg410119
2022-01-05 09:02:03 | vstinner | set | messages: + msg409737
2022-01-04 23:00:52 | pablogsal | set | status: open -> closed; messages: + msg409718; stage: patch review -> resolved
2022-01-04 20:32:23 | gvanrossum | set | messages: + msg409710
2022-01-04 19:57:42 | gvanrossum | set | messages: + msg409705
2022-01-03 23:24:36 | pablogsal | set | messages: + msg409639
2022-01-03 23:22:43 | gvanrossum | set | priority: release blocker -> normal; messages: + msg409638
2022-01-03 23:07:15 | pablogsal | set | messages: + msg409635
2022-01-03 22:58:06 | lukasz.langa | set | messages: + msg409631
2022-01-03 22:57:33 | lukasz.langa | set | messages: + msg409630
2022-01-03 19:54:14 | pablogsal | set | messages: + msg409613
2022-01-03 19:52:55 | pablogsal | set | messages: + msg409612
2022-01-03 19:24:49 | pablogsal | set | messages: + msg409608
2022-01-03 18:54:11 | gvanrossum | set | messages: + msg409606
2022-01-03 18:39:57 | pablogsal | set | pull_requests: + pull_request28579
2022-01-03 18:29:21 | pablogsal | set | messages: + msg409601
2022-01-03 17:54:37 | pablogsal | set | stage: resolved -> patch review; pull_requests: + pull_request28577
2022-01-03 17:46:13 | pablogsal | set | messages: + msg409598
2022-01-03 15:42:54 | gvanrossum | set | messages: + msg409593
2022-01-03 03:20:40 | pablogsal | set | messages: + msg409556
2022-01-03 02:53:22 | gvanrossum | set | messages: + msg409554
2022-01-03 01:21:01 | pablogsal | set | messages: + msg409553
2022-01-03 00:21:41 | pablogsal | set | nosy: + vstinner; messages: + msg409549
2022-01-03 00:09:21 | terry.reedy | set | messages: + msg409547
2022-01-02 23:35:06 | pablogsal | set | messages: + msg409544
2022-01-02 23:31:09 | pablogsal | set | messages: + msg409543
2022-01-02 22:49:21 | gvanrossum | set | messages: + msg409535
2022-01-02 22:04:39 | gvanrossum | set | messages: + msg409533
2022-01-02 21:26:53 | pablogsal | set | messages: + msg409527
2022-01-02 21:18:45 | gvanrossum | set | messages: + msg409526
2022-01-02 20:30:13 | pablogsal | set | messages: + msg409522
2022-01-02 20:28:19 | pablogsal | set | nosy: + lukasz.langa; versions: + Python 3.9, Python 3.11
2022-01-02 20:27:53 | pablogsal | set | status: closed -> open; priority: normal -> release blocker; nosy: + gvanrossum, eric.snow; messages: + msg409520
2022-01-02 15:51:25 | Kojoley | set | nosy: + Kojoley; messages: + msg409501
2021-12-20 17:18:20 | pablogsal | set | messages: + msg408967
2021-12-20 16:25:43 | pablogsal | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved
2021-12-20 16:23:48 | pablogsal | set | messages: + msg408965
2021-12-20 15:50:54 | pablogsal | set | pull_requests: + pull_request28436
2021-12-20 15:48:07 | pablogsal | set | pull_requests: + pull_request28435
2021-12-20 15:43:40 | pablogsal | set | messages: + msg408964
2021-12-18 04:02:54 | pablogsal | set | keywords: + patch; stage: patch review; pull_requests: + pull_request28395
2021-12-18 02:38:27 | pablogsal | set | messages: + msg408833
2021-12-18 02:10:24 | pablogsal | set | messages: + msg408831
2021-12-18 02:05:23 | pablogsal | set | messages: + msg408830
2021-12-18 02:04:17 | pablogsal | set | messages: + msg408829
2021-12-18 00:41:34 | terry.reedy | set | nosy: + terry.reedy, pablogsal; messages: + msg408828; title: `eval("-"*3000000 + "4")` in cmd causes hard crash -> compile("-"*3000000 + "4", '', mode) causes hard crash
2021-12-17 14:31:22 | xtreak | set | messages: + msg408783
2021-12-17 14:27:49 | xtreak | set | nosy: + serhiy.storchaka, xtreak; messages: + msg408781
2021-12-17 13:26:51 | pablogsal | set | nosy: - pablogsal
2021-12-17 08:32:06 | Dennis Sweeney | set | nosy: + Dennis Sweeney; messages: + msg408758
2021-12-17 08:02:13 | eric.smith | set | nosy: + eric.smith; messages: + msg408757
2021-12-17 05:22:07 | charles.mcmarrow.4 | create |