msg410936 - (view) |
Author: Kumar Aditya (kumaraditya) * |
Date: 2022-01-19 11:16 |
Interns strings in deep-frozen modules.
See https://github.com/faster-cpython/ideas/issues/218
|
msg411003 - (view) |
Author: Kumar Aditya (kumaraditya) * |
Date: 2022-01-20 05:47 |
This speeds up comparison of strings by just comparing their pointers so it is much faster.
|
msg412920 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-09 16:52 |
New changeset c0a5ebeb1239020f2ecc199053bb1a70d78841a1 by Kumar Aditya in branch 'main':
bpo-46430: Intern strings in deep-frozen modules (GH-30683)
https://github.com/python/cpython/commit/c0a5ebeb1239020f2ecc199053bb1a70d78841a1
|
msg412922 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2022-02-09 17:05 |
I noticed that the new interning code lacks proper error reporting. There are only asserts. _PyCode_New() checks the return value of intern_strings.
|
msg412937 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-09 18:17 |
We discussed that and found that a lot of errors are ignored during interning anyway.
But it's not too late to change if you want to (sending a PR would probably be quicker than arguing :-).
|
msg412983 - (view) |
Author: miss-islington (miss-islington) |
Date: 2022-02-10 09:08 |
New changeset dee020a6f5bf29f95bec6294da9bcd577114f592 by Nikita Sobolev in branch 'main':
Fix sphinx-lint after #31097 and b878b3a (GH-31248)
https://github.com/python/cpython/commit/dee020a6f5bf29f95bec6294da9bcd577114f592
|
msg413002 - (view) |
Author: Kumar Aditya (kumaraditya) * |
Date: 2022-02-10 14:11 |
I consider this done so closing it as improving the error handling of interning it out of scope of this issue.
|
msg413004 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2022-02-10 14:18 |
Please leave the ticket open until we have an agreement how to handle the missing error checks.
|
msg413822 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 17:07 |
> New changeset c0a5ebeb1239020f2ecc199053bb1a70d78841a1 by Kumar Aditya in branch 'main':
> bpo-46430: Intern strings in deep-frozen modules (GH-30683)
This change introduced a memory leak at Python exit.
Before:
$ ./python -X showrefcount -c pass
[0 refs, 0 blocks]
After:
$ ./python -X showrefcount -c pass
[0 refs, 344 blocks]
|
msg413823 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 17:09 |
> This change introduced a memory leak at Python exit.
It's regression related to bpo-1635741 which I fixed recently:
https://mail.python.org/archives/list/python-dev@python.org/thread/E4C6TDNVDPDNNP73HTGHN5W42LGAE22F/
|
msg413824 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 17:10 |
_PyStaticCode_InternStrings() error handling is based on assert(): that's really bad. It can crash Python (exit with abort()) at the first memory allocation failure. Why not returning -1 on error?
|
msg413832 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-23 17:45 |
Okay, let's change the error handling. @Kumar, can you handle that?
@Victor, the refleak is unrelated to the error handling right? Presumably the leak is imaginary -- the deep-frozen interned strings should be accounted for somehow. @Kumar do you need help investigating this?
|
msg413837 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 18:02 |
> Presumably the leak is imaginary
Well, you can check with your favorite memory debugger like Valgrind if you don't trust Python internal memory debugger :-)
Not leaking memory at exit matters when Python is embedded in an application.
I don't think that it's related to the error handling which is stripped in release mode.
|
msg413839 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-23 18:07 |
> Not leaking memory at exit matters when Python is embedded
> in an application.
Sure, but "leaks" caused by deep-freezing cannot be solved by freeing up the deep-frozen memory -- the solution must be to update the accounting somewhere.
Where is the existence of Py_None accounted for (since it's statically allocated, or at least used to be)? That's likely where we'd have to do something about the deep-frozen objects.
|
msg413847 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 18:58 |
> Sure, but "leaks" caused by deep-freezing cannot be solved by freeing up the deep-frozen memory -- the solution must be to update the accounting somewhere.
Python allocates memory (ex: with PyObject_Malloc()) which is not released at exit. Two examples from Valgrind:
==196803== 50 bytes in 1 blocks are still reachable in loss record 1 of 87
==196803== at 0x484486F: malloc (vg_replace_malloc.c:381)
==196803== by 0x544E29: _PyMem_RawMalloc (obmalloc.c:101)
==196803== by 0x545E0E: PyObject_Malloc (obmalloc.c:700)
==196803== by 0x587159: PyUnicode_New (unicodeobject.c:1448)
==196803== by 0x58C4CF: get_latin1_char (unicodeobject.c:2148)
==196803== by 0x58D7F7: _PyUnicode_FromUCS1 (unicodeobject.c:2450)
==196803== by 0x58E374: PyUnicode_FromKindAndData (unicodeobject.c:2525)
==196803== by 0x69402B: r_object (marshal.c:1150)
==196803== by 0x694295: r_object (marshal.c:1212)
==196803== by 0x694AE0: r_object (marshal.c:1393)
==196803== by 0x694295: r_object (marshal.c:1212)
==196803== by 0x694AA4: r_object (marshal.c:1387)
==196803== by 0x6950F3: read_object (marshal.c:1524)
==196803== by 0x6953E7: PyMarshal_ReadObjectFromString (marshal.c:1641)
==196803== by 0x763223: _Py_Get_Getpath_CodeObject (getpath.c:792)
==196803== by 0x76337F: _PyConfig_InitPathConfig (getpath.c:847)
==196803== by 0x68CD04: config_init_import (initconfig.c:1967)
==196803== by 0x68CE4E: _PyConfig_InitImportConfig (initconfig.c:2000)
==196803== by 0x69E594: init_interp_main (pylifecycle.c:1103)
==196803== by 0x69EBF8: pyinit_main (pylifecycle.c:1216)
==196803== by 0x69EDCE: Py_InitializeFromConfig (pylifecycle.c:1247)
==196803== by 0x6D5346: pymain_init (main.c:67)
==196803== by 0x6D64F8: pymain_main (main.c:692)
==196803== by 0x6D65A1: Py_BytesMain (main.c:725)
==196803== by 0x41D7B5: main (python.c:15)
and
==196803== 3,336 bytes in 60 blocks are still reachable in loss record 87 of 87
==196803== at 0x484486F: malloc (vg_replace_malloc.c:381)
==196803== by 0x544E29: _PyMem_RawMalloc (obmalloc.c:101)
==196803== by 0x545E0E: PyObject_Malloc (obmalloc.c:700)
==196803== by 0x587159: PyUnicode_New (unicodeobject.c:1448)
==196803== by 0x59AEB9: unicode_decode_utf8 (unicodeobject.c:5162)
==196803== by 0x59B5B0: PyUnicode_DecodeUTF8Stateful (unicodeobject.c:5292)
==196803== by 0x58D296: PyUnicode_FromString (unicodeobject.c:2322)
==196803== by 0x5CFFAE: PyUnicode_InternFromString (unicodeobject.c:15650)
==196803== by 0x4E8B61: descr_new (descrobject.c:885)
==196803== by 0x4E8C9B: PyDescr_NewMethod (descrobject.c:934)
==196803== by 0x56694C: type_add_method (typeobject.c:5643)
==196803== by 0x566A92: type_add_methods (typeobject.c:5689)
==196803== by 0x569166: type_ready_fill_dict (typeobject.c:6165)
==196803== by 0x569A02: type_ready (typeobject.c:6421)
==196803== by 0x569B59: PyType_Ready (typeobject.c:6457)
==196803== by 0x544363: _PyTypes_InitTypes (object.c:1952)
==196803== by 0x69D12D: pycore_init_types (pylifecycle.c:704)
==196803== by 0x69D8FC: pycore_interp_init (pylifecycle.c:831)
==196803== by 0x69DC31: pyinit_config (pylifecycle.c:887)
==196803== by 0x69E355: pyinit_core (pylifecycle.c:1050)
==196803== by 0x69ED32: Py_InitializeFromConfig (pylifecycle.c:1240)
==196803== by 0x6D5346: pymain_init (main.c:67)
==196803== by 0x6D64F8: pymain_main (main.c:692)
==196803== by 0x6D65A1: Py_BytesMain (main.c:725)
==196803== by 0x41D7B5: main (python.c:15)
Before the commit c0a5ebeb1239020f2ecc199053bb1a70d78841a1, Python didn't leak this memory.
|
msg413851 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-23 19:28 |
That's pretty mysterious. The deep-freeze code isn't on the stack for either of those, but allocation of new unicode string objects is. I'm guessing these are somehow leaked by the interning, but I don't follow yet how.
|
msg413852 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 19:44 |
PyUnicode_InternInPlace() in intern_strings() can convert an immortal string to a regular Python strong reference, whereas _PyStaticCode_Dealloc() doesn't bother with clearing co_names, co_consts and co_localsplusnames.
|
msg413855 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-23 20:33 |
I tried to hack _PyStaticCode_Dealloc() to free strings, but since immortal objects are half-baken today, calling Py_DECREF() does crash.
Py_SETREF() should increase _Py_RefTotal if the old object is immortal and he new object is not.
Maybe this change should be reverted until Eric's PEP 683 "Immortal Objects, Using a Fixed Refcount" is implemented.
|
msg413893 - (view) |
Author: Kumar Aditya (kumaraditya) * |
Date: 2022-02-24 07:53 |
I have created a PR to fix the memory leak, See https://github.com/python/cpython/pull/31549
|
msg413895 - (view) |
Author: Kumar Aditya (kumaraditya) * |
Date: 2022-02-24 08:49 |
> Okay, let's change the error handling. @Kumar, can you handle that?
How it should be handled? Currently PyUnicode_InternInPlace ignores any errors and does not return it. It would be backwards-incompatible to change that, moreover as I explained in https://github.com/python/cpython/pull/30683#discussion_r800648477 intern_strings only check if all the items are strings which will be always true in case of deep-freeze so I don't think anything needs to be changed here. I would be interested to know if I am missing something though.
|
msg413928 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-24 16:54 |
New changeset 4dc746310bd37ad6b381f9176acd167d445f4385 by Kumar Aditya in branch 'main':
bpo-46430: Fix memory leak in interned strings of deep-frozen modules (GH-31549)
https://github.com/python/cpython/commit/4dc746310bd37ad6b381f9176acd167d445f4385
|
msg413930 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-24 17:02 |
I wrote https://github.com/python/cpython/pull/31555 to make sure that Python doesn't leak at Python exit.
|
msg413931 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-24 17:04 |
> How it should be handled? Currently PyUnicode_InternInPlace ignores any errors and does not return it. It would be backwards-incompatible to change that, moreover as I explained in https://github.com/python/cpython/pull/30683#discussion_r800648477 intern_strings only check if all the items are strings which will be always true in case of deep-freeze so I don't think anything needs to be changed here. I would be interested to know if I am missing something though.
The other functions you are calling *do* return errors. You should not ignore those. If any errors are reported the caller can decide what to do (e.g. call Py_FatalError().
|
msg414119 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-26 23:18 |
commit 0d9b565e62a5fc8c3e9b8c64cce764fe084ccb2b
Author: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Date: Sat Feb 26 22:05:03 2022 +0530
Propagate errors (however unlikely) from _Py_Deepfreeze_Init() (GH-31596)
|
msg414120 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2022-02-26 23:21 |
> The other functions you are calling *do* return errors. You should not ignore those. If any errors are reported the caller can decide what to do (e.g. call Py_FatalError().
PEP 587 introduced PyStatus to Python startup code which let the Py_Initialize() caller to decide how to handle errors ;-) For example, you can open a graphical popup rather than killing the process with SIGABRT (Py_FatalError() behavior) which might be more user friendly :-D Or just log the error.
|
msg414127 - (view) |
Author: Guido van Rossum (gvanrossum) * |
Date: 2022-02-26 23:45 |
> PEP 587 introduced PyStatus to Python startup code which let the Py_Initialize() caller to decide how to handle errors ;-) For example, you can open a graphical popup rather than killing the process with SIGABRT (Py_FatalError() behavior) which might be more user friendly :-D Or just log the error.
That's up to pycore_interp_init() in pylifecycle.c now. Kumar called _PyStatus_ERR() there, so I think nothing more needs to be done.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:54 | admin | set | github: 90588 |
2022-03-04 15:06:51 | corona10 | set | pull_requests:
- pull_request29803 |
2022-03-04 14:46:49 | corona10 | set | pull_requests:
+ pull_request29803 |
2022-02-26 23:45:31 | gvanrossum | set | messages:
+ msg414127 |
2022-02-26 23:21:02 | vstinner | set | messages:
+ msg414120 |
2022-02-26 23:18:50 | vstinner | set | messages:
+ msg414119 |
2022-02-26 16:37:30 | gvanrossum | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2022-02-26 07:23:43 | kumaraditya | set | pull_requests:
+ pull_request29719 |
2022-02-24 17:04:41 | gvanrossum | set | messages:
+ msg413931 |
2022-02-24 17:02:57 | vstinner | set | messages:
+ msg413930 |
2022-02-24 17:02:41 | corona10 | set | pull_requests:
+ pull_request29678 |
2022-02-24 16:54:14 | vstinner | set | messages:
+ msg413928 |
2022-02-24 08:49:47 | kumaraditya | set | messages:
+ msg413895 |
2022-02-24 07:53:27 | kumaraditya | set | messages:
+ msg413893 |
2022-02-24 07:29:56 | kumaraditya | set | stage: patch review pull_requests:
+ pull_request29670 |
2022-02-24 01:57:52 | corona10 | set | nosy:
+ corona10
|
2022-02-23 20:33:02 | vstinner | set | messages:
+ msg413855 |
2022-02-23 19:44:47 | vstinner | set | messages:
+ msg413852 |
2022-02-23 19:28:19 | gvanrossum | set | messages:
+ msg413851 |
2022-02-23 18:58:10 | vstinner | set | messages:
+ msg413847 |
2022-02-23 18:07:16 | gvanrossum | set | messages:
+ msg413839 |
2022-02-23 18:02:44 | vstinner | set | messages:
+ msg413837 |
2022-02-23 17:45:34 | gvanrossum | set | messages:
+ msg413832 |
2022-02-23 17:10:54 | vstinner | set | messages:
+ msg413824 |
2022-02-23 17:09:20 | vstinner | set | messages:
+ msg413823 |
2022-02-23 17:07:40 | vstinner | set | nosy:
+ vstinner messages:
+ msg413822
|
2022-02-10 14:18:11 | christian.heimes | set | status: closed -> open type: behavior messages:
+ msg413004
resolution: fixed -> (no value) stage: resolved -> (no value) |
2022-02-10 14:11:07 | kumaraditya | set | status: open -> closed resolution: fixed messages:
+ msg413002
stage: patch review -> resolved |
2022-02-10 09:08:59 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg412983
|
2022-02-10 08:51:44 | sobolevn | set | nosy:
+ sobolevn pull_requests:
+ pull_request29417
|
2022-02-09 18:17:34 | gvanrossum | set | messages:
+ msg412937 |
2022-02-09 17:05:13 | christian.heimes | set | nosy:
+ christian.heimes messages:
+ msg412922
|
2022-02-09 16:52:54 | gvanrossum | set | messages:
+ msg412920 |
2022-01-20 05:47:51 | kumaraditya | set | messages:
+ msg411003 |
2022-01-19 11:16:23 | kumaraditya | set | messages:
+ msg410936 |
2022-01-19 11:07:06 | kumaraditya | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request28882 |
2022-01-19 11:06:14 | kumaraditya | create | |