classification
Title: Use Py_IDENTIFIER in Python-ast.c
Type: Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: methane, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2017-01-25 12:33 by methane, last changed 2017-01-26 00:20 by methane. This issue is now closed.

Files
File name Uploaded Description Edit
ast-identifier.patch methane, 2017-01-25 12:33 review
ast-identifier3.patch methane, 2017-01-25 13:08 review
ast-identifier4.patch methane, 2017-01-25 15:51 review
Messages (12)
msg286244 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-25 12:33
While I'm looking Python's memory usage, I found some dicts which key
is not interned "_fields", "_ast" and "__modules__".
msg286245 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-25 12:40
Python/Python-ast.c started with:

/* File automatically generated by Parser/asdl_c.py. */

Please fix the generator, not the generated file ;-)
msg286249 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-25 13:15
> While I'm looking Python's memory usage, I found some dicts which key is not interned "_fields", "_ast" and "__modules__".

AST objects are supposed to be temporary. Interning strings used in the AST processor would make these strings immortal and so increase the memory usage, no?

What is the purpose of the patch? Speedup or reduce the memory usage?
msg286251 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-25 13:26
LGTM. _Py_IDENTIFIER is private API, but is not new in Python-ast.c.

I expect insignificant speedup and reduce the memory usage from the patch.
msg286252 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-25 13:27
> AST objects are supposed to be temporary. Interning strings used in the AST processor would make these strings immortal and so increase the memory usage, no?
> What is the purpose of the patch? Speedup or reduce the memory usage?

The three identifiers are used in type.  Dozen (100+) types using them.

>>> import _ast
>>> _ast.Dict.__dict__
mappingproxy({'_fields': ('keys', 'values'), '__module__': '_ast', '__doc__': None})
>>> _ast.DictComp.__dict__
mappingproxy({'_fields': ('key', 'value', 'generators'), '__module__': '_ast', '__doc__': None})


Each string uses about 50 bytes. So 50 * 3 * 100 = 15KB may be reduced.

But main purpose is less noise when looking memory usage of Python.
msg286253 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-25 13:34
New changeset a507882736b2 by INADA Naoki in branch 'default':
Issue #29369: Use Py_IDENTIFIER in Python-ast.c
https://hg.python.org/cpython/rev/a507882736b2
msg286259 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-25 14:58
+    result = PyObject_CallFunction((PyObject*)&PyType_Type, "s(O){OOOO}",
+                    type, base,
+                    _PyUnicode_FromId(&PyId__fields), fnames,
+                    _PyUnicode_FromId(&PyId___module__),
+                    _PyUnicode_FromId(&PyId__ast));

You should check if _PyUnicode_FromId() returns NULL if it was the first call and the UTF-8 decode failed to allocate memory.

You might initialize all these identifiers (and check for errors) in init_types() to avoid having to check for errors each time they are used.
msg286260 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-25 15:04
> The three identifiers are used in type.  Dozen (100+) types using them.

I tried once to put *all* (most common) _Py_IDENTIFIER() in a single file and share them, but I abandoned my attempt:

http://bugs.python.org/issue19515#msg202385
Antoine Pitrou: "Well, then IMHO it's not worth it."

See issues #19512 and #19515.


"""
>>> import _ast
>>> _ast.Dict.__dict__
mappingproxy({'_fields': ('keys', 'values'), '__module__': '_ast', '__doc__': None})
"""

But the _ast module is not imported by default, only in programs importing "ast" explicitly.

Well, I'm not opposed to the change, I'm just trying to understand how the code is used ;-)
msg286262 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-25 15:51
> You should check if _PyUnicode_FromId() returns NULL if it was the first call and the UTF-8 decode failed to allocate memory.

thanks. new patch will fix it.

> You might initialize all these identifiers (and check for errors) in init_types() to avoid having to check for errors each time they are used.

Here is not so performance critical part.

> But the _ast module is not imported by default, only in programs importing "ast" explicitly.
> Well, I'm not opposed to the change, I'm just trying to understand how the code is used ;-)

We use Flask. Flask is based on Werkzeug. Werkzeug imports inspect.
inspect imports ast. ast imports _ast.
msg286263 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-01-25 15:54
"We use Flask. Flask is based on Werkzeug. Werkzeug imports inspect. inspect imports ast. ast imports _ast."

Oh, this was unexpected to me :-)
msg286267 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-25 17:20
I think additional checks are not needed. PyObject_CallFunction() allows passing NULLs.
msg286288 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-26 00:20
Thanks, Serhiy.  You're right.
History
Date User Action Args
2017-01-26 00:20:13methanesetstatus: open -> closed

messages: + msg286288
2017-01-25 17:20:31serhiy.storchakasetmessages: + msg286267
2017-01-25 15:54:35vstinnersetmessages: + msg286263
2017-01-25 15:51:03methanesetfiles: + ast-identifier4.patch

messages: + msg286262
2017-01-25 15:04:20vstinnersetmessages: + msg286260
2017-01-25 14:58:45vstinnersetmessages: + msg286259
2017-01-25 13:34:23methanesetresolution: fixed
2017-01-25 13:34:01python-devsetnosy: + python-dev
messages: + msg286253
2017-01-25 13:27:15methanesetmessages: + msg286252
2017-01-25 13:26:21serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg286251
2017-01-25 13:15:59vstinnersetmessages: + msg286249
2017-01-25 13:13:00methanesetfiles: - ast-identifier2.patch
2017-01-25 13:08:59methanesetfiles: + ast-identifier3.patch
2017-01-25 13:03:56methanesetfiles: + ast-identifier2.patch
2017-01-25 12:40:14vstinnersetnosy: + vstinner
messages: + msg286245
2017-01-25 12:33:14methanecreate