This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Reduce temporary unicode object while adding descriptors
Type: Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: methane Nosy List: methane, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2017-01-28 03:04 by methane, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
descr-remove-getitemstring.patch methane, 2017-01-28 03:04 review
descr-remove-getitemstring-2.patch methane, 2017-01-28 05:13 review
dict-setitemstring.patch methane, 2017-01-28 05:21 removes unnecessary comment. review
Messages (5)
msg286394 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-28 03:04
add_methods(), add_members(), and add_getset() creates PyUnicode from C string 3 times, and calls PyUnicode_InternInplace 2 times.

1. PyDict_GetItemString() at first. (PyUnicode_FromString() is called).
2. In middle, descr_new() calls PyUnicode_InternFromString().
3. PyDict_SetItemString() at last. (creates unicode and intern it).

Skipping (2) is require adding new private APIs to pass PyUnicodeObject.
But I don't think it worth enough. (I'll try it later.)
So this patch only remove last temporary unicode.
(3 PyUnicode_FromString + 2 PyUnicode_InternInplace) becomes (2 PyUnicode_FromString + 2 PyUnicode_InternInplace).

It seems ~1% startup speedup (without site).

  $ ./python -m performance.benchmarks.bm_python_startup --no-site
  default: python_startup_no_site: Median +- std dev: 12.7 ms +- 0.1 ms
  patched: python_startup_no_site: Median +- std dev: 12.6 ms +- 0.1 ms

While speedup is small, this patch removes time to think "How large this
overhead of GetItemString + SetItemString pair?" while reading code :)

Additionally, this patch removes this comment in PyDict_SetItemString:

  -    PyUnicode_InternInPlace(&kv); /* XXX Should we really? */

SetItemString is used to add something to namespace.
Changing this behavior affects too widely.  So we should do it.
msg286399 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-28 04:07
I think I found better way.

Interned string can be get from descripter. Interning can be reduced
without adding private API.
Please don't review the first patch.
msg286400 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2017-01-28 05:13
descr-remove-getitemstring-2.patch is more compact than first patch.
(3 unicode + 2 intern) becomes (2 unicode + 1 intern).

python_startup_no_site: Median +- std dev: 12.5 ms +- 0.1 ms
msg286401 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-28 06:23
descr-remove-getitemstring-2.patch LGTM.
msg286404 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2017-01-28 07:35
New changeset d7ec72c1620c by INADA Naoki in branch 'default':
Issue #29383: reduce temporary interned unicode
https://hg.python.org/cpython/rev/d7ec72c1620c
History
Date User Action Args
2022-04-11 14:58:42adminsetgithub: 73569
2017-01-28 07:39:33methanesetstatus: open -> closed
resolution: fixed
2017-01-28 07:35:54python-devsetnosy: + python-dev
messages: + msg286404
2017-01-28 06:23:30serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg286401
2017-01-28 05:21:34methanesetfiles: + dict-setitemstring.patch
2017-01-28 05:13:03methanesetfiles: + descr-remove-getitemstring-2.patch

messages: + msg286400
2017-01-28 04:07:39methanesetmessages: + msg286399
2017-01-28 03:04:15methanecreate