Issue 28943: Use PyUnicode_MAX_CHAR_VALUE instead of PyUnicode_KIND in some API's short path


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/73129

classification

Title:	Use PyUnicode_MAX_CHAR_VALUE instead of PyUnicode_KIND in some API's short path
Type:	enhancement	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.7

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	serhiy.storchaka, xiang.zhang
Priority:	normal	Keywords:	patch

Created on 2016-12-12 11:22 by xiang.zhang, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
short-path.patch	xiang.zhang, 2016-12-12 11:22		review

Messages (3)
msg282982 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-12-12 11:22
Some Unicode APIs, such as PyUnicode_Contains, have a short path that compares kinds. The problem is that this check cannot tell ASCII from Latin-1, because both use the same kind. PyUnicode_MAX_CHAR_VALUE could be used instead so that the short path also applies to ASCII and Latin-1. This technique is already used in PyUnicode_Replace.
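To make the proposal concrete, here is a minimal sketch of the two short-path checks. It is a simplified standalone model, not CPython's code: `model_str`, `model_max_char`, `cannot_contain_by_kind` and `cannot_contain_by_max_char` are hypothetical names, and the model assumes each string is stored in the narrowest kind that fits it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a string's storage class; values mirror the
   1-, 2- and 4-byte kinds. Not the real CPython object layout. */
typedef enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 } model_kind;

typedef struct {
    model_kind kind;  /* bytes per code point */
    int is_ascii;     /* nonzero if every code point is <= 0x7f */
} model_str;

/* Largest code point the storage class can hold (hypothetical helper). */
static unsigned long model_max_char(const model_str *s)
{
    assert(s != NULL);
    if (s->is_ascii)
        return 0x7fUL;
    if (s->kind == KIND_1BYTE)
        return 0xffUL;
    if (s->kind == KIND_2BYTE)
        return 0xffffUL;
    return 0x10ffffUL;
}

/* Kind-based short path: returns 1 when the needle cannot occur in the
   haystack because it needs a wider kind. ASCII and Latin-1 share a kind,
   so this check never fires for that pair. */
static int cannot_contain_by_kind(const model_str *hay, const model_str *needle)
{
    return needle->kind > hay->kind;
}

/* Max-char-based short path: also fires for a Latin-1 needle in an
   ASCII haystack, at the cost of more work per check. */
static int cannot_contain_by_max_char(const model_str *hay, const model_str *needle)
{
    return model_max_char(needle) > model_max_char(hay);
}
```

In this model the only pair where the two checks disagree is an ASCII haystack with a non-ASCII Latin-1 needle, which is the case the patch targets.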
msg282983 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-12-12 11:37
PyUnicode_KIND() just extracts three bits from the state word. PyUnicode_MAX_CHAR_VALUE() extracts bits multiple times and performs a few conditional branches. I think it is much slower than PyUnicode_KIND(). In the common case you search for an ASCII needle, or for a needle of the same kind as the string, so checking for the fast path just adds overhead. Such a check is appropriate only while its overhead is tiny. Optimize common cases, not rare and obscure ones.
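The cost difference described here can be sketched with a small standalone model. The bit positions and the names `model_make_state`, `model_kind_of` and `model_max_char_value` are assumptions made for illustration; the real state word layout in CPython may differ.

```c
#include <assert.h>

/* Hypothetical bit positions inside a packed state word. */
#define MODEL_KIND_SHIFT  2u
#define MODEL_KIND_MASK   0x7u   /* three bits hold the kind */
#define MODEL_ASCII_SHIFT 6u

/* Pack a kind (1, 2 or 4) and an ASCII flag into a state word. */
static unsigned model_make_state(unsigned kind, unsigned is_ascii)
{
    assert(kind == 1u || kind == 2u || kind == 4u);
    return (kind << MODEL_KIND_SHIFT) | ((is_ascii & 1u) << MODEL_ASCII_SHIFT);
}

/* Analogue of the kind lookup: one shift and one mask, no branches. */
static unsigned model_kind_of(unsigned state)
{
    return (state >> MODEL_KIND_SHIFT) & MODEL_KIND_MASK;
}

/* Analogue of the max-char lookup: reads the ASCII bit, then the kind
   bits up to twice, with up to three conditional branches. */
static unsigned long model_max_char_value(unsigned state)
{
    if ((state >> MODEL_ASCII_SHIFT) & 1u)
        return 0x7fUL;
    if (model_kind_of(state) == 1u)
        return 0xffUL;
    if (model_kind_of(state) == 2u)
        return 0xffffUL;
    return 0x10ffffUL;
}
```

A compiler may fold some of the repeated extractions, so this sketch only illustrates why the second lookup does more work per call; it says nothing about measured timings.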
msg282990 - (view)	Author: Xiang Zhang (xiang.zhang) *	Date: 2016-12-12 12:44
I know the difference and thought the overhead should be tiny (not in a critical part). But benchmarks show it's not. :-(

History
Date	User	Action	Args
2022-04-11 14:58:40	admin	set	github: 73129
2016-12-12 12:45:04	xiang.zhang	set	resolution: rejected
2016-12-12 12:44:42	xiang.zhang	set	status: open -> closed messages: + msg282990 stage: patch review -> resolved
2016-12-12 11:37:38	serhiy.storchaka	set	messages: + msg282983
2016-12-12 11:22:10	xiang.zhang	create