Issue 15522: improve 27 percent performance on stringpbject.c (by prefetch and loop optimization)

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/59727

classification

Title:	improve 27 percent performance on stringpbject.c (by prefetch and loop optimization)
Type:	performance	Stage:
Components:	Interpreter Core	Versions:	Python 3.4

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	abael, jcea, pitrou, r.david.murray, serhiy.storchaka
Priority:	normal	Keywords:

Created on 2012-08-01 09:33 by abael, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
stringjoin.c	abael, 2012-08-04 08:00

Messages (7)
msg167107 - (view)	Author: abael (abael)	Date: 2012-08-01 09:33
Python-2.7.3/Objects/stringobject.c( SHA256SUM ad7795c75e2a25247e4dea4cc5327c225c4da03b7c7d57226c817ba6d12a316c) static PyObject string_join(PyStringObject self, PyObject orig); OLD IMPLEMENT LOGIC(Pseudo code): char sep = PyString_AS_STRING(self); const Py_ssize_t seplen = PyString_GET_SIZE(self); seq = PySequence_Fast(orig, ""); seqlen = PySequence_Size(seq); if (seqlen == 0) return PyString_FromString(""); else if (seqlen == 1)return the exactly first one item; else{ for (i = 0; i < seqlen; i++) { const size_t old_sz = sz; item = PySequence_Fast_GET_ITEM(seq, i); if (!PyString_Check(item)){ if ( Py_USING_UNICODE and PyUnicode_Check(item)) return PyUnicode_Join((PyObject )self, seq); else PyErr_Format(...); } sz += PyString_GET_SIZE(item); if (i != 0) sz += seplen; } } / Allocate result space. / res = PyString_FromStringAndSize((char)NULL, sz); /* Catenate everything. / p = PyString_AS_STRING(res); for (i = 0; i < seqlen; ++i) { size_t n; item = PySequence_Fast_GET_ITEM(seq, i); n = PyString_GET_SIZE(item); Py_MEMCPY(p, PyString_AS_STRING(item), n); p += n; if (i < seqlen - 1) { Py_MEMCPY(p, sep, seplen); p += seplen; } } Abael's IMPLEMENT LOGIC: char sep = PyString_AS_STRING(self); const Py_ssize_t seplen = PyString_GET_SIZE(self); seq = PySequence_Fast(orig, ""); seqlen = PySequence_Size(seq); if (seqlen == 0) return PyString_FromString(""); if (seqlen == 1) return the exactly first one item; if (seqlen <0)return NULL /** PREFETCH start, get the first item size, since here we can assume seqleng >= 2 */ register size_t sz=0; register size_t old_sz=0; PyObject res = NULL; item = PySequence_Fast_GET_ITEM(seq, 0); if (!PyString_Check(item)){ if ( Py_USING_UNICODE and PyUnicode_Check(item)) return PyUnicode_Join((PyObject )self, seq); else PyErr_Format(...); } sz += PyString_GET_SIZE(item); if (sz < old_sz \|\| sz > PY_SSIZE_T_MAX) PyErr_SetString(PyExc_OverflowError,"join() result is too long for a Python string"); /* PREFETCH end, get the first item size, since here we can assume seqleng >= 2 / register Py_ssize_t i; for (i=1; i < seqlen; i++) { / then here we can loop start from 1 */ const size_t old_sz = sz; item = PySequence_Fast_GET_ITEM(seq, i); if (!PyString_Check(item)){ if ( Py_USING_UNICODE and PyUnicode_Check(item)) return PyUnicode_Join((PyObject )self, seq); else PyErr_Format(...); } sz += PyString_GET_SIZE(item); sz += seplen; /** now we don't need to test (i != 0) every loop */ } / Allocate result space. / res = PyString_FromStringAndSize((char)NULL, sz); /* Catenate everything. / /* PREFETCH start, memcpy the first item first, since here we can assume seqleng >= 2 */ register char p = PyString_AS_STRING(res); item = PySequence_Fast_GET_ITEM(seq, 0); sz = PyString_GET_SIZE(item); Py_MEMCPY(p, PyString_AS_STRING(item),sz); p += sz; /** PREFETCH end, memcpy the first item first, since here we can assume seqleng >= 2 / for (i=1; i<seqlen; ++i){ / here we also loop start from 1 / item = PySequence_Fast_GET_ITEM(seq, i); sz = PyString_GET_SIZE(item); Py_MEMCPY(p, sep, seplen); / avoid test (i < seqlen - 1) each loop in old implement **/ p += seplen; Py_MEMCPY(p, PyString_AS_STRING(item),sz); p += sz; } return res;
msg167123 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2012-08-01 13:04
Hi, several points: - Python 2.7 is in bugfix mode; you need to work from the default Mercurial branch as explained in http://docs.python.org/devguide/ . In practice, this means your patch will target the future Python 3.4, and therefore either PyUnicode_Join or _PyBytes_Join. - please provide an actual patch (a Mercurial diff, probably) - please provide benchmarks (using e.g. timeit) which demonstrate the performance improvement you are proposing
msg167398 - (view)	Author: abael (abael)	Date: 2012-08-04 08:00
added my implement( with some enhancement, got better performance, at less for my apps. ). test result: with small chunk of str, got double performanc, and 111% for big chunks; it ## Util funcion for text definition: def pf(f,n): a=time() for i in xrange(n): f() b=time() print n/(b-a) ############################################################################# >>> s=['abcd',u'\u548c\u6613\u541b'] * 1000 >>> pf(lambda:''.join(s), 10000) 2289.28293164 >>> pf(lambda:text.join('',s), 10000) 4457.27947763 >>> s=['abcd'1000,u'\u548c\u6613\u541b'1000] * 1000 >>> pf(lambda:''.join(s), 100) 15.2374868484 >>> pf(lambda:text.join('',s), 100) 16.939913023 ############################################################################## 2012/8/1 Antoine Pitrou <report@bugs.python.org> > > Antoine Pitrou added the comment: > > Hi, several points: > > - Python 2.7 is in bugfix mode; you need to work from the default > Mercurial branch as explained in http://docs.python.org/devguide/ . In > practice, this means your patch will target the future Python 3.4, and > therefore either PyUnicode_Join or _PyBytes_Join. > > - please provide an actual patch (a Mercurial diff, probably) > > - please provide benchmarks (using e.g. timeit) which demonstrate the > performance improvement you are proposing > > ---------- > nosy: +pitrou > versions: +Python 3.4 -Python 2.7 > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue15522> > _______________________________________ >
msg167399 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2012-08-04 08:13
Please port your code to Python 3.3 and compare with it. Python 3.3 implementation of str.join() already more than twice faster then Python 2.7. Maybe your optimization will have no effect.
msg167528 - (view)	Author: abael (abael)	Date: 2012-08-06 01:28
do i need to fully maintain Python ? 2012/8/5 Serhiy Storchaka <report@bugs.python.org> > > Changes by Serhiy Storchaka <storchaka@gmail.com>: > > > ---------- > status: open -> pending > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue15522> > _______________________________________ >
msg167529 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2012-08-06 01:40
We are happy that you wish to contribute; however, our policy is that performance enhancements only go in to the newest version of Python, which means any submitted patch must be against the 'default' branch of the Python mercurial repository. If you wish to propose a patch against default, we will be happy to review it. If you do not wish to do so, we will close the issue, and your patch will remain available here on the tracker to anyone who wishes to apply it by hand to their own Python 2.7 installation.
msg173439 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2012-10-21 10:31
I'm closing this issue, since the OP since to have lost interest in his/her proposal. Abael, if you want to propose an actual patch, please open a new issue.

History
Date	User	Action	Args
2022-04-11 14:57:33	admin	set	github: 59727
2012-10-21 10:31:44	pitrou	set	status: open -> closed resolution: rejected messages: + msg173439
2012-10-21 05:21:41	Ramchandra Apte	set	status: pending -> open title: impove 27 percent performance on stringpbject.c( by prefetch and loop optimization) -> improve 27 percent performance on stringpbject.c (by prefetch and loop optimization)
2012-10-19 22:13:54	serhiy.storchaka	set	status: open -> pending
2012-08-06 01:40:35	r.david.murray	set	nosy: + r.david.murray messages: + msg167529
2012-08-06 01:28:06	abael	set	status: pending -> open messages: + msg167528
2012-08-05 11:20:39	serhiy.storchaka	set	status: open -> pending
2012-08-04 08:13:45	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg167399
2012-08-04 08:00:21	abael	set	files: + stringjoin.c messages: + msg167398
2012-08-01 13:42:29	jcea	set	nosy: + jcea
2012-08-01 13:04:55	pitrou	set	nosy: + pitrou messages: + msg167123 versions: + Python 3.4, - Python 2.7
2012-08-01 09:35:06	abael	set	components: + Interpreter Core, - Library (Lib)
2012-08-01 09:33:52	abael	create