classification
Title: Error in the documentation about string concatenation
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: docs@python Nosy List: brett.cannon, dmitriym, docs@python, eric.smith, steven.daprano
Priority: normal Keywords:

Created on 2019-07-05 21:33 by dmitriym, last changed 2019-07-08 19:07 by brett.cannon. This issue is now closed.

Messages (5)
msg347390 - (view) Author: Dmitriy (dmitriym) Date: 2019-07-05 21:33
There is an error in documentation about string concatenation:

https://docs.python.org/3/library/stdtypes.html

---
Common Sequence Operations
[...]
6. Concatenating immutable sequences always results in a new object. This means that building up a sequence by repeated concatenation will have a quadratic runtime cost in the total sequence length. To get a linear runtime cost, you must switch to one of the alternatives below:

- if concatenating str objects, you can build a list and use str.join() at the end or else write to an io.StringIO instance and retrieve its value when complete
---

It is not true for str objects anymore. Example using timeit module shows that time grows linearly:
>>> timeit('a+="a"', setup='a=""', number=10000)
0.0005754000012530014
>>> timeit('a+="a"', setup='a=""', number=100000)
0.005819800004246645

But for bytes it is still right:
>>> timeit('a+=b"a"', setup='a=b""', number=10000)
0.0017669000080786645
>>> timeit('a+=b"a"', setup='a=b""', number=100000)
0.20758410000416916
msg347400 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2019-07-05 22:32
It's my understanding that this is a quality of implementation issue, and that in other (non-CPython) implementations, the run time for repeated concatenation may indeed be quadratic.

The optimization in CPython relies on knowing the reference count is 1. If CPython were to switch away from reference counting, I would expect the behavior of repeated concatenation to be quadratic again.

I'm not sure if the deserves a documentation note or not.
msg347412 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-07-06 01:14
Eric is correct that this is a CPython optimization, it is not a language feature.

Furthermore, it is an optimization that can be affected by rather subtle factors such as the operating system memory management. Here is a thread demonstrating that code that relied on this optimization ran fast on Linux and slow on Windows, literally hundreds of times slower than other tools:

https://mail.python.org/archives/list/python-dev@python.org/thread/EQO6HDZXHFCP4DBEE5Q7MYLHPJMGJ7DQ/

If you prefer the old mailman archives, here are a few relevant posts:

https://mail.python.org/pipermail/python-dev/2009-August/091125.html

https://mail.python.org/pipermail/python-dev/2009-September/091582.html

https://mail.python.org/pipermail/python-dev/2009-September/091592.html

Best practice is to *not* rely on this optimization, but to use str.join. That ensures that:

- your code remains fast if run on alternate implementations;
- your code remains fast if the OS memory management changes;
- your code remains fast if you change your code slightly.

For example, I expect both of the following to show the documented quadratic behaviour:

- Prepending instead of appending

- keeping two or more references to the string being appended to

  partials = []
  current = ''
  for s in list_of_strings:
      partials.append(current)
      current += s

There may be others.

Perhaps we ought to document this optimization specifically so we can suggest that it is intended for improving performance of casual scripts, and that *best practice* for professional quality code remains str.join.
msg347428 - (view) Author: Dmitriy (dmitriym) Date: 2019-07-06 11:32
Yes, optimization is really not working in case of prepending. In case of multiple references I couldn't get quadratic time grow.

Concerning the Windows, yes, the optimization may be not always efficient:

>>> timeit('a+="a"', setup='a=""', number=10000)
0.0011690999999984797
>>> timeit('a+="a"', setup='a=""', number=100000)
0.01114439999999206
>>> timeit('a+="a"', setup='a=""', number=1000000)
0.10783829999999739
>>> timeit('a+="a"', setup='a=""', number=10000000)
5.636337499999996

As I understand this is the case related to OS memory management.


But on Linux I got fairly predictable results:

>>> timeit('a+="a"', setup='a=""', number=10000)
0.0006532900151796639
>>> timeit('a+="a"', setup='a=""', number=100000)
0.006340583000564948
>>> timeit('a+="a"', setup='a=""', number=1000000)
0.06438201799755916
>>> timeit('a+="a"', setup='a=""', number=10000000)
0.6354853530065157
>>> timeit('a+="a"', setup='a=""', number=100000000)
6.365498173021479


Also I have found the mention about optimization in PEP8
https://www.python.org/dev/peps/pep-0008/#programming-recommendations

So maybe it would be nice to add some notes or reference to the part of upper PEP in docs about optimizations in CPython to make it more clear.
msg347505 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2019-07-08 19:07
I'm going to close this as we try to avoid documenting CPython-specific details unless absolutely necessary. And in the case of documenting the str type we should avoid that.

So thanks for the suggestion, Dmitriy, but we will just leave this as a pleasant surprise for folks who accidentally make the mistake of writing code this way.
History
Date User Action Args
2019-07-08 19:07:58brett.cannonsetstatus: open -> closed

nosy: + brett.cannon
messages: + msg347505

resolution: rejected
stage: resolved
2019-07-06 11:32:57dmitriymsetmessages: + msg347428
2019-07-06 01:14:12steven.dapranosetnosy: + steven.daprano
messages: + msg347412
2019-07-05 22:32:02eric.smithsetnosy: + eric.smith
messages: + msg347400
2019-07-05 21:33:18dmitriymcreate