classification
Title: Programming FAQ about "What is the most efficient way to concatenate many strings together?" -- Improving the example
Type: enhancement Stage: patch review
Components: Documentation Versions: Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: BTaskaya, Dominik V., docs@python, petdance, rhettinger
Priority: normal Keywords: patch

Created on 2020-04-20 21:27 by Dominik V., last changed 2020-05-24 19:39 by petdance.

Messages (7)
msg366887 - (view) Author: Dominik V. (Dominik V.) * Date: 2020-04-20 21:27
The section mentions the usage of `str.join` and contains the following example:

    chunks = []
    for s in my_strings:
        chunks.append(s)
    result = ''.join(chunks)

Since `join` accepts any iterable, building the `chunks` list in a for loop is superfluous. If people just copy and paste from this FAQ, they'll even end up with slower code.

The example could be improved by providing an example list such as:

    strings = ['spam', 'ham', 'eggs']
    meal = ', '.join(strings)

Arguably this isn't a particularly long list of strings, so one more example could be added using e.g. `range(100)`:

    numbers = ','.join(str(x) for x in range(100))

This also emphasizes the fact that `join` takes any iterable rather than just lists.
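
As a minimal sketch of that last point (the variable names below are illustrative and do not come from the FAQ), the same separator can join a list, a tuple, a generator expression, or a `map` object, as long as every item is already a `str`:

```python
# Illustrative data only; these names are assumptions, not FAQ content.
words = ['spam', 'ham', 'eggs']

from_list = ', '.join(words)                    # a list
from_tuple = ', '.join(tuple(words))            # a tuple
from_generator = ', '.join(w for w in words)    # a generator expression
from_map = ','.join(map(str, range(5)))         # a map object over non-str items

print(from_list)
print(from_map)
```

Note that `join` does not convert items itself: passing `range(5)` directly raises `TypeError`, which is why the last line goes through `map(str, ...)`.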
msg366888 - (view) Author: Dominik V. (Dominik V.) * Date: 2020-04-20 21:28
Here's the link to the relevant section: https://docs.python.org/3/faq/programming.html#what-is-the-most-efficient-way-to-concatenate-many-strings-together
msg366891 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-20 21:49
Dominik, can you please limit your tracker issues to a handful of entries that you really care about?  This is turning into a stream of consciousness dump onto our tracker.  You really don't need to rewrite every entry you see, especially when we haven't had user complaints about the existing entries.

Also, it seems to me that you're missing the point of the simplified examples in the docs.  Yes, of course, the for-loop in chunks is superfluous; however, that for-loop is a very common pattern, and in practice it typically does other work in the loop.  For example:
    
    blocks = []
    while True:
        block = s.recv(4096)
        if not block:
            break
        blocks.append(block)
    page = b''.join(blocks)

The problem with that example is that it shifts focus to TCP clients rather than the core topic of how to join strings.
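
For anyone who wants to run the accumulate-then-join pattern above without a network connection, here is a rough sketch that assumes an in-memory `io.BytesIO` object standing in for the socket `s`. The helper name `read_all` and the `block_size` parameter are hypothetical and chosen only for illustration:

```python
import io


def read_all(stream, block_size=4096):
    """Read a binary stream to exhaustion and return its contents as bytes.

    Hypothetical helper: `stream` is any object with a read(n) method that
    returns b'' at end of data (here an io.BytesIO replaces the socket).
    """
    blocks = []
    while True:
        block = stream.read(block_size)
        if not block:          # empty bytes object signals end of data
            break
        blocks.append(block)   # the loop does real work: reading the next block
    return b''.join(blocks)    # one join at the end instead of repeated +=


payload = b'x' * 10000         # made-up data, larger than one block
page = read_all(io.BytesIO(payload))
print(len(page))
```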
msg366898 - (view) Author: Dominik V. (Dominik V.) * Date: 2020-04-20 22:10
It was not my intention to disturb the traffic on the bug tracker. My apologies if that caused any trouble. I also thought only people subscribed to the indicated topic (e.g. "Documentation") would receive a notification.

The docs pages mention that for enhancement proposals one should submit a bug report on the tracker:

> If you find a bug in this documentation or would like to propose an improvement, please submit a bug report on the tracker (https://docs.python.org/3/bugs.html).

I do care about the quality of Python's documentation and I think it could be improved in these cases. Often it is newcomers who consult these pages and they might be irritated by the mentioned parts.

I see how it would be distracting to include a more complex real world example, but using an example which performs apparently superfluous steps without any additional comment might seem strange. More experienced users probably won't need such an example at all. In addition it might make people falsely believe that `str.join` expects a list of strings rather than any iterable, and hence the explicit construction of the list.
msg366899 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2020-04-20 22:55
Your contributions are welcome.
msg367606 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python committer) Date: 2020-04-29 01:33
Sorry for the noise, wrong issue (thought 344, but actually was 334)
msg369821 - (view) Author: Andy Lester (petdance) * Date: 2020-05-24 19:39
I'd also like to suggest that the question not be "most efficient" but "fastest".  I don't think it should treat "efficient" and "fast" as synonyms. 
"Efficient" can mean things other than execution speed, such as memory usage, or programmer time.  Are there space/time considerations besides execution time?  What if the user has a huge list and wants to use as little new allocated RAM as possible?

I understand that it's immediately after the "How do I speed up my program?" question, and I think it's worth considering making the question more explicit about what kind of efficiency we're talking about.
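
One way to look at both kinds of efficiency separately is sketched below, assuming the standard-library `timeit` and `tracemalloc` modules are acceptable measuring tools. The function names and the input size are made up for illustration, and the numbers will vary by machine and Python version, so no expected output is shown:

```python
import timeit
import tracemalloc


def concat_join(parts):
    """Concatenate with a single str.join call."""
    return ''.join(parts)


def concat_plus(parts):
    """Concatenate by repeated += in a loop."""
    result = ''
    for p in parts:
        result += p
    return result


def peak_bytes(func, parts):
    """Return the peak memory traced by tracemalloc while func(parts) runs."""
    tracemalloc.start()
    func(parts)
    _, peak = tracemalloc.get_traced_memory()   # (current, peak) in bytes
    tracemalloc.stop()
    return peak


parts = [str(i) for i in range(10_000)]         # illustrative input size
for func in (concat_join, concat_plus):
    seconds = timeit.timeit(lambda: func(parts), number=20)
    print(func.__name__, f'{seconds:.4f}s', peak_bytes(func, parts), 'bytes peak')
```

This measures execution time and peak allocation only; programmer time, the third sense of "efficient" mentioned above, is not something a script can report.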
History
Date | User | Action | Args
2020-05-24 19:39:14 | petdance | set | messages: + msg369821
2020-05-24 16:34:53 | BTaskaya | set | pull_requests: - pull_request19625
2020-05-24 16:30:27 | BTaskaya | set | pull_requests: + pull_request19625
2020-05-24 16:30:12 | BTaskaya | set | pull_requests: - pull_request19624
2020-05-24 16:09:56 | BTaskaya | set | keywords: + patch; stage: patch review; pull_requests: + pull_request19624
2020-04-29 01:38:44 | BTaskaya | set | stage: patch review -> (no value)
2020-04-29 01:38:32 | BTaskaya | set | keywords: - patch
2020-04-29 01:38:15 | BTaskaya | set | pull_requests: - pull_request19105
2020-04-29 01:34:47 | BTaskaya | set | pull_requests: + pull_request19105
2020-04-29 01:33:48 | BTaskaya | set | nosy: rhettinger, docs@python, BTaskaya, petdance, Dominik V.; messages: + msg367606
2020-04-29 01:32:29 | BTaskaya | set | pull_requests: - pull_request19103
2020-04-29 01:29:21 | BTaskaya | set | keywords: + patch; nosy: + BTaskaya; pull_requests: + pull_request19103; stage: patch review
2020-04-20 22:55:49 | rhettinger | set | messages: + msg366899
2020-04-20 22:35:36 | petdance | set | nosy: + petdance
2020-04-20 22:10:25 | Dominik V. | set | messages: + msg366898
2020-04-20 21:49:31 | rhettinger | set | nosy: + rhettinger; messages: + msg366891
2020-04-20 21:28:13 | Dominik V. | set | messages: + msg366888
2020-04-20 21:27:30 | Dominik V. | create |