Title: join method for list and tuple
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.8
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Javier Dehesa, christian.heimes, eric.araujo, iamsav, josh.r, serhiy.storchaka
Priority: normal Keywords:

Created on 2018-04-03 14:33 by Javier Dehesa, last changed 2019-09-16 10:04 by serhiy.storchaka.

Messages (9)
msg314881 - (view) Author: Javier Dehesa (Javier Dehesa) Date: 2018-04-03 14:33
It is pretty trivial to concatenate a sequence of strings:

    ''.join([str1, str2, ...])

Concatenating a sequence of lists is for some reason significantly more convoluted. Some current options include:

    sum([lst1, lst2, ...], [])
    [x for y in [lst1, lst2, ...] for x in y]
    list(itertools.chain(lst1, lst2, ...))

The first is the least recommendable but the most intuitive, and the third is the fastest but the most cumbersome (see ). None of these looks like "the one obvious way to do it" to me. Furthermore, I feel a dedicated concatenation method could be more efficient than any of these approaches.

If we accept that ''.join(...) is an intuitive idiom, why not provide the syntax:

    [].join([lst1, lst2, ...])

And while we are at it:

    ().join([tpl1, tpl2, ...])

Like with str, these methods should only accept sequences of objects of their own class (e.g. we could do [].join(list(s) for s in seqs) if seqs contains lists, tuples and generators). The use case for non-empty joiners would probably be less frequent than for strings, but it also solves a problem that has no clean solution with the current tools. Here is what I would probably do to join a sequence of lists with [None, 'STOP', None]:

    lsts = [lst1, lst2, ...]
    joiner = [None, 'STOP', None]
    lsts_joined = list(itertools.chain.from_iterable(lst + joiner for lst in lsts))[:-len(joiner)]

Which is awful and inefficient (I am not saying this is the best or only possible way to solve it; it is just what I, a self-considered experienced Python developer, might write).
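
As a rough sketch of what a non-empty joiner could mean for lists, the helper below inserts the joiner between sublists instead of appending it and slicing it off. The name `join_lists` is hypothetical (it is not an existing or proposed API), and the sketch assumes every element of `lists` is itself a list.

```python
def join_lists(joiner, lists):
    """Concatenate lists, placing joiner's items between consecutive lists.

    Hypothetical helper that mirrors str.join semantics for lists.
    """
    result = []
    for i, lst in enumerate(lists):
        if i:  # add the joiner before every list except the first
            result.extend(joiner)
        result.extend(lst)
    return result


print(join_lists([None, 'STOP', None], [[1, 2], [3], [4, 5]]))
# [1, 2, None, 'STOP', None, 3, None, 'STOP', None, 4, 5]
```

Unlike the slicing one-liner, this version also behaves sensibly when the joiner is empty, because nothing is trimmed from the end of the result.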
msg314882 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2018-04-03 14:40
join() is a bad choice, because new developers will confuse list.join with str.join.

We could turn list.extend(iterable) into list.extend(*iterable). Or you could just use extend with a chain iterator:

>>> l = []
>>> l.extend(itertools.chain([1], [2], [3]))
>>> l
[1, 2, 3]
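
For illustration only, a varargs form of extend can be approximated today with a small function. `extend_all` is a hypothetical name; the real `list.extend` accepts exactly one iterable.

```python
import itertools


def extend_all(target, *iterables):
    """Extend target in place with the items of each iterable, in order.

    Hypothetical stand-in for a varargs list.extend.
    """
    target.extend(itertools.chain.from_iterable(iterables))
    return target


l = []
extend_all(l, [1], (2,), range(3, 5))
print(l)  # [1, 2, 3, 4]
```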
msg314883 - (view) Author: Javier Dehesa (Javier Dehesa) Date: 2018-04-03 15:06
Thanks Christian. I thought of join precisely because it performs conceptually the same function as with str, so the parallel between ''.join(), [].join() and ().join() looked more obvious. Also there is os.path.join and PurePath.joinpath, so the verb seemed well-established. As for shared method names, index and count are present both in sequences and str, although it is true that these do return the same kind of object in all cases.

I'm not saying your points aren't valid, though. Your proposed way with extend is, I guess, about the same as list(itertools.chain(...)), which could be considered enough. I just feel that it is not particularly convenient, especially for newer developers, who will probably gravitate towards sum(...) more than itertools or a nested generator expression, but I may be wrong.
msg314885 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-04-03 15:23
String concatenation: f'{a}{b}{c}'
List concatenation: [*a, *b, *c]
Tuple concatenation: (*a, *b, *c)
Set union: {*a, *b, *c}
Dict merging: {**a, **b, **c}
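
A runnable version of the forms above, with made-up sample values (the variable names and data are assumptions chosen for illustration):

```python
a, b, c = 'ab', 'c', 'de'
joined_str = f'{a}{b}{c}'
print(joined_str)  # abcde

x, y, z = [1, 2], [2, 3], [3, 4]
joined_list = [*x, *y, *z]
joined_tuple = (*x, *y, *z)
joined_set = {*x, *y, *z}          # duplicates collapse in a set
print(joined_list)   # [1, 2, 2, 3, 3, 4]
print(joined_tuple)  # (1, 2, 2, 3, 3, 4)

d1, d2, d3 = {'k': 1}, {'k': 2, 'm': 3}, {'n': 4}
merged = {**d1, **d2, **d3}        # later mappings win on duplicate keys
print(merged)  # {'k': 2, 'm': 3, 'n': 4}
```

Each form needs the operands spelled out one by one, so it fits a fixed, known number of inputs.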
msg352387 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-09-13 18:35
Note that all of Serhiy's examples are for a known, fixed number of things to concatenate/union/merge. str.join's API can be used for that by wrapping the arguments in an anonymous tuple/list, but it's more naturally suited to a variable number of things, and the unpacking generalizations haven't reached the point where:

    [*seq for seq in allsequences]

is allowed.

    list(itertools.chain.from_iterable(allsequences))

handles that just fine, but I could definitely see it being convenient to be able to do:

    [].join(allsequences)

That said, a big reason str provides .join is because it's not uncommon to want to join strings with a repeated separator, e.g.:

    # For not-really-csv-but-people-do-it-anyway
    ','.join(fields)

    # Separate words with spaces
    ' '.join(words)

    # Separate lines with newlines
    '\n'.join(lines)
I'm not seeing even one motivating use case for list.join/tuple.join that would actually join on a non-empty list or tuple ([None, 'STOP', None] being rather contrived). If that's not needed, it might make more sense to do this with an alternate constructor (a classmethod), e.g.:


which would avoid the cost of creating an otherwise unused empty list (the empty tuple is a singleton, so no cost is avoided there). It would also work equally well with both tuple and list (where making list.extend take varargs wouldn't help tuple, though it's a perfectly worthy idea on its own).
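
One way such an alternate constructor might look, sketched with subclasses because the built-in types cannot be patched from Python code. The names `ConcatMixin`, `ConcatList`, `ConcatTuple`, and `concat` are all hypothetical and do not correspond to any existing or agreed API.

```python
from itertools import chain


class ConcatMixin:
    """Adds a hypothetical alternate constructor named concat."""

    @classmethod
    def concat(cls, iterables):
        # Build the result straight from the flattened items; no empty
        # "joiner" instance has to be created first.
        return cls(chain.from_iterable(iterables))


class ConcatList(ConcatMixin, list):
    pass


class ConcatTuple(ConcatMixin, tuple):
    pass


print(ConcatList.concat([[1], (2, 3)]))   # [1, 2, 3]
print(ConcatTuple.concat([(1,), [2]]))    # (1, 2)
```

Because the constructor is a classmethod, the same implementation serves both the list and the tuple variant.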

Personally, I don't find using itertools.chain (or its from_iterable alternate constructor) all that problematic (though I almost always import it with from itertools import chain to reduce the verbosity, especially when using chain.from_iterable). I think promoting itertools more is a good idea; right now, the notes on concatenation for sequence types mention str.join, bytes.join, and replacing tuple concatenation with a list that you call extend on, but don't mention itertools.chain at all, which seems like a failure to make the best solution the discoverable/obvious solution.
msg352530 - (view) Author: Александр Семенов (iamsav) Date: 2019-09-16 09:33
In JavaScript, join() works the other way around:
['1','2','3'].join(', ')
so [].join() may confuse some people.
msg352531 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2019-09-16 09:46
> In JavaScript, join() works the other way around:
> ['1','2','3'].join(', ')
> so [].join() may confuse some people.

It would be too confusing to have two different approaches to joining strings in Python. Besides, ECMAScript 1 came out in 1997, six years after Python was first released. By that argument, JavaScript is the one that should change.
msg352532 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-09-16 09:53
How common is the case of a variable number of things to concatenate/union/merge?

In my experience, in most cases this looks like:

    result = []
    for ...:
        # many complex statements
        # may include continue and break
        result.extend(items) # may be intermixed with result.append(item)

So concatenating purely lists from some sequence is a very special case. And there are several ways to perform it.

    result = []
    for items in seq:
        result.extend(items)
    # nothing wrong with this simple code, really

    result = [x for items in seq for x in items]
    # may be less efficient for really long sublists,
    # but looks simple

    result = list(itertools.chain.from_iterable(seq))
    # if you are an itertools addict ;-)
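
A self-contained check that the three spellings agree, assuming `seq` is an iterable of lists (the sample data is made up):

```python
import itertools

seq = [[1, 2], [], [3], [4, 5, 6]]

# 1. Explicit loop with extend
result_loop = []
for items in seq:
    result_loop.extend(items)

# 2. Nested comprehension
result_comp = [x for items in seq for x in items]

# 3. itertools.chain.from_iterable
result_chain = list(itertools.chain.from_iterable(seq))

print(result_loop == result_comp == result_chain)  # True
print(result_chain)  # [1, 2, 3, 4, 5, 6]
```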
msg352534 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-09-16 10:04
It is history, but in 1997 Python had the same order of arguments as ECMAScript: string.join(words [, sep]). str.join() was added only in 1999 (226ae6ca122f814dabdc40178c7b9656caf729c2).
History
Date                 User              Action  Args
2019-09-16 10:04:49  serhiy.storchaka  set     messages: + msg352534
2019-09-16 09:53:43  serhiy.storchaka  set     messages: + msg352532
2019-09-16 09:46:11  christian.heimes  set     messages: + msg352531
2019-09-16 09:33:29  iamsav            set     nosy: + iamsav; messages: + msg352530
2019-09-13 18:35:07  josh.r            set     nosy: + josh.r; messages: + msg352387
2018-04-06 16:29:01  eric.araujo       set     nosy: + eric.araujo
2018-04-03 15:23:56  serhiy.storchaka  set     nosy: + serhiy.storchaka; messages: + msg314885
2018-04-03 15:06:33  Javier Dehesa     set     messages: + msg314883
2018-04-03 14:40:42  christian.heimes  set     nosy: + christian.heimes; messages: + msg314882; versions: + Python 3.8
2018-04-03 14:33:53  Javier Dehesa     create