In py3k, PyUnicode_Join inherits some complexity from the 2.x days.
However, it seems some of the precautions taken there may not be needed
anymore. Witness the following comment:

    /* Grrrr.  A codec may be invoked to convert str objects to
     * Unicode, and so it's possible to call back into Python code
     * during PyUnicode_FromObject(), and so it's possible for a sick
     * codec to change the size of fseq (if seq is a list).  Therefore
     * we have to keep refetching the size -- can't assume seqlen
     * is invariant.
     */

If the size no longer needs to be refetched, perhaps the target buffer
could also be preallocated all at once (like bytes.join does) rather
than resized incrementally.
Marc-Andre, what do you think?
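
To make the preallocation idea concrete, here is a minimal sketch of a
two-pass join in plain C. It is not the CPython implementation:
join_prealloc is a hypothetical helper that works on NUL-terminated C
strings, and it leaves out the overflow checks real code would need. The
first pass computes the exact result size and the second copies into a
buffer allocated once, which is only safe when nothing can change the
sequence between the two passes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only, not CPython code.
 * Joins n NUL-terminated strings with sep. Returns a malloc'ed,
 * NUL-terminated buffer that the caller must free, or NULL if the
 * allocation fails. Size overflow checks are omitted for brevity. */
char *join_prealloc(const char *sep, const char *const *items, size_t n)
{
    size_t seplen = strlen(sep);
    size_t total = 0;
    size_t i;
    char *out, *p;

    /* Pass 1: compute the exact size of the result. This assumes the
     * items and n cannot change before pass 2, i.e. no callback into
     * arbitrary code happens in between. */
    for (i = 0; i < n; i++) {
        total += strlen(items[i]);
        if (i + 1 < n)
            total += seplen;
    }

    /* Single allocation, no incremental resizing. */
    out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy items and separators into the preallocated buffer. */
    p = out;
    for (i = 0; i < n; i++) {
        size_t len = strlen(items[i]);
        memcpy(p, items[i], len);
        p += len;
        if (i + 1 < n) {
            memcpy(p, sep, seplen);
            p += seplen;
        }
    }
    *p = '\0';
    return out;
}
```

The incremental approach exists precisely because the first pass could
be invalidated by a size change; once that cannot happen, the single
allocation is sufficient.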
