BTW, unicodeFromTclStringAndSize() basically undoes the special treatment of \0 in Modified UTF-8 [1]. That page says that all known implementations of MUTF-8 treat surrogate pairs the same way as CESU-8 [2], which is UTF-8 except that characters outside the BMP are first encoded as UTF-16 surrogate pairs, and each surrogate is then encoded as UTF-8.

Neither encoding is currently supported by Python.
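
For illustration only, here is a rough Python 3 sketch of how the two encodings relate. The function names (cesu8_encode, mutf8_encode, mutf8_decode) are made up for this example and are not part of Python or Tcl; the sketch relies on the "surrogatepass" error handler and does no validation of malformed input.

```python
def cesu8_encode(text):
    """Encode str as CESU-8 (hypothetical helper, not a stdlib codec)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Non-BMP character: split into a UTF-16 surrogate pair,
            # then encode each surrogate as a 3-byte UTF-8 sequence.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


def mutf8_encode(text):
    """CESU-8 plus the MUTF-8 special case: U+0000 becomes C0 80."""
    # A 0x00 byte can only come from U+0000, so a bytes-level replace is safe.
    return cesu8_encode(text).replace(b"\x00", b"\xc0\x80")


def mutf8_decode(data):
    """Decode MUTF-8 (or CESU-8) bytes back to str; assumes valid input."""
    # Undo the \0 special case; 0xC0 never occurs in valid UTF-8.
    data = data.replace(b"\xc0\x80", b"\x00")
    # Decode, letting individual surrogates through as code points.
    text = data.decode("utf-8", "surrogatepass")
    # Round-trip through UTF-16 so surrogate pairs are joined again.
    return text.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass")
```

For example, mutf8_encode("a\x00b") gives b"a\xc0\x80b", and U+10400 becomes the six bytes ED A0 81 ED B0 80 instead of the four-byte UTF-8 form.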

