
Author taleinat
Recipients erlendaasland, serhiy.storchaka, shreyanavigyan, steven.daprano, taleinat, terry.reedy
Date 2021-05-24.11:49:42
Message-id <1621856982.79.0.610081034982.issue44217@roundup.psfhosted.org>
Content
> It is partially an IDLE issue. The code expects that indices in a Python string correspond to indices in the Tcl string, but this is not true for astral characters, which are encoded as 2 (or maybe even 4) characters in Tcl.

It's not just that: Tk's Text widget gets the indexing within the line itself wrong. In the string from Terry's example, which has 11 characters in a line including three smiley emojis, the characters can be fetched using t.get('1.1'), t.get('1.2'), etc., through t.get('1.11'). t.get('1.12') returns '\n', since it is at or after the end of the line. So, as far as indexing is concerned, each of those emoji characters is treated as a single character.
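
To make the two possible counts concrete, here is a minimal sketch that compares Python's per-code-point length with the number of UTF-16 code units for the same line. The sample string is hypothetical (it is not Terry's actual example, only a line with the same shape: 11 characters, three of them astral emojis), and the helper name utf16_units is made up for illustration. It assumes the "2 characters in Tcl" case means surrogate pairs, which may not hold for every Tcl build.

```python
def utf16_units(s: str) -> int:
    """Return how many UTF-16 code units are needed to encode s (no BOM)."""
    # Each UTF-16 code unit is 2 bytes; astral characters take two units.
    return len(s.encode("utf-16-le")) // 2


# Hypothetical sample line: 8 BMP characters plus 3 astral emojis.
line = "abcdefgh" + "\U0001F600" * 3

python_len = len(line)         # counts code points
utf16_len = utf16_units(line)  # counts UTF-16 code units

print(python_len, utf16_len)
```

If the widget indexed by code units, the last character of such a line would sit at a higher column than Python's count suggests; the t.get() results above instead match one index per emoji.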