Tkinter hangs or crashes when displaying astral chars #86391

Closed
terryjreedy opened this issue Nov 1, 2020 · 37 comments
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes topic-tkinter topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@terryjreedy
Member

BPO 42225
Nosy @terryjreedy, @ronaldoussoren, @ned-deily, @ezio-melotti, @serhiy-storchaka, @miss-islington, @E-Paine
PRs
  • bpo-42225: IDLE - document two unix-related problems. #25078
  • [3.9] bpo-42225: IDLE - document two unix-related problems. (GH-25078) #25105
  • [3.8] bpo-42225: IDLE - document two unix-related problems. (GH-25078) #25106
  • Files
  • fedora32.png
  • Ubuntu-2020.04.png
  • emojis.png
  • Screenshots_128547-128593.pdf: Character output to Ubuntu and Windows
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-03-31.10:10:50.250>
    created_at = <Date 2020-11-01.03:17:12.424>
    labels = ['type-bug', 'expert-tkinter', '3.9', '3.10', '3.8', 'expert-unicode']
    title = 'Tkinter hangs or crashes when displaying astral chars'
    updated_at = <Date 2021-03-31.10:11:24.763>
    user = 'https://github.com/terryjreedy'

    bugs.python.org fields:

    activity = <Date 2021-03-31.10:11:24.763>
    actor = 'terry.reedy'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-03-31.10:10:50.250>
    closer = 'terry.reedy'
    components = ['Tkinter', 'Unicode']
    creation = <Date 2020-11-01.03:17:12.424>
    creator = 'terry.reedy'
    dependencies = []
    files = ['49556', '49557', '49567', '49581']
    hgrepos = []
    issue_num = 42225
    keywords = ['patch']
    message_count = 37.0
    messages = ['380112', '380119', '380137', '380138', '380139', '380140', '380143', '380144', '380146', '380149', '380151', '380173', '380211', '380227', '380260', '380266', '380282', '380283', '380288', '380305', '380393', '380549', '380550', '380551', '380552', '380565', '380573', '380574', '380575', '380716', '389665', '389667', '389677', '389871', '389872', '389875', '389876']
    nosy_count = 9.0
    nosy_names = ['terry.reedy', 'ronaldoussoren', 'wordtech', 'ned.deily', 'ezio.melotti', 'serhiy.storchaka', 'miss-islington', 'epaine', 'IanSt1']
    pr_nums = ['25078', '25105', '25106']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue42225'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @terryjreedy
    Member Author

    On my macOS Mojave, 3.10, echoing '\U0001####' (# = hex digit) or chr(#####) (decimal digits) in IDLE's shell either shows an error box or hangs. On bpo-13153, freezing on macOS was reported for 3.7.6. Until tkinter on Mac works better, we should try to get an error box for all astral chars.

    For an SO questioner with Ubuntu 18.04, now updated to 20.04 with Python 3.8.6, some chars display (128512-128547; 128549-128555; 128557-128576, example chr(128516)) and some 'crash' (example chr(128077)).  I am trying to get 'crash' narrowed down and to find out which tk version Ubuntu uses.
    Serhiy, does >>> chr(128516) echo thumbs up on your Linux?

    The SO crash example works for me on Windows. I should test more codepoints.
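The decimal code points quoted in this report are easier to compare with the hex values used later in the thread once they are converted and named. A minimal sketch using only the standard library (the helper name `describe` is made up for illustration):

```python
import unicodedata

def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME (plane)' for a decimal code point."""
    ch = chr(code_point)
    # Anything above U+FFFF lies outside the Basic Multilingual Plane.
    plane = "astral" if code_point > 0xFFFF else "BMP"
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{code_point:04X} {name} ({plane})"

# The two examples mentioned in the report above.
for cp in (128516, 128077):
    print(describe(cp))
```

This only looks characters up in Python's Unicode database; it never touches Tk, so it is safe to run on a system where displaying the characters crashes.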

    @terryjreedy terryjreedy added OS-mac 3.10 only security fixes topic-tkinter topic-unicode type-bug An unexpected behavior, bug, or error labels Nov 1, 2020
    @serhiy-storchaka
    Member

    I get a crash for chr(128516) ("😄") in Tk.

    $ wish
    % label .l -text 😄
    .l
    % X Error of failed request:  BadLength (poly request too large or internal Xlib length error)
      Major opcode of failed request:  139 (RENDER)
      Minor opcode of failed request:  20 (RenderAddGlyphs)
      Serial number of failed request:  599
      Current serial number in output stream:  599

    @vstinner
    Member

    vstinner commented Nov 1, 2020

    Serhiy:

    > I get a crash for chr(128516) ("😄") in Tk.

    On Linux? What is your Tk version?

    On my Fedora 32, the character is displayed properly. It seems like Tk is still using X11 whereas my GNOME desktop is using Wayland.

    $ ./python -m test.pythoninfo|grep ^tkinter
    tkinter.TCL_VERSION: 8.6
    tkinter.TK_VERSION: 8.6
    tkinter.info_patchlevel: 8.6.10

    @vstinner
    Member

    vstinner commented Nov 1, 2020

    Hum, I didn't explain my test well. I ran:

    ./python -m idlelib

    In the IDLE shell, I wrote chr(0x1F604) which displays the emoji as expected:

    >>> chr(0x1F604)
    '😄'

    @serhiy-storchaka
    Member

    I generated a script for testing all characters:

    with open('withtest.sh', 'w', errors='surrogatepass') as f:
        for i in range(0x100, 0x110000):
            cmd = f"echo 'label .l -text \"{chr(i)}\"; exit' | wish 2>/dev/null"
            print(f"{cmd} && echo OK '\\U{i:08x}' {chr(i)!r} || echo FAIL '\\U{i:08x}' {chr(i)!r}", file=f)

    It takes a long time: it tested around 20% of all characters in 6-7 hours. It seems that all failed characters are colored emojis and all passed characters are non-colored. So it seems related either to the font that provides colored emojis, or to the mechanism that interprets such fonts, or Tk just cannot correctly handle the output when such fonts are used (maybe it reserves too small a buffer or cannot interpret the result code).

    @vstinner
    Member

    vstinner commented Nov 1, 2020

    Serhiy's test also works as expected.

    $ wish
    % label .l -text 😄

    Since Serhiy's test doesn't use Python, is it worth tracking this Tk crash in the Python bug tracker?

    @serhiy-storchaka
    Member

    Yes, on Linux. Ubuntu 20.04. Tk 8.6.10. X.Org X Server 1.20.8.

    I tried to report the bug upstream, but failed. I have not used the Tk bug tracker for several years, and that was on a different computer, so I no longer have the password to my account; when I tried to create new accounts, I could not log in with them either. I tried to write to the mailing list, but it requires subscribing, and after I subscribed I did not receive a confirmation message. If anybody can, please report this bug to the Tk developers.

    @serhiy-storchaka
    Member

    Victor, do you see a color smiling face in my example or monochromatic or just a bar?

    @vstinner
    Member

    vstinner commented Nov 1, 2020

    > Victor, do you see a color smiling face in my example or monochromatic or just a bar?

    See attached screenshot: fedora32.png.

    @serhiy-storchaka
    Member

    It looks different on my computer. I suppose it will crash for you too if you install a color emoji font.

    @ronaldoussoren
    Contributor

    The error on Linux could be related to this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1498269

    @terryjreedy
    Member Author

    In IDLE on Windows the following prints the first 3 astral planes in a couple of minutes.

    for i in range(0x10000, 0x40000, 32):
        chars = ''.join(chr(i+j) for j in range(32))
        print(hex(i), chars)

    Perhaps half of the assigned chars in the first plane are printed instead of being replaced with a narrow box. This includes emoticons as foreground color outlines on background color. Maybe all of the second plane of extended CJK chars are printed. The third plane is unassigned and prints as unassigned boxes (with an X).

    Fixing OS graphics or tk is out of scope for us. Preventing hangs or crashes when using tkinter is. On Mac, refusing to insert any astral char into a tk widget might be the best solution. Serhiy, could that be done in tkinter/_tkinter?

    On Linux, the situation appears to be more complex. The SO questioner
    https://stackoverflow.com/questions/64615570/why-do-some-emoticons-cause-python-idle-to-crash-on-ubuntu
    could print the two multicolor 'grinning face with smiling eyes' 😄, which fails for Serhiy, but not the simpler thumbsup 👍. I don't know if we can detect fonts that cause crashes.
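The idea of refusing to insert astral characters could be sketched at the Python level as a filter applied before text reaches a widget. This is only an illustration under stated assumptions: `escape_astral` is a hypothetical helper, not something tkinter or IDLE provides, and it replaces each non-BMP character with its `\U` escape rather than raising an error.

```python
import re

# Matches any character outside the Basic Multilingual Plane.
_ASTRAL = re.compile(r"[\U00010000-\U0010FFFF]")

def escape_astral(s: str) -> str:
    """Replace each non-BMP character with its \\UXXXXXXXX escape."""
    return _ASTRAL.sub(lambda m: f"\\U{ord(m.group()):08x}", s)

# BMP characters pass through untouched; only the astral one is escaped.
print(escape_astral("ok \U0001F604 \u2705"))
```

A caller would apply it just before `text.insert(...)`. A filter like this does nothing for BMP characters that a color font renders, so it would not cover every crash.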

    @vstinner
    Member

    vstinner commented Nov 2, 2020

    > Fixing OS graphics or tk is out of scope for us. Preventing hangs or crashes when using tkinter is. On Mac, refusing to insert any astral char into a tk widget might be the best solution. Serhiy, could that be done in tkinter/_tkinter?

    I dislike attempting to work around Tk issues in Python. As you can see, the behavior really depends on the platform. As I wrote, on Fedora 32 it works (the character is rendered properly). I would prefer not to block such characters on Fedora 32 just because they crash on some other platforms.

    Or you should detect the very precise conditions explaining why it works on some platforms and crashes on some other platforms...

    @E-Paine
    Mannequin

    E-Paine mannequin commented Nov 2, 2020

    For me, this is not limited to special characters. Trying to load anything in Tk using the 'JoyPixels' font crashes (sometimes it does load but all characters are very random - most are whitespace - and it crashes again after a call to fc-cache). IDLE crashes when trying to preview the font.

    I believe this is what is being experienced on https://askubuntu.com/questions/1236488/x-error-of-failed-request-badlength-poly-request-too-large-or-internal-xlib-le because they are not using any special characters yet are reporting the same problem.

    @E-Paine E-Paine mannequin removed OS-mac labels Nov 2, 2020
    @terryjreedy
    Member Author

    Victor, does my test run to completion (without exception) on your Fedora? If it does, I definitely would not disable astral char display on Fedora. This version catches exceptions and reports them separately and runs directly with tkinter, in about a second.

    tk = True
    if tk:
        from tkinter import Tk
        from tkinter.scrolledtext import ScrolledText
        root = Tk()
        text = ScrolledText(root, width=80, height=40)
        text.pack()
        def print(txt):
            text.insert('insert', txt+'\n')
    
    errors = []
    for i in range(0x10000, 0x40000, 32):
        chars = ''.join(chr(i+j) for j in range(32))
        try:
            print(f"{hex(i)} {chars}")
        except Exception as e:
            errors.append(f"{hex(i)} {e}")
    print("ERRORS:")
    for line in errors:
        print(line)

    @serhiy-storchaka
    Member

    It works on Ubuntu if I uninstall the color emoji font (package fonts-noto-color-emoji).

    @vstinner
    Member

    vstinner commented Nov 3, 2020

    The following program fails with:
    ---
    X Error of failed request: BadLength (poly request too large or internal Xlib length error)
    Major opcode of failed request: 138 (RENDER)
    Minor opcode of failed request: 20 (RenderAddGlyphs)
    Serial number of failed request: 4248
    Current serial number in output stream: 4956
    ---

    Python program:
    ---

    from tkinter import Tk
    from tkinter.scrolledtext import ScrolledText
    root = Tk()
    text = ScrolledText(root, width=80, height=40)
    text.pack()
    
    for i in range(0x10000, 0x40000, 32):
        chars = ''.join(chr(i+j) for j in range(32))
        text.insert('insert', f"{hex(i)} {chars}\n")
    
    input("Press enter to exit")

    It seems like the first character which triggers this RenderAddGlyphs BadLength issue is U+1f6c2. See the attached emojis.png screenshot. As you can see, some emojis are rendered in color in GNOME Terminal. I guess that it uses the Gtk 3 pango library to render these characters.

    @vstinner
    Member

    vstinner commented Nov 3, 2020

    > This version catches exceptions and reports them separately and runs directly with tkinter, in about a second.

    The X Error is displayed and then the process exits. Python cannot catch this fatal X Error.
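Because the X error kills the whole process, a try/except in the same interpreter never sees it. One way to observe such a failure without losing the parent process is to render in a child interpreter and inspect its exit status. The following is a sketch under assumptions: `runs_cleanly`, `PROBE`, and `can_render` are hypothetical names, and whether a single `root.update()` is enough to surface the X error on a given system is not something this thread confirms.

```python
import subprocess
import sys

def runs_cleanly(source: str, *args: str, timeout: float = 10.0) -> bool:
    """Run `source` in a fresh interpreter; True only if it exits with status 0."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source, *args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # a hang counts as a failure too
    return proc.returncode == 0

# Hypothetical probe script: show one character in a label, then quit.
PROBE = """
import sys, tkinter
root = tkinter.Tk()
tkinter.Label(root, text=chr(int(sys.argv[1]))).pack()
root.update()
root.destroy()
"""

def can_render(code_point: int) -> bool:
    """Return True if the child process displayed the character and exited normally."""
    return runs_cleanly(PROBE, str(code_point))
```

Calling `can_render(0x1F604)` needs a working display and starts one interpreter per character, so it is far too slow for scanning whole planes; it only shows that the crash can be detected from outside the failing process.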

    @ronaldoussoren
    Contributor

    @kevin Walzer: Is the problem we're seeing a known issue with Tk?

    @wordtech
    Mannequin

    wordtech mannequin commented Nov 4, 2020

    Some work has been done this year on expanding support for these types of glyphs in Tk, but I'm not sure of its current state--it's not my area of expertise. Can you open a ticket at https://core.tcl-lang.org/tk/ so one of the folks working on this can take a look?

    @terryjreedy
    Member Author

    Kevin, Serhiy tried to report this upstream but failed. msg380143.
    Perhaps you could.

    One person running my test program reported
    """
    Fedora 32 x86-64
    Cinnamon 4.6.7
    Linux 5.8.16-200.fc32.x86_64
    Python 3.8.6 (default, Sep 25 2020, 00:00:00)
    [GCC 10.2.1 20200723 (Red Hat 10.2.1-1)] on linux

    Running line-by-line in terminal, the for-loop crashes with:
    <<<
    X Error of failed request: BadLength (poly request too large or internal Xlib length error)
    Major opcode of failed request: 138 (RENDER)
    Minor opcode of failed request: 20 (RenderAddGlyphs)
    Serial number of failed request: 3925
    Current serial number in output stream: 4865
    """

    Another reported "Seems to produce garbage on my system:
    [ads@ADS4 x]$ uname -a
    Linux ADS4 5.8.17-100.fc31.x86_64 #1 SMP Thu Oct 29 18:58:48 UTC 2020
    x86_64 x86_64 x86_64 GNU/Linux"

    But the program ran to completion without errors. A copy of the output from the window was attached. I have asked for the tcl/tk version. My response included:
    """
    On *nix, Python (unicode) chars are utf-8 encoded by _tkinter for tk. The encoding of astral non-BMP chars uses 4 bytes. Perhaps tk on your ADS Linux (new to me) displays the 4 bytes as 4 chars instead of 1. For each block of 32, the first 3 are the same. This is true in this file, but easily seeing this depends on the display software.

    I don't know what you saw, but Notepad++ displays control chars with the high bit set (C1 controls) as their reversed type (white on black) 3 char acronym as defined on
    https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block) Character table.

    Thus the first astral U+10000 is encoded as b"\xF0\x90\x80\x80". In Notepad++, what is in the file appears as 4 characters, not 1, displayed 'ðDCSPADPAD', with the part after ð being the correct white on black triplets for code points U+90 and U+80. The first char '\xf0' == 'ð' is the same for all quadruples shown by Notepad++. The next 3 vary as appropriate. In some cases, all 4 are normal printable chars, such as 0x29aa0, a CJK char, showing as "𩪠"

    If I cut the first 4 chars from Notepad++ to Thunderbird the result is "ð���". I see only ð but the presence of 3 0-width chars is revealed by moving through the string with arrow keys.
    """
    Here on Firefox the C1 controls, invisible in Thunderbird, display as squares with digits 0090, 0080 in two rows. Serhiy probably understands these reports better than I do. The tk on this ADS4 Linux seems to be doing something like what Serhiy described as "Tcl fails to decode the string from UTF-8 and falls back to Latin1" before his _tkinter fix.
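The "4 chars instead of 1" effect described above can be reproduced in plain Python by encoding an astral character as UTF-8 and then decoding the bytes as Latin-1. This is a minimal sketch of the encoding arithmetic only; the helper name `as_latin1` is made up, and the snippet does not involve Tcl/Tk.

```python
def as_latin1(ch: str) -> str:
    """Decode the UTF-8 bytes of `ch` as Latin-1, one character per byte."""
    return ch.encode("utf-8").decode("latin-1")

first_astral = "\U00010000"
print(first_astral.encode("utf-8"))        # b'\xf0\x90\x80\x80'

mis = as_latin1(first_astral)
print(len(first_astral), len(mis))         # 1 4
print([f"U+{ord(c):04X}" for c in mis])    # ['U+00F0', 'U+0090', 'U+0080', 'U+0080']
```

The leading byte 0xF0 maps to 'ð', and the two distinct continuation bytes land on the C1 controls U+0090 and U+0080, which matches the 'ð' followed by DCS, PAD, PAD that Notepad++ showed.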

    As far as IDLE and Linux are concerned, I am just going to consider what to change or add in "User output in Shell" in the IDLE doc.

    @IanSt1
    Mannequin

    IanSt1 mannequin commented Nov 8, 2020

    Further to the information I posted on Stack Overflow (referred to above) relating to reproducing emoticon characters from Idle under Ubuntu, I have done more testing. Based on some of the code/comments above, I tried modifications which I hoped might identify errors before Idle crashed.
    At a simple level I can generate some error information in a Ubuntu terminal from the following.
    usr/bin$ idle-python3.8
    Entering chr(0x1f624) gives the following error message in terminal.
    X Error of failed request: BadLength (poly request too large or internal Xlib length error)
    Major opcode of failed request: 139 (RENDER)
    Minor opcode of failed request: 20 (RenderAddGlyphs)
    Serial number of failed request: 4484
    Current serial number in output stream: 4484

    Another test used this code.
    --------------

    def FileSave(sav_file_name,outputstring):
        with open(sav_file_name, "a", encoding="utf8",newline='') as myfile:
            myfile.write(outputstring)
    
    def FileSave1(sav_file_name,eoutputstring):
        with open(sav_file_name, "a", encoding="utf8",newline='') as myfile:
            myfile.write(eoutputstring)
    
    tk = True
    if tk:
        from tkinter import Tk
        from tkinter.scrolledtext import ScrolledText
        root = Tk()
        text = ScrolledText(root, width=80, height=40)
        text.pack()
        def print1(txt):
            text.insert('insert', txt+'\n')
    
    errors = []
    outputstring = "Characters:"+ "\n"+"\n"
    eoutputstring = "Errors:"+ "\n"+"\n"
    
    #for i in range(0x1f600, 0x1f660):   #crashes at 0x1f624
    for i in range(0x1f623, 0x1f624):  # 1f624, 1f625 then try 1f652  
        chars = chr(i)
        decimal = str(int(hex(i)[2:],16))
        try:
            outputstring = str(hex(i))+" "+decimal+" "+chars+ "\n"
            FileSave("Charsfile.txt", outputstring)
            print1(f"{hex(i)} {decimal} {chars}")
            print(f"{hex(i)} {decimal} {chars}")
        except Exception as e:
            print(str(hex(i)))
            eoutputstring = str(hex(i))+ "\n"
            FileSave1("Errorfile.txt", eoutputstring)
            errors.append(f"{hex(i)} {e}")
    
    print("ERRORS:")
    
    for line in errors:
        print(line)

    With the range starting at 0x1f623 and changing the end point, in Ubuntu, with end point 0x1f624, this prints ok, but if higher numbers are used the Idle windows all closed. However on some occasions, if I began with end point at 0x1f624 and run, then without closing the editor window I increased the end point to 0x1f625, save and run, the Text window would close, but the console window would remain open. I could then increase the upper range further and repeat and more characters would print to the console.
    I have attached screenshots of the console output with the fonts-noto-color-emoji fonts package installed(with font), then with this package uninstalled (no font) and finally the same when run under Windows 10.
    For the console output produced while the font package is installed, if I select in the character column where there is a blank space, "something" can be selected. If I save the console as a text file or select all the rows, copy and paste to a text file, the missing characters are revealed. When the font package is uninstalled, the missing characters are truly missing. It is the apparently missing characters (such as 0x1f624, 0x1f62c, 0x1f641, 0x1f642, 0x1f644-0x1f64f) which appear to be causing the Idle crashes. Presumably codes such as 0x1f650 and 0x1f651 are unallocated, so they show up as rectangular outlines.

    In none of the tests with the more complex code above did I manage to generate any error output.

    My set up is as follows.
    Ubuntu 20.04.1 LTS
    x86_64
    GNOME version: 3.36.3
    Python 3.8.6 (default, Sep 25 2020, 21:22:01)
    Tk version: 8.6.10
    [GCC 7.5.0] on linux

    Hopefully, the above might give some pointers to handling these characters.

    @IanSt1 IanSt1 mannequin added 3.8 only security fixes and removed 3.10 only security fixes labels Nov 8, 2020
    @ronaldoussoren
    Contributor

    I've filed a Tk issue about this: https://core.tcl-lang.org/tk/tktview/f9fa926666d8e06972b5f0583b07a3c98eaac0a0

    What versions of Tk are used?

    • On macOS I've tested with the Python.org installer, which uses Tk 8.6.8.

    @IanSt1
    Mannequin

    IanSt1 mannequin commented Nov 8, 2020

    On Ubuntu, Tk version is showing as 8.6.10
    On Windows 10, Tk version is showing as 8.6.9

    @ronaldoussoren
    Contributor

    The crash I had on macOS with tk 8.6.8 appears to be gone when using tk 8.6.10.

    What I got back was a SyntaxError when pasting a smiley emoji in an IDLE shell window and trying to execute print("😀"). The SyntaxError message says: 'utf-8' codec can't encode characters in position 7-12: surrogates not allowed. That's likely due to how Tk represents this character in its text widget, and is something we could work around when converting Tcl/Tk strings to Python strings.

    Printing the emoji using 'print(chr(128516))' works fine.

    The scriptlet in msg380173 also works.

    @terryjreedy
    Member Author

    Serhiy, does Ronald's report above re 8.6.10 on macOS suggest what might be needed to make print("😀") work on Mac? As I remember, your year-old _tkinter patch to make print(<astral>) work on Linux and Windows converts Python strings differently on the two systems. But you did not know for sure what to do for macOS because nothing would work.

    @ronaldoussoren
    Contributor

    Note that the main installers for Python 3.8 and 3.9 will continue to use Tk 8.6.8 due to problems when building later Tk versions on macOS 10.9.

    The current plan is to add an installer variant that (amongst others) uses Tk 8.6.10 (and .11 when that's released).

    @ronaldoussoren
    Contributor

    W.r.t. the SyntaxError I got (msg380552): It looks like it will be possible to work around that problem in _tkinter.c:unicodeFromTclStringAndSize by merging surrogate pairs.
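What "merging surrogate pairs" means can be shown at the Python level, even though the workaround discussed here would live in C inside _tkinter.c. The sketch below is not that implementation; `merge_surrogates` is a hypothetical helper that round-trips the string through UTF-16 with the `surrogatepass` error handler so that a high/low surrogate pair becomes the single astral character it stands for.

```python
def merge_surrogates(s: str) -> str:
    """Combine UTF-16 surrogate pairs in `s` into single astral characters."""
    # surrogatepass lets the encoder emit the surrogate code units as-is;
    # the UTF-16 decoder then reads each high/low pair as one code point.
    return s.encode("utf-16", "surrogatepass").decode("utf-16")

pair = "\ud83d\ude00"              # U+1F600 as two surrogate code points
merged = merge_surrogates(pair)
print(len(pair), len(merged))      # 2 1
print(merged == "\U0001F600")      # True
```

The unmerged string cannot be encoded as UTF-8, which is the kind of "surrogates not allowed" failure reported above. A lone, unpaired surrogate would make the decode step of this sketch fail, so real code would need to decide how to handle that case.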

    @serhiy-storchaka
    Member

    Please open a new issue for "surrogates not allowed".

    @ronaldoussoren
    Contributor

    I've filed bpo-42318 about the surrogate pairs error I mention in msg380552.

    @terryjreedy
    Member Author

    I closed python/issues-test-cpython#43647 as a duplicate of this.  It reported that BMP chars can fail also.  For instance, with "Noto Sans Mono", but not 'Dejavu Mono', the following crash.
    >>> '\u2705'
    '✅'
    >>> '\u270f'
    '✏'

    Unfortunately, at least on some *nix, the default tkFixedFont resolves to Noto Sans Mono.

    @serhiy-storchaka
    Member

    At least it is not a regression caused by support of astral characters (bpo-13153).

    @terryjreedy
    Member Author

    No, seems strictly a matter of complicated color, which is perhaps becoming more common. Firefox colors the checkbox (white checkmark on green field in a largish black square) but not the (smaller) pencil. I did not recognize either the FF or tk Windows pencil as a pencil without any color (so I searched), so I won't be surprised if FF upgrades its pencil too.

    @terryjreedy
    Member Author

    New changeset 1b4a9c7 by Terry Jan Reedy in branch 'master':
    bpo-42225: IDLE - document two unix-related problems. (GH-25078)
    1b4a9c7

    @terryjreedy
    Member Author

    On macOS with 3.10.0a, 8.6.11 appears to fix this issue.
    >>> chr(128516)
    "😄"

    For IDLE, I am adding a paragraph to the doc. I will then close this issue as 'fixed' (insofar as we can for what is a 3rd party failure).

    @miss-islington
    Contributor

    New changeset e92923b by Miss Islington (bot) in branch '3.8':
    bpo-42225: IDLE - document two unix-related problems. (GH-25078)
    e92923b

    @miss-islington
    Contributor

    New changeset 84694c3 by Miss Islington (bot) in branch '3.9':
    bpo-42225: IDLE - document two unix-related problems. (GH-25078)
    84694c3

    @terryjreedy terryjreedy added 3.9 only security fixes 3.10 only security fixes labels Mar 31, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022