classification
Title: tkFont.py assumes that all font families are encoded as ascii in Python 2.7
Type: behavior Stage: resolved
Components: Tkinter Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: culler, serhiy.storchaka, terry.reedy
Priority: normal Keywords: patch

Created on 2017-05-08 21:42 by culler, last changed 2017-05-27 14:01 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
JapanesePythonBug.png culler, 2017-05-08 21:44
tkFont.patch culler, 2017-05-10 02:09 patch for tkFont.py
Pull Requests
URL Status Linked
PR 1567 merged serhiy.storchaka, 2017-05-13 05:29
PR 1832 merged serhiy.storchaka, 2017-05-27 13:36
Messages (8)
msg293256 - (view) Author: Marc Culler (culler) * Date: 2017-05-08 21:42
And that is a very bad assumption. On Windows 10 in the Japanese locale the default TkFixedFont has family u'\uff2d\uff33 \u30b4\u30b7\u30c3\u30af' (a transliteration of MS Gothic).

The error occurs on line 51:

     47     def _set(self, kw):
     48         options = []
     49         for k, v in kw.items():
     50             options.append("-"+k)
>>>> 51             options.append(str(v))
     52         return tuple(options)

I will attach a screenshot showing the crash on a Japanese Windows 10 system, running in the Python 2.7 command line application.
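
The failure on line 51 can be reproduced without Tk. The sketch below is an illustration only: Python 2.7's str() implicitly encodes a unicode object with the ascii codec, so an explicit .encode("ascii") stands in for it here and behaves the same way on Python 3. The helper name ascii_encoding_fails is hypothetical and is not part of tkFont.

```python
# -*- coding: utf-8 -*-
# Illustration only: mimics the implicit ASCII encoding that Python 2.7
# applies when str() is called on a unicode object.

# Font family name quoted in the report (fullwidth "MS" plus katakana).
family = u"\uff2d\uff33 \u30b4\u30b7\u30c3\u30af"


def ascii_encoding_fails(value):
    """Return True if encoding `value` as ASCII raises UnicodeEncodeError."""
    try:
        value.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


japanese_fails = ascii_encoding_fails(family)    # non-ascii family name
courier_fails = ascii_encoding_fails(u"Courier")  # plain ascii family name
```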
msg293369 - (view) Author: Marc Culler (culler) * Date: 2017-05-10 02:09
The attached patch simply decodes string options passed to the Font._set() method using the utf8 codec. Other options (which will be numbers) are converted to ascii strings, as currently happens.

This makes it possible to use the Font.copy() method without raising an exception when the font family name is not ascii encoded.
msg293572 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-12 21:20
Marc, in the future please copy and paste such text interactions instead of posting an image.

A 'crash' on Windows is when one gets the Windows error message box 'Your application has stopped working.'

Since the exception comes from trying to encode non-ascii unicode to bytes, I would expect the condition in the fix to be
+            if isinstance(v, unicode):
+                options.append(v.encode('utf8'))
instead of the str/decode version in your patch.
If v is already a string (of ascii bytes), I think it should just be appended as is.  Anyway, Serhiy can decide what to do.
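
A minimal sketch of the encode-based variant suggested above, written as a standalone function so that it runs without Tk. The names text_type and build_options_encoding are hypothetical; text_type is unicode on Python 2.7 and str on Python 3. This is not the attached patch and not the change that was eventually merged.

```python
# -*- coding: utf-8 -*-
# Sketch only: encode unicode values as utf8, pass byte strings through
# unchanged, and convert everything else (e.g. numbers) with str().

text_type = type(u"")  # `unicode` on Python 2.7, `str` on Python 3


def build_options_encoding(kw):
    """Build a flat ("-key", value, ...) tuple as Font._set() does."""
    options = []
    for k, v in kw.items():
        options.append("-" + k)
        if isinstance(v, text_type):
            options.append(v.encode("utf8"))  # unicode -> utf8 bytes
        elif isinstance(v, bytes):
            options.append(v)                 # already a byte string
        else:
            options.append(str(v))            # numbers and similar
    return tuple(options)


family = u"MS \u30b4\u30b7\u30c3\u30af"
family_options = build_options_encoding({"family": family})
size_options = build_options_encoding({"size": 20})
```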
msg293589 - (view) Author: Marc Culler (culler) * Date: 2017-05-12 22:41
The name of a Tk font family is a byte sequence obtained from the operating system.  But, this being Python 2.7, there is no distinction between the str type and the bytes type.  The byte sequence is definitely not ascii encoded on a Japanese Windows system.  It is a utf8-encoded byte string.  This is why I called v.decode('utf8') in my patch.  Note that this bug does not occur with Python 3.6.
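
A small sketch of the byte-string view described above, assuming (as this message states) that the family name arrives as utf8-encoded bytes. The names are illustrative, and the bytes are produced here by encoding the family from the original report rather than being read from Tk.

```python
# -*- coding: utf-8 -*-
# Sketch only: the same family name as text and as utf8-encoded bytes.

family_text = u"\uff2d\uff33 \u30b4\u30b7\u30c3\u30af"
family_bytes = family_text.encode("utf8")  # `str` on 2.7, `bytes` on 3.x


def decodes_as(data, codec):
    """Return True if `data` can be decoded with `codec`."""
    try:
        data.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


ascii_ok = decodes_as(family_bytes, "ascii")  # bytes are not ascii
utf8_ok = decodes_as(family_bytes, "utf8")    # but they are valid utf8
round_trip = family_bytes.decode("utf8") == family_text
```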

Terry, I understand that text is better and I hope I never have to resort to an image again.  Since I don't speak Japanese myself, even setting up a Japanese Windows VM for testing was pretty challenging for me.  I was able to take a screenshot without having to translate any Japanese menus, so I took that shortcut.  Sorry about that.

This report was indeed triggered by a real "crash" in your sense.  It occurred in a GUI application bundled with pyinstaller.  Any unhandled exception in a pyinstaller app results in termination of the process and posts a Windows error message box saying 'Failed to execute script XXX'.  For the report, however, I was isolating the underlying unhandled exception in Font.copy() that had caused the real crash of the GUI application.
msg293594 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-13 05:24
Since the source of the unicode font family is Tkinter, I think it is better not to encode it in Python, but to pass it as unicode and let Tkinter convert the unicode value to a Tcl value.
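
A sketch of the pass-through idea, again as a standalone function so that it runs without Tk. The names string_types and build_options_passthrough are hypothetical; string_types plays the role of basestring on Python 2.7. This only illustrates the approach and should not be read as the merged changeset.

```python
# -*- coding: utf-8 -*-
# Sketch only: leave string values (byte or unicode) untouched and call
# str() only on non-string values such as numbers.

string_types = (type(u""), bytes)  # (unicode, str) on Python 2.7


def build_options_passthrough(kw):
    """Build a flat ("-key", value, ...) tuple without re-encoding strings."""
    options = []
    for k, v in kw.items():
        if not isinstance(v, string_types):
            v = str(v)  # numbers and similar become strings
        options.append("-" + k)
        options.append(v)  # strings are handed on unchanged
    return tuple(options)


family = u"MS \u30b4\u30b7\u30c3\u30af"
family_options = build_options_passthrough({"family": family})
size_options = build_options_passthrough({"size": 20})
```

With this shape the conversion of a unicode value to a Tcl value is left to the layer that receives the tuple, which is the point of the suggestion.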
msg293597 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-13 06:02
Test needed.

import Tkinter as tk
import tkFont as tkf

root = tk.Tk()
font = tkf.Font(root, size=20, family=u"MS \u30b4\u30b7\u30c3\u30af")

reproduces the failure in Marc's example.

  File "C:\Programs\Python27\lib\lib-tk\tkFont.py", line 74, in __init__
    font = self._set(options)
  File "C:\Programs\Python27\lib\lib-tk\tkFont.py", line 51, in _set
    options.append(str(v))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3-6: ordinal not in range(128)

After patching my installed 2.7.13, the code runs without exception and the font is created, with the unrecognized family replaced by Arial.
msg294531 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-26 05:15
New changeset 96f502059717a692ca3abd968b26c5ea2918ad3a by Serhiy Storchaka in branch '2.7':
[2.7] bpo-30310: tkFont now supports unicode options (e.g. font family). (#1567)
https://github.com/python/cpython/commit/96f502059717a692ca3abd968b26c5ea2918ad3a
msg294588 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-27 14:01
New changeset a92adf8f0782a1ccdc68942767bdb357a9281b30 by Serhiy Storchaka in branch 'master':
bpo-30310: Add a test for non-ascii font family. (#1567) (#1832)
https://github.com/python/cpython/commit/a92adf8f0782a1ccdc68942767bdb357a9281b30
History
Date User Action Args
2017-05-27 14:01:33 serhiy.storchaka set messages: + msg294588
2017-05-27 13:36:37 serhiy.storchaka set pull_requests: + pull_request1915
2017-05-26 05:16:59 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2017-05-26 05:15:53 serhiy.storchaka set messages: + msg294531
2017-05-13 06:02:55 terry.reedy set messages: + msg293597
2017-05-13 05:29:36 serhiy.storchaka set pull_requests: + pull_request1662
2017-05-13 05:24:46 serhiy.storchaka set messages: + msg293594
2017-05-12 22:41:56 culler set messages: + msg293589
2017-05-12 21:20:48 terry.reedy set nosy: + terry.reedy, serhiy.storchaka; messages: + msg293572; type: crash -> behavior; stage: patch review
2017-05-10 02:09:07 culler set files: + tkFont.patch; keywords: + patch; messages: + msg293369
2017-05-08 21:44:22 culler set files: + JapanesePythonBug.png
2017-05-08 21:42:16 culler create