
classification
Title: Idle - incorrectly displaying a character (Latin capital letter sharp s)
Type: Stage:
Components: IDLE, Tkinter, Unicode Versions: 3rd party
process
Status: closed Resolution: works for me
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, loewis, vbr
Priority: normal Keywords:

Created on 2008-11-07 20:31 by vbr, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
idle-capital-sharp-s.jpg vbr, 2008-11-07 20:31 screenshot of an Idle session in 3.0rc2 showing an incorrectly displayed character Latin capital letter sharp s - U+1E9E
capital-sharp-s-TCL-Idle.jpg vbr, 2008-11-08 18:23 "capital sharp s" in TCL wish and Python Idle
Messages (8)
msg75613 - (view) Author: Vlastimil Brom (vbr) Date: 2008-11-07 20:31
While experimenting with the new unicodedata for version 5.1 (many 
thanks for it!) I discovered some strange behaviour of Idle with regard 
to a character not available in any font on my system, namely Latin 
capital letter sharp s - U+1E9E.
Cf. the following sessions:

Python 3.0rc2 (r30rc2:67141, Nov  7 2008, 11:43:46) [MSC v.1500 32 bit 
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
...    
IDLE 3.0rc2      

>>> print("\N{LATIN CAPITAL LETTER SHARP S}")
ẞ
>>> print("\N{LATIN CAPITAL LETTER S WITH CEDILLA}")
Ş
>>> print("\N{PHAGS-PA LETTER KA}")
ꡀ
>>> print("\ufff0")
￰
>>> hex(ord("ẞ"))
'0x1e9e'
>>> hex(ord("Ş"))
'0x15e'
>>> 

Of course, the exact view cannot be copied, but basically I see very 
similar glyphs for the first two characters, while I had expected a 
"square"-sign or something for the first one; this is what I get with 
other surely unavailable glyph as well as a non existent character. See 
the attached screenshot.

However, the characters remain clearly distinguished, as can be seen 
e.g. after copying them as a parameter of ord(...).
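
The same check can be written as a short script. This is only a minimal sketch, assuming a Python 3 whose unicodedata is based on Unicode 5.1 or later; the helper name `describe` is made up for illustration and is not part of any library:

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# The two characters that look alike in the Idle window.
for ch in ("\u1e9e", "\u015e"):
    print(describe(ch))
```

Whatever glyph the font machinery ends up drawing, the two code points and names stay distinct, which suggests the confusion is purely on the rendering side.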

Python 2.6 behaves the same way:
=======================
Python 2.6 (r26:66721, Oct  2 2008, 11:35:03) [MSC v.1500 32 bit 
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
...    
IDLE 2.6      
>>> print u"\N{LATIN CAPITAL LETTER SHARP S}"
ẞ
>>> 

...
==============================================

Not that it is very important, but I found it a bit surprising. I'm 
using WinXPh SP3 Czech.
msg75619 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-07 22:18
Idle does indeed seem to work hard to find a font that can display the 
character.
On my machine (WinXP sp3, French) there are many fonts, but only "Myriad 
Web" can display \N{LATIN CAPITAL LETTER SHARP S}.

I was surprised that the "Character Map" application does not display 
it. To find which font was used to display this character, I used 
Microsoft Word: paste the character (from Idle), select the text, open 
the Font chooser dialog box, and scroll through the fonts until it 
displays correctly in the "Preview" pane.
msg75621 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-11-07 23:04
IDLE, in itself, doesn't do anything special to render that character.
It passes it on to Tk.

I don't know how precisely Tk works in this case, but it might be that
it doesn't do anything special with the character *either*, but passes
it on to Windows.

FWIW, in this bug report, Firefox/Iceweasel, on Debian, renders the
character as a small sharp s. This rendition appears to be correct, as
the letter is supposed to look like the small letter, in principle.

I can't see why Amaury says it works for him (there is surely something
odd in the OP's system), but I'm fairly confident that there is no bug in
Python here. In particular, the unicodedata module has nothing to do
with it.
msg75624 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-11-08 00:12
My mistake, I did not know that "SHARP S" stands for the "Eszett" German 
letter.

There is a font (Myriad Web on my system) that proposes a (wrong) glyph 
for this code point. According to
http://www.tcl.tk/software/tcltk/whatsnew.tml#i18n
"""
Tk guarantees to find a way to display any Unicode character regardless 
of the font you selected, as long as there is some font in the system 
that contains the Unicode character
"""
but Tk does not guarantee that the font will display the correct 
character...
msg75629 - (view) Author: Vlastimil Brom (vbr) Date: 2008-11-08 08:16
I'm aware that this isn't an issue with unicodedata; that was just how I 
came to try such a "modern" Unicode character.
I also see that Tk works pretty well at finding an appropriate font 
(e.g. compared to wx, which I use more often) - it took me quite some 
time to find another clearly unavailable character - PHAGS-PA 
LETTER ... :-), which however behaves as I'd expect - a square is 
displayed.

Regarding capital eszett/sharp s missing from Windows' charmap, this isn't 
anything special, as its character database seems to be quite ancient 
(according to http://www.babelstone.co.uk/Software/BabelMap.html it 
might be Unicode 2.0, instead of the current 5.1). Capital sharp s was 
added in this latest version of the Unicode standard - 5.1. (The 
mentioned tool also has a feature to find a font containing a given 
Unicode block.)
In the Unicode database, there is a "cross-reference" to the small sharp 
s, but I'm not sure how this info should be interpreted; it's a 
different item than the plain lower/capital pair.
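
One way to see why this is not a plain lower/capital pair is to look at the case mappings directly. The following is a sketch only, assuming a recent Python 3 with full Unicode case mapping (not the 3.0 release discussed in this report); the constant names are invented for the example:

```python
import unicodedata

CAPITAL_SHARP_S = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S
SMALL_SHARP_S = "\u00df"    # LATIN SMALL LETTER SHARP S

print(unicodedata.name(CAPITAL_SHARP_S))

# The capital letter lowercases to the small sharp s ...
print(CAPITAL_SHARP_S.lower() == SMALL_SHARP_S)

# ... but the small letter does not uppercase back to U+1E9E;
# the default uppercase mapping expands it to two letters.
print(SMALL_SHARP_S.upper())
```

So the mapping is one-way: lowercasing U+1E9E reaches U+00DF, while uppercasing U+00DF yields "SS" rather than the new capital letter.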

However, is there something I could look into to check what might be 
misconfigured on my system?
msg75632 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-11-08 12:15
On my system, a square box is drawn indeed.

First, I would like to confirm that this is not a bug in Python. Can you
please install Tcl 8.5 separately, run wish, and execute

   label .l -text "\u1e9e"
   pack .l

IIUC, Tk will try to find a font that contains the character. First,
there is a list of fallback fonts per family. If none supports the
character, there is a global fallback list. If the character is still
not found, it will enumerate all fonts in the system, and invoke
GetFontData, asking for the resource 0x636d6170 - this should give the
list of all characters supported in the font.
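
The lookup order described above can be sketched in Python. This is an illustration only, not Tk's actual code: the function `find_font_for_char`, the font names, and the coverage table are all hypothetical stand-ins, with a plain set of code points playing the role of each font's character map. The one verifiable detail is that 0x636d6170, read as four big-endian ASCII bytes, spells the table tag "cmap".

```python
CMAP_TAG = 0x636D6170

def tag_to_ascii(tag):
    """Decode a 32-bit table tag into its four ASCII characters."""
    return tag.to_bytes(4, "big").decode("ascii")

def find_font_for_char(ch, family_fallbacks, global_fallbacks, all_fonts, coverage):
    """Return the first font that claims to cover ch, or None.

    coverage maps a font name to the set of code points it claims to
    support (a stand-in for reading the font's character map).
    """
    for font in [*family_fallbacks, *global_fallbacks, *all_fonts]:
        if ord(ch) in coverage.get(font, set()):
            return font
    return None

# Hypothetical fonts and coverage, for illustration only.
coverage = {
    "FontA": {0x41, 0x42},
    "FontB": {0x15E},
    "FontC": {0x1E9E},
}

print(tag_to_ascii(CMAP_TAG))
print(find_font_for_char("\u1e9e", ["FontA"], ["FontB"], ["FontA", "FontB", "FontC"], coverage))
```

Note that a search like this trusts each font's claim: a font that lists a code point but maps it to the wrong glyph would be picked ahead of a later, correct font.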

It would be useful to find out what specific font Tk has chosen.
Unfortunately, there seems to be no direct way to find out. In the
Tktest Tcl extension, there is a command "testfont subfonts <fontname>"
which you can use to find out what subfonts have been loaded; this might
give a clue what subfont was used.

If you are willing to recompile Tk, you can augment
tkWinFont.c:FindSubFontForChar to print a message when this specific
char gets looked up, and what the resulting subfont was.
msg75641 - (view) Author: Vlastimil Brom (vbr) Date: 2008-11-08 18:23
I can confirm that Tcl displays the same character as Idle, hence it 
isn't a bug in Python (cf. the screenshot).
Unfortunately, I couldn't identify the font used here; I'm not able to 
modify and recompile Tk, as suggested, but I tried to check the 
possible serif fonts visually.
None of the fonts listed in Word is identical to the one used for 
capital sharp s in Tcl. (I created a simple app with Tkinter Labels 
showing the pairs of the characters in question using the potentially 
similar fonts; while some are really close, in all cases there are 
various differences in the glyphs.)
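
A comparison app of that kind might look roughly like the sketch below. It is not the original script: the font families listed are placeholders to be replaced with fonts installed on the system, and the names `PAIRS`, `CANDIDATE_FONTS`, `label_text` and `show_fonts` are invented for this example.

```python
# Character pairs to compare side by side: capital sharp s next to
# S with cedilla, and next to the small sharp s.
PAIRS = [("\u1e9e", "\u015e"), ("\u1e9e", "\u00df")]

# Placeholder families; substitute fonts that exist on your system.
CANDIDATE_FONTS = ["Times New Roman", "Georgia", "DejaVu Serif"]

def label_text(pairs):
    """Build the sample string shown in every label."""
    return "   ".join(a + " " + b for a, b in pairs)

def show_fonts(families=CANDIDATE_FONTS, size=24):
    """Open a window with one label per font family."""
    import tkinter as tk  # imported lazily so the helpers work without a display

    root = tk.Tk()
    for family in families:
        tk.Label(
            root,
            text=f"{family}: {label_text(PAIRS)}",
            font=(family, size),
        ).pack(anchor="w")
    root.mainloop()
```

Calling `show_fonts()` opens the window. If a requested family lacks a glyph, Tk may still substitute one from another font, so a label can show a glyph that does not belong to the family named in it.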

In any case, I guess this isn't a problem in Python that would have 
to be examined further; I have quite a lot of fonts installed, probably 
with some of them behaving in some "non-standard" ways.
msg83582 - (view) Author: Vlastimil Brom (vbr) Date: 2009-03-14 11:53
I just wanted to confirm that the bug is in neither Idle nor Tk, but 
somewhere in my installed fonts.
Now, while testing Python 3.1a1, when I also have a font containing ẞ 
LATIN CAPITAL LETTER SHARP S (DejaVu), it's clearer.
Printing this character using the default font in Idle, I get the wrong 
glyph mentioned in the report; however, this is corrected immediately 
after changing the font to DejaVu.
Some of the fonts on my system seem to "shadow" this newly added 
character with a wrong glyph (also preventing Tk from finding a font 
that really supports it).

Sorry for the needless bug report.
   vbr
History
Date User Action Args
2022-04-11 14:56:41 admin set github: 48531
2009-03-14 11:53:19 vbr set messages: + msg83582
2008-11-08 18:38:01 loewis set status: open -> closed; resolution: works for me; versions: + 3rd party, - Python 2.6, Python 3.0
2008-11-08 18:23:53 vbr set files: + capital-sharp-s-TCL-Idle.jpg; messages: + msg75641
2008-11-08 12:15:41 loewis set messages: + msg75632
2008-11-08 08:16:09 vbr set messages: + msg75629
2008-11-08 00:12:03 amaury.forgeotdarc set status: closed -> open; resolution: works for me -> (no value); messages: + msg75624
2008-11-07 23:04:44 loewis set nosy: + loewis; messages: + msg75621
2008-11-07 22:18:45 amaury.forgeotdarc set status: open -> closed; resolution: works for me; messages: + msg75619; nosy: + amaury.forgeotdarc
2008-11-07 20:31:59 vbr create