msg75613 - (view) |
Author: Vlastimil Brom (vbr) |
Date: 2008-11-07 20:31 |
While experimenting with the new unicodedata for version 5.1 (many
thanks for it!) I discovered some strange behaviour of Idle with regard
to a character not available in any font on my system, namely Latin
capital letter sharp s - U+1E9E.
Cf. the following sessions:
Python 3.0rc2 (r30rc2:67141, Nov 7 2008, 11:43:46) [MSC v.1500 32 bit
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
...
IDLE 3.0rc2
>>> print("\N{LATIN CAPITAL LETTER SHARP S}")
ẞ
>>> print("\N{LATIN CAPITAL LETTER S WITH CEDILLA}")
Ş
>>> print("\N{PHAGS-PA LETTER KA}")
ꡀ
>>> print("\ufff0")
>>> hex(ord("ẞ"))
'0x1e9e'
>>> hex(ord("Ş"))
'0x15e'
>>>
Of course, the exact view cannot be copied, but basically I see very
similar glyphs for the first two characters, while I had expected a
"square" sign or something similar for the first one; this is what I get
with another surely unavailable glyph as well as with a nonexistent
character. See the attached screenshot.
However, the characters remain clearly distinguished, as can be seen
e.g. after copying them as a parameter of ord(...).
Python 2.6 behaves the same way:
=======================
Python 2.6 (r26:66721, Oct 2 2008, 11:35:03) [MSC v.1500 32 bit
(Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
...
IDLE 2.6
>>> print u"\N{LATIN CAPITAL LETTER SHARP S}"
ẞ
>>>
...
==============================================
Not that it is very important, but I found it a bit surprising. I'm
using WinXPh SP3 Czech.
|
msg75619 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-11-07 22:18 |
IDLE indeed seems to work hard to find a font that can display the
character.
On my machine (WinXP sp3, French) there are many fonts, but only "Myriad
Web" can display \N{LATIN CAPITAL LETTER SHARP S}.
I was surprised that the "Character Map" application does not display
it. To find which font was used to display this character, I used
Microsoft Word: paste the character (from IDLE), select the text, open
the Font chooser dialog box, and scroll through the fonts until it
displays correctly in the "Preview" pane.
|
msg75621 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-11-07 23:04 |
IDLE, in itself, doesn't do anything special to render that character.
It passes it on to Tk.
I don't know how precisely Tk works in this case, but it might be that
it doesn't do anything special with the character *either*, but passes
it on to Windows.
FWIW, in this bug report, Firefox/Iceweasel, on Debian, renders the
character as a small sharp s. This rendition appears to be correct, as
the letter is supposed to look like the small letter, in principle.
I can't see why Amaury says it works for him (there is surely something
odd in the OP's system); I'm fairly confident that there is no bug in
Python here. In particular, the unicodedata module has nothing to do
with it.
|
msg75624 - (view) |
Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *  |
Date: 2008-11-08 00:12 |
My mistake, I did not know that "SHARP S" stands for the "Eszett" German
letter.
There is a font (Myriad Web on my system) that proposes a (wrong) glyph
for this code point. According to
http://www.tcl.tk/software/tcltk/whatsnew.tml#i18n
"""
Tk guarantees to find a way to display any Unicode character regardless
of the font you selected, as long as there is some font in the system
that contains the Unicode character
"""
but Tk does not guarantee that the font will display the correct
character...
|
msg75629 - (view) |
Author: Vlastimil Brom (vbr) |
Date: 2008-11-08 08:16 |
I'm aware that it isn't an issue of unicodedata; it was just the way I
came to try such a "modern" Unicode character.
I also see that Tk works pretty well in finding an appropriate font
(e.g. compared to wx, which I use more often). It took me quite some
time to find another clearly unavailable character, PHAGS-PA
LETTER ... :-), which however behaves as I'd expect: a square is
displayed.
Regarding capital eszett/sharp s missing in Windows' charmap, this isn't
anything special, as its character database seems to be quite ancient
(according to http://www.babelstone.co.uk/Software/BabelMap.html it
might be Unicode 2.0, instead of the current 5.1). Capital sharp s was
added in this latest version of the Unicode standard, 5.1. (The
mentioned tool also has a function to find a font containing a given
Unicode block.)
In the Unicode database, there is a "cross-reference" to the small sharp
s, but I'm not sure how this info should be interpreted; it's a
different item than the plain lower/capital pair.
However, is there something to look into to check, what might be
misconfigured on my system?
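The cross-reference mentioned above can be inspected from Python itself. This is a minimal sketch, assuming a Python 3 build whose unicodedata is at least at Unicode 5.1; it only shows how the standard string casing methods treat the two letters and says nothing about fonts or rendering.

```python
import unicodedata

CAPITAL_SHARP_S = "\u1e9e"
SMALL_SHARP_S = "\u00df"

# Character names as recorded in the Unicode database.
print(unicodedata.name(CAPITAL_SHARP_S))
print(unicodedata.name(SMALL_SHARP_S))

# The capital letter lowercases to the small one...
lowered = CAPITAL_SHARP_S.lower()
# ...but the small letter uppercases to "SS", not to U+1E9E,
# so the two do not form a plain lower/capital pair.
uppered = SMALL_SHARP_S.upper()
print(hex(ord(lowered)), uppered)
```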
|
msg75632 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2008-11-08 12:15 |
On my system, a square box is drawn indeed.
First, I would like to confirm that this is not a bug in Python. Can you
please install Tcl 8.5 separately, run wish, and execute
label .l -text "\u1e9e"
pack .l
IIUC, Tk will try to find a font that contains the character. First,
there is a list of fallback fonts per family. If none supports the
character, there is a global fallback list. If the character is still
not found, it will enumerate all fonts in the system and invoke
GetFontData, asking for the resource 0x636d6170 - this should give the
list of all characters supported in the font.
It would be useful to find out what specific font Tk has chosen.
Unfortunately, there seems to be no direct way to find out. In the
Tktest Tcl extension, there is a command "testfont subfonts <fontname>"
which you can use to find out what subfonts have been loaded; this might
give a clue what subfont was used.
If you are willing to recompile Tk, you can augment
tkWinFont.c:FindSubFontForChar to print a message when this specific
char gets looked up, and what the resulting subfont was.
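The lookup order described above can be modelled in a few lines. This is only a toy sketch of that description, not Tk's actual code: the function name, the font names and the coverage sets are all hypothetical, and a real font's coverage would come from its cmap table rather than a dictionary.

```python
# Toy model of the described lookup order; not Tk's implementation.
def find_font_for_char(char, family_fallbacks, global_fallbacks, all_fonts, coverage):
    """Return the first font that claims to cover char, or None."""
    seen = set()
    # Order: per-family fallbacks, then global fallbacks, then every font.
    for name in [*family_fallbacks, *global_fallbacks, *all_fonts]:
        if name in seen:
            continue
        seen.add(name)
        if char in coverage.get(name, set()):
            return name
    return None

# Hypothetical fonts: FontB claims U+1E9E (possibly with a wrong glyph),
# FontC also covers it but is enumerated later.
coverage = {
    "FontA": {"S", "\u015e"},
    "FontB": {"\u1e9e"},
    "FontC": {"\u1e9e"},
}
fonts = ["FontA", "FontB", "FontC"]

chosen = find_font_for_char("\u1e9e", ["FontA"], [], fonts, coverage)
missing = find_font_for_char("\ua840", ["FontA"], [], fonts, coverage)
print(chosen, missing)
```

In this model the first font that claims the code point wins, whether or not its glyph is correct, and a character that no font claims yields no font at all (the "square box" case).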
|
msg75641 - (view) |
Author: Vlastimil Brom (vbr) |
Date: 2008-11-08 18:23 |
I can confirm that Tcl displays the same character as IDLE, hence it
isn't a bug in Python (cf. the screenshot).
Unfortunately, I couldn't identify the font used here; I'm not able to
modify and recompile Tk, as suggested, but I tried to check the
possible serif fonts visually.
None of the fonts listed in Word is identical to the one used for
capital sharp s in Tcl. (I created a simple app with Tkinter Labels
showing the pairs of the characters in question using the potentially
similar fonts; while some are really close, in all cases there are
various differences in the glyphs.)
In any case, I guess this isn't a problem in Python that would have
to be examined further; I have quite a lot of fonts installed, probably
with some of them behaving in some "non-standard" ways.
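A comparison app of the kind mentioned above might look roughly like the following. This is a sketch under assumptions, not the script actually used: the helper names are hypothetical, and the font families passed in must be ones installed on the machine. Tkinter is imported inside the function so that the text helper can be used without a display.

```python
SHARP_S = "\u1e9e"      # LATIN CAPITAL LETTER SHARP S
S_CEDILLA = "\u015e"    # LATIN CAPITAL LETTER S WITH CEDILLA

def sample_text():
    """The pair of characters to compare side by side."""
    return f"{SHARP_S} {S_CEDILLA}"

def show_samples(families):
    """Open a window with one Label per font family (needs a display)."""
    import tkinter as tk
    from tkinter import font as tkfont

    root = tk.Tk()
    for family in families:
        label_font = tkfont.Font(root=root, family=family, size=24)
        tk.Label(root, text=f"{family}: {sample_text()}",
                 font=label_font).pack(anchor="w")
    root.mainloop()
```

Calling, for example, `show_samples(["Times New Roman", "Georgia"])` (family names chosen here purely as an illustration) shows each pair in the requested font, with Tk substituting another font for any character the requested one lacks.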
|
msg83582 - (view) |
Author: Vlastimil Brom (vbr) |
Date: 2009-03-14 11:53 |
I just wanted to confirm that there isn't a bug in IDLE or Tk, but
somewhere in my installed fonts.
Now, while testing Python 3.1a1, when I also have a font containing ẞ
LATIN CAPITAL LETTER SHARP S (DejaVu), it's clearer.
Printing this character using the default font in IDLE, I get the wrong
glyph mentioned in the report; however, this is corrected immediately
after changing the font to DejaVu.
Some of the fonts on my system seem to "shadow" this newly added
character with a wrong glyph (also preventing Tk from finding a font
that really supports it).
Sorry for the needless bug report.
vbr
|
Date | User | Action | Args
2022-04-11 14:56:41 | admin | set | github: 48531
2009-03-14 11:53:19 | vbr | set | messages: + msg83582
2008-11-08 18:38:01 | loewis | set | status: open -> closed; resolution: works for me; versions: + 3rd party, - Python 2.6, Python 3.0
2008-11-08 18:23:53 | vbr | set | files: + capital-sharp-s-TCL-Idle.jpg; messages: + msg75641
2008-11-08 12:15:41 | loewis | set | messages: + msg75632
2008-11-08 08:16:09 | vbr | set | messages: + msg75629
2008-11-08 00:12:03 | amaury.forgeotdarc | set | status: closed -> open; resolution: works for me -> (no value); messages: + msg75624
2008-11-07 23:04:44 | loewis | set | nosy: + loewis; messages: + msg75621
2008-11-07 22:18:45 | amaury.forgeotdarc | set | status: open -> closed; resolution: works for me; messages: + msg75619; nosy: + amaury.forgeotdarc
2008-11-07 20:31:59 | vbr | create |