Issue 5791: title information of unicodedata is wrong in some cases


This issue has been migrated to GitHub: https://github.com/python/cpython/issues/50041

classification

Title:	title information of unicodedata is wrong in some cases
Type:	behavior	Stage:
Components:	Interpreter Core	Versions:	Python 3.0, Python 2.4, Python 2.6, Python 2.5

process

Created on 2009-04-19 10:57 by Carl.Friedrich.Bolz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (2)
msg86163	Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz)	Date: 2009-04-19 10:57
There seems to be a problem with the title-case information for some Unicode characters:

$ python2.6
Python 2.6.2c1 (release26-maint, Apr 14 2009, 08:02:48)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> unichr(453)
u'\u01c5'
>>> unichr(453).title()
u'\u01c4'

But .title() should return the same character, according to this:
http://www.fileformat.info/info/unicode/char/01c5/index.htm
(I also checked the files that unicode.org provides).

I tried to follow the problem a bit; it seems to come from _PyUnicode_ToTitlecase in unicodetype.c. The Unicode record contains the offset from the character to its title-case version. If the character is its own title-case version, the offset is zero. But zero is also used when no title-case information is available, and in that case the offset to the upper-case version of the character is used instead. If the upper-case version is a different character (as in the example above), the result of .title() is wrong.
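
The ambiguity described above can be illustrated with a small Python model. This is only a sketch of the reasoning, not the code in unicodetype.c: the TypeRecord tuple, the RECORDS table, and both function names are hypothetical, and using None to mean "no title-case information" is just one possible way to remove the ambiguity, not necessarily the fix that was applied.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for the per-character type record.
# Each field is an offset that is added to the code point.
TypeRecord = namedtuple("TypeRecord", ["upper", "title"])

# Assumed record for U+01C5: its upper-case form is U+01C4 (offset -1)
# and it is its own title-case form (offset 0).
RECORDS = {0x01C5: TypeRecord(upper=-1, title=0)}


def to_title_ambiguous(cp):
    """Treat a zero title offset as 'no information' and fall back to upper."""
    rec = RECORDS[cp]
    delta = rec.title if rec.title else rec.upper
    return cp + delta


def to_title_explicit(cp):
    """Fall back to upper only when the title offset is really missing (None)."""
    rec = RECORDS[cp]
    delta = rec.upper if rec.title is None else rec.title
    return cp + delta


print(hex(to_title_ambiguous(0x01C5)))  # 0x1c4, the wrong result reported here
print(hex(to_title_explicit(0x01C5)))   # 0x1c5, the character itself
```

With the zero offset read as "missing", the model reproduces the reported u'\u01c4'; once "missing" has its own representation, the character maps to itself.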
msg86164	Author: Martin v. Löwis (loewis)	Date: 2009-04-19 11:05
This is a duplicate of issue4971.

History
Date	User	Action	Args
2022-04-11 14:56:47	admin	set	github: 50041
2009-04-19 11:05:08	loewis	set	status: open -> closed; nosy: + loewis; messages: + msg86164; superseder: Incorrect title case; resolution: duplicate
2009-04-19 10:57:47	Carl.Friedrich.Bolz	create