Message 191729 - Python tracker



Author	belopolsky
Recipients	belopolsky, benjamin.peterson, ezio.melotti, lemburg, loewis, serhiy.storchaka
Date	2013-06-23.20:43:45
Message-id	<1372020225.63.0.919311160342.issue18234@psf.upfronthosting.co.za>

Content

unicodedata.name() was discussed in #12353 (msg144739), where MvL argued that misspelled names are better than corrected ones because they are more likely to appear misspelled in other sources.  I am not sure I buy this argument.  Someone googling for 'BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS' will probably just enter BYZANTINE VASIS and find what he or she needs.  A more likely scenario is someone trying to get all FTHORA symbols using naive code like this: [hex(i) for i in range(1114112) if 'FTHORA' in ud.name(chr(i), '')].  That search silently skips any symbol whose formal name is spelled FHTORA.
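A minimal sketch of that naive search, assuming Python 3.3 or later and `ud` as an alias for the unicodedata module. The helper name `code_points_named` is made up for illustration; it only wraps the list comprehension quoted above so the two spellings can be compared.

```python
import sys
import unicodedata as ud


def code_points_named(fragment):
    """Return every code point whose formal Unicode name contains `fragment`.

    Hypothetical helper: it is the naive scan from the message, using
    ud.name(ch, '') so that unnamed code points yield '' instead of raising.
    """
    return [cp for cp in range(sys.maxunicode + 1)
            if fragment in ud.name(chr(cp), '')]


fthora = code_points_named('FTHORA')   # the spelling a user would search for
fhtora = code_points_named('FHTORA')   # the misspelling kept in the formal name

# U+1D0C5 is found only under the misspelled fragment.
print(0x1D0C5 in fthora, 0x1D0C5 in fhtora)
```

The scan visits all 1114112 code points, so it takes a noticeable fraction of a second; that is acceptable for an interactive check but not something to run in a hot path.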

An even more likely scenario is someone seeing a fancy symbol on the web and wanting to use it in a Python program.  Such a programmer would copy the symbol to the Python prompt, call unicodedata.name() and copy the result into the program.  Do we want to encourage people to perpetuate the mistake that Unicode has corrected?
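The copy-and-paste workflow can be sketched as follows, assuming Python 3.3 or later, where unicodedata.lookup() also accepts name aliases. The variable names are illustrative only.

```python
import unicodedata as ud

symbol = '\U0001D0C5'  # stands in for a character pasted from a web page

# name() reports the formal name, which still carries the FHTORA misspelling.
formal = ud.name(symbol)
print(formal)  # BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS

# The corrected spelling is registered as an alias, so lookup() resolves
# both spellings to the same character.
corrected = 'BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS'
print(ud.lookup(formal) == ud.lookup(corrected) == symbol)
```

So the corrected spelling already works as input; it is only the output of name() that keeps propagating the misspelled form.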

I don't think the issue of control code names was discussed in #12353.  I see no downside to returning the first alias when no name is present.
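A rough sketch of the proposed fallback, not the actual unicodedata implementation. The module exposes no alias table, so `FIRST_ALIAS` below is a hypothetical, hand-copied subset of the first entries in Unicode's NameAliases.txt, and `name_or_alias` is an invented helper name.

```python
import unicodedata as ud

# Assumption: a tiny hand-copied subset of NameAliases.txt, holding the first
# alias listed for a few control codes. A real implementation would need the
# full table.
FIRST_ALIAS = {
    0x0000: 'NULL',
    0x0009: 'CHARACTER TABULATION',
    0x000A: 'LINE FEED',
}


def name_or_alias(ch, default=None):
    """Like ud.name(), but fall back to the first alias when no name exists."""
    try:
        return ud.name(ch)
    except ValueError:
        alias = FIRST_ALIAS.get(ord(ch))
        if alias is not None:
            return alias
        if default is not None:
            return default
        raise


print(name_or_alias('A'))    # LATIN CAPITAL LETTER A
print(name_or_alias('\n'))   # LINE FEED
```

Characters that already have a formal name are unaffected; only code points for which ud.name() currently raises ValueError would gain a result.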

History
Date	User	Action	Args
2013-06-23 20:43:45	belopolsky	set	recipients: + belopolsky, lemburg, loewis, benjamin.peterson, ezio.melotti, serhiy.storchaka
2013-06-23 20:43:45	belopolsky	set	messageid: <1372020225.63.0.919311160342.issue18234@psf.upfronthosting.co.za>
2013-06-23 20:43:45	belopolsky	link	issue18234 messages
2013-06-23 20:43:45	belopolsky	create