
Author frederic.grosshans
Recipients ezio.melotti, frederic.grosshans, gregory.p.smith, mark.dickinson, rhettinger, serhiy.storchaka, terry.reedy, vstinner, weightwatchers-carlanderson
Date 2021-04-24.13:09:34
Message-id <1619269774.39.0.444772408084.issue43520@roundup.psfhosted.org>
In-reply-to
Content
@Gregory P. Smith 

unicodedata.numeric, in the standard library, already handles non-ASCII fractions in many scripts. The current “problem” is that it returns a float, even for integers:

>>> import unicodedata
>>> unicodedata.numeric('⅔')
0.6666666666666666

However, the UnicodeData.txt file from the Unicode standard, from which unicodedata takes its data, contains the corresponding “ASCII fractions”. For example, below are two lines of this file for two (very) different ways of encoding two thirds:

2154;VULGAR FRACTION TWO THIRDS;No;0;ON;<fraction> 0032 2044 0033;;;2/3;N;FRACTION TWO THIRDS;;;;
1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;
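
In both lines the value 2/3 sits in the ninth semicolon-separated field (index 8). As a rough sketch of how that field could be read exactly, the snippet below parses the two lines quoted above with fractions.Fraction. The names parse_numeric, NUMERIC_FIELD and SAMPLE_LINES are made up for this illustration and are not part of unicodedata.

```python
from fractions import Fraction

NUMERIC_FIELD = 8  # zero-based index of the numeric value in a UnicodeData.txt line


def parse_numeric(line):
    """Return (character, exact value or None) for one UnicodeData.txt line.

    Hypothetical helper for illustration only.
    """
    fields = line.strip().split(';')
    char = chr(int(fields[0], 16))  # first field is the code point in hex
    raw = fields[NUMERIC_FIELD]     # e.g. '2/3', '5', or '' when not numeric
    return char, (Fraction(raw) if raw else None)


# The two lines quoted in this message.
SAMPLE_LINES = [
    "2154;VULGAR FRACTION TWO THIRDS;No;0;ON;<fraction> 0032 2044 0033;;;2/3;N;FRACTION TWO THIRDS;;;;",
    "1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;",
]

for line in SAMPLE_LINES:
    char, value = parse_numeric(line)
    print(f"U+{ord(char):04X}", value)
```

Note that Fraction normalizes its argument, so a field such as 2/12 would come back as 1/6.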

Adding exact-value extraction to unicodedata should be doable, either via a new function or via an extra keyword argument to the unicodedata.numeric function.
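
Until something like that exists, a workaround is possible with the current API. The sketch below is not the proposed implementation: it recovers a Fraction from the float that unicodedata.numeric returns, using Fraction.limit_denominator. The function name exact_numeric and the default bound of 1000 are assumptions made for this example; the bound has to be at least as large as the largest denominator among the characters of interest.

```python
import unicodedata
from fractions import Fraction


def exact_numeric(char, max_denominator=1000):
    """Approximate workaround: rebuild an exact value from the float result.

    Hypothetical helper; max_denominator=1000 is an assumed bound,
    not something taken from the Unicode data.
    """
    approx = unicodedata.numeric(char)  # raises ValueError for non-numeric characters
    # Pick the closest fraction whose denominator does not exceed the bound.
    return Fraction(approx).limit_denominator(max_denominator)


print(exact_numeric('⅔'))
print(exact_numeric('5'))
```

This only round-trips through the float, so it cannot be more faithful than the data unicodedata already exposes; reading the fraction directly from UnicodeData.txt avoids the guesswork.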

The only information that would be lost when the value is reduced to lowest terms (but which is unavailable now anyway) would be for the few code points which encode reducible fractions. As of Unicode 13.0, these code points are:

* ↉ U+2189 VULGAR FRACTION ZERO THIRDS
* 𐧷 U+109F7 MEROITIC CURSIVE FRACTION TWO TWELFTHS
* 𐧸 U+109F8 MEROITIC CURSIVE FRACTION THREE TWELFTHS
* 𐧹 U+109F9 MEROITIC CURSIVE FRACTION FOUR TWELFTHS
* 𐧻 U+109FB MEROITIC CURSIVE FRACTION SIX TWELFTHS
* 𐧽 U+109FD MEROITIC CURSIVE FRACTION EIGHT TWELFTHS
* 𐧾 U+109FE MEROITIC CURSIVE FRACTION NINE TWELFTHS
* 𐧿 U+109FF MEROITIC CURSIVE FRACTION TEN TWELFTHS
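
To make the loss concrete for the code points listed above: fractions.Fraction always reduces, so "2/12" and "1/6" become indistinguishable. A minimal sketch, assuming one wanted to keep the unreduced form, is to carry the numerator and denominator as a plain tuple. The helper split_ratio is hypothetical.

```python
from fractions import Fraction


def split_ratio(text):
    """Split 'n/d' (or a bare integer 'n') into an unreduced (n, d) tuple.

    Hypothetical helper for illustration only.
    """
    num, _, den = text.partition('/')
    return int(num), int(den or 1)  # bare integers get denominator 1


unreduced = split_ratio("2/12")  # as in MEROITIC CURSIVE FRACTION TWO TWELFTHS
reduced = Fraction(*unreduced)   # Fraction normalizes to lowest terms

print(unreduced)
print(reduced)
```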