
Author steven.daprano
Recipients StyXman, steven.daprano
Date 2019-02-24.10:05:25
Message-id <1551002725.83.0.385571425891.issue36100@roundup.psfhosted.org>
Content
I think that analysis is wrong. The Wikipedia page describes the meaning of the Unicode Decimal/Digit/Numeric properties:

https://en.wikipedia.org/wiki/Unicode_character_property#Numeric_values_and_types

and the characters you show aren't appropriate for converting to ints:

py> import unicodedata
py> for c in '一二三四五':
...     print(unicodedata.name(c))
...
CJK UNIFIED IDEOGRAPH-4E00
CJK UNIFIED IDEOGRAPH-4E8C
CJK UNIFIED IDEOGRAPH-4E09
CJK UNIFIED IDEOGRAPH-56DB
CJK UNIFIED IDEOGRAPH-4E94

The first one, for example, is translated as "one; a, an; alone"; it is better read as the *word* one rather than the numeral 1. (Disclaimer: I am not a Chinese speaker and I welcome correction from an expert.)

Likewise U+4E8C, translated as "two; twice".
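
To make the distinction concrete, here is a minimal sketch comparing the three str predicates on the first ideograph. The helper name `classify` is hypothetical (it is not part of any library), and the result assumes a CPython whose unicodedata tables carry the Unihan numeric values:

```python
import unicodedata


def classify(ch):
    """Report which numeric-related predicates hold for one character.

    Hypothetical helper, for illustration only.
    """
    return {
        "isdecimal": ch.isdecimal(),   # Numeric_Type=Decimal
        "isdigit": ch.isdigit(),       # Numeric_Type=Decimal or Digit
        "isnumeric": ch.isnumeric(),   # any Numeric_Type
        # numeric() returns the default (None here) if no value is defined
        "numeric": unicodedata.numeric(ch, None),
    }


info = classify("\u4e00")  # CJK UNIFIED IDEOGRAPH-4E00
print(info)
```

The ideograph has a numeric value but is neither a decimal digit nor a digit, which is consistent with int() declining to parse it.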

The blog post is factually wrong when it claims:

"str.isdigit only returns True for what I said before, strings containing solely the digits 0-9."

py> s = "\N{BENGALI DIGIT ONE}\N{BENGALI DIGIT TWO}"
py> s.isdigit()
True
py> int(s)
12
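
A rough way to see which predicate lines up with int() is to try a few sample strings. This is only a sketch: `int_accepts` is a hypothetical helper, and the samples are my own choice rather than anything from the blog post.

```python
def int_accepts(s):
    """Return True if int(s) succeeds (hypothetical helper for illustration)."""
    try:
        int(s)
    except ValueError:
        return False
    return True


samples = {
    "ascii": "12",
    "bengali": "\N{BENGALI DIGIT ONE}\N{BENGALI DIGIT TWO}",
    "superscript": "\N{SUPERSCRIPT TWO}",
    "cjk": "\u4e00",  # CJK UNIFIED IDEOGRAPH-4E00
}

# For each sample: (isdecimal, isdigit, isnumeric, int() succeeds)
results = {
    name: (s.isdecimal(), s.isdigit(), s.isnumeric(), int_accepts(s))
    for name, s in samples.items()
}

for name, row in results.items():
    print(name, row)
```

For these samples int() succeeds exactly where isdecimal() is true; isdigit() is already too broad (SUPERSCRIPT TWO) and isnumeric() broader still. Note that int() also accepts things isdecimal() rejects, such as a leading sign or surrounding whitespace, so the two are not interchangeable in general.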

So I think there's nothing to do here, except perhaps to add a FAQ entry about it or improve the docs.
History
Date User Action Args
2019-02-24 10:05:25 steven.daprano set recipients: + steven.daprano, StyXman
2019-02-24 10:05:25 steven.daprano set messageid: <1551002725.83.0.385571425891.issue36100@roundup.psfhosted.org>
2019-02-24 10:05:25 steven.daprano link issue36100 messages
2019-02-24 10:05:25 steven.daprano create