Title: unicode.isdecimal bug in online Python 2 documentation
Created on 2019-03-24 14:47 by pewscorner, last changed 2019-04-10 07:31 by zheng.

Author: PEW's Corner (pewscorner) Date: 2019-03-24 14:47
The online Python 2 documentation for unicode.isdecimal incorrectly states:

"Decimal characters include digit characters".

This is wrong (decimal characters are actually a subset of digit characters), and u'\u00b3' is an example of a character that is a digit but not a decimal.
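
A quick way to confirm the example is sketched below. It is written for Python 3, where the same methods live on `str`; the variable names are illustrative only and do not come from the documentation.

```python
# SUPERSCRIPT THREE (U+00B3) carries a digit value but no decimal value,
# while ASCII "3" carries both.
superscript_three = u"\u00b3"
ascii_three = u"3"

results = {
    "superscript_three": (superscript_three.isdecimal(), superscript_three.isdigit()),
    "ascii_three": (ascii_three.isdecimal(), ascii_three.isdigit()),
}

# Each tuple is (isdecimal, isdigit).
for name, (is_decimal, is_digit) in sorted(results.items()):
    print(name, "isdecimal:", is_decimal, "isdigit:", is_digit)
```

If the claim holds, the superscript character reports isdigit as true and isdecimal as false, while the ASCII digit reports true for both.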

Issue 26483 corrected the same bug in the Python 3 documentation, and a similar correction should be applied to the Python 2 documentation.

Author: zheng (zheng) Date: 2019-04-10 07:31
I propose we copy over the exact changes made to the Python 3 documentation.

I looked through the code mentioned in the other thread, namely `Objects/unicodeobject.c` and `Tools/unicode/`. The implementation is identical between Python 2 and Python 3; the only difference appears to be the Unicode version used.

    # decimal digit, integer digit
    decimal = 0
    if record[6]:
        flags |= DECIMAL_MASK
        decimal = int(record[6])
    digit = 0
    if record[7]:
        flags |= DIGIT_MASK
        digit = int(record[7])
    if record[8]:
        flags |= NUMERIC_MASK
        numeric.setdefault(record[8], []).append(char)
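
To see the three fields side by side, here is a rough sketch using the standard `unicodedata` module. It assumes that `record[6]`, `record[7]` and `record[8]` are the decimal, digit and numeric fields of UnicodeData.txt, which is what `unicodedata.decimal`, `unicodedata.digit` and `unicodedata.numeric` report. The helper name `number_properties` and the sample characters are made up for illustration.

```python
import unicodedata


def number_properties(ch):
    """Return (decimal, digit, numeric) for ch, with None where a field is empty."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# Hypothetical sample set: one character for each combination of filled fields.
samples = {
    "DIGIT FIVE": u"5",
    "SUPERSCRIPT THREE": u"\u00b3",
    "VULGAR FRACTION ONE HALF": u"\u00bd",
}

for name, ch in sorted(samples.items()):
    print(name, number_properties(ch))
```

A character with only the numeric field set (the fraction) is neither a decimal nor a digit, and one with the digit field but no decimal field (the superscript) is a digit only.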

As another form of validation, I enumerated all the digit and decimal characters and compared the results between versions. It looks like the general change is that a number of new Unicode characters were introduced in Python 3. The one exception is NEW TAI LUE THAM DIGIT ONE, which gets recategorized as a digit.

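Python 2, compiled with UCS4:
for u in map(unichr, range(0x110000)):
    if unicode.isdigit(u):
        print(repr(u))  # assumed body; the original post omits it

Python 3:
for u in map(chr, range(0x110000)):
    if str.isdigit(u):
        print(repr(u))  # assumed body; the original post omits it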
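
The subset relationship from the original report can also be checked directly. The following is a minimal Python 3 sketch, not the exact script used for the comparison above; the set names and the choice to print only the first few digit-only characters are assumptions made for brevity.

```python
import sys
import unicodedata

decimals = set()
digits = set()
for code_point in range(sys.maxunicode + 1):
    ch = chr(code_point)
    if ch.isdecimal():
        decimals.add(ch)
    if ch.isdigit():
        digits.add(ch)

# If decimal characters are a subset of digit characters, this prints True.
print("decimals are a subset of digits:", decimals <= digits)

# Characters that are digits but not decimals, e.g. SUPERSCRIPT THREE.
digit_only = sorted(digits - decimals)
print(len(decimals), "decimal characters;", len(digit_only), "digit-only characters")
for ch in digit_only[:5]:
    print("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
```

The counts depend on the Unicode version bundled with the interpreter, so they will differ between Python releases.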
