This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

Author lemburg
Recipients lemburg
Date 2010-11-29.15:15:36
Message-id <1291043740.62.0.294053403052.issue10575@psf.upfronthosting.co.za>
In-reply-to
Content
The code point is also not listed as a decimal digit (relevant for int() decimal parsing):

>>> import unicodedata
>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal

This is the relevant part of the script:

        for line in open(unihan):
            if not line.startswith('U+'):
                continue
            code, tag, value = line.split(None, 3)[:3]
            if tag not in ('kAccountingNumeric', 'kPrimaryNumeric',
                           'kOtherNumeric'):
                continue
            value = value.strip().replace(',', '')
            i = int(code[2:], 16)
            # Patch the numeric field
            if table[i] is not None:
                table[i][8] = value
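
To make the effect of the loop above concrete, here is a minimal, self-contained sketch of the same patching logic run against in-memory data. It is written for Python 3 and is not the actual script: the function name `patch_numeric`, the sample Unihan lines, and the 15-field placeholder record are assumptions made for illustration. The field positions 6, 7 and 8 follow the UnicodeData.txt layout (decimal, digit, numeric).

```python
import io

# Tags whose values are copied into the numeric field, as in the excerpt above.
NUMERIC_TAGS = ("kAccountingNumeric", "kPrimaryNumeric", "kOtherNumeric")

# Field positions in a UnicodeData.txt record.
DECIMAL, DIGIT, NUMERIC = 6, 7, 8


def patch_numeric(lines, table):
    """Copy Unihan numeric values into field 8 of existing records (hypothetical helper)."""
    for line in lines:
        if not line.startswith("U+"):
            continue  # skip comments and blank lines
        code, tag, value = line.split(None, 3)[:3]
        if tag not in NUMERIC_TAGS:
            continue
        value = value.strip().replace(",", "")
        i = int(code[2:], 16)
        # Only the numeric field is patched; decimal and digit stay untouched.
        if table.get(i) is not None:
            table[i][NUMERIC] = value


# Illustrative stand-in for a few lines of Unihan data (not the real file).
sample_unihan = io.StringIO(
    "# comment line, skipped\n"
    "U+4E09\tkDefinition\tthree\n"
    "U+4E09\tkPrimaryNumeric\t3\n"
)

# Placeholder record with empty fields, standing in for a parsed UnicodeData.txt row.
table = {0x4E09: [""] * 15}
patch_numeric(sample_unihan, table)

record = table[0x4E09]
print("decimal:", repr(record[DECIMAL]))
print("digit:  ", repr(record[DIGIT]))
print("numeric:", repr(record[NUMERIC]))
```

In this sketch only the numeric entry of the record is filled in, while the decimal and digit entries remain empty.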

The script does not set the decimal column for code points that have a kPrimaryNumeric value. Position table[i][8] refers to the numeric database entry, which correctly gives:

>>> unicodedata.numeric(unicode('三', 'utf-8'))
3.0
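
The two transcripts can be combined into one check that shows all three numeric properties of a character side by side. The following is a small sketch for Python 3 (the transcripts above use Python 2's `unicode()`); the helper name `numeric_properties` is an assumption, and the results depend on the Unicode database shipped with the interpreter. Passing `None` as the default avoids the ValueError seen above.

```python
import unicodedata


def numeric_properties(ch):
    """Return (decimal, digit, numeric) for ch, with None where a property is unset."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# Compare an ASCII digit with the CJK ideograph U+4E09 discussed in this message.
for ch in ("3", "\u4e09"):
    print("U+%04X" % ord(ch), numeric_properties(ch))
```

With a database that behaves as described in this message, the entry for U+4E09 would be expected to have a numeric value of 3.0 but no decimal value.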
History
Date                 User     Action  Args
2010-11-29 15:15:40  lemburg  set     recipients: + lemburg
2010-11-29 15:15:40  lemburg  set     messageid: <1291043740.62.0.294053403052.issue10575@psf.upfronthosting.co.za>
2010-11-29 15:15:36  lemburg  link    issue10575 messages
2010-11-29 15:15:36  lemburg  create