Message122809
The code point is also not listed as a decimal digit (relevant for the int() decimal parsing):
>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal
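For comparison, here is a minimal sketch of the same check written in Python 3 syntax (an assumption; the session above is Python 2). It passes a default to the lookups so that a missing property comes back as None rather than raising, and it also tries int() directly. The variable names are illustrative only.

```python
import unicodedata

ch = "\u4e09"  # the CJK ideograph for "three" used in the session above

# A default as second argument is returned instead of raising ValueError.
decimal_value = unicodedata.decimal(ch, None)
numeric_value = unicodedata.numeric(ch, None)
category = unicodedata.category(ch)

# int() only accepts characters that carry a decimal digit value.
try:
    parsed = int(ch)
except ValueError:
    parsed = None

print(decimal_value, numeric_value, category, parsed)
```

The character has a numeric value but no decimal value, which is why int() rejects it.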
This is the relevant part of the script:
for line in open(unihan):
    if not line.startswith('U+'):
        continue
    code, tag, value = line.split(None, 3)[:3]
    if tag not in ('kAccountingNumeric', 'kPrimaryNumeric',
                   'kOtherNumeric'):
        continue
    value = value.strip().replace(',', '')
    i = int(code[2:], 16)
    # Patch the numeric field
    if table[i] is not None:
        table[i][8] = value
The decimal column is never set for code points that have a kPrimaryNumeric value; the script only patches position table[i][8], the numeric database entry, which correctly gives:
>>> unicodedata.numeric(unicode('三', 'utf-8'))
3.0
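The patching step quoted above can be reproduced in isolation. The following is only a sketch in Python 3 syntax: the sample lines, the dict-based table, the record contents, and the helper name patch_numeric are hypothetical stand-ins for the real Unihan file and the script's own table. The record follows the UnicodeData.txt field order, where index 6 is the decimal value, index 7 the digit value, and index 8 the numeric value.

```python
# Hypothetical sample in the tab-separated Unihan layout (not read from a real file).
SAMPLE_UNIHAN = [
    "# comment lines are skipped",
    "U+4E09\tkMandarin\ts\u0101n",
    "U+4E09\tkPrimaryNumeric\t3",
]

NUMERIC_TAGS = ("kAccountingNumeric", "kPrimaryNumeric", "kOtherNumeric")


def patch_numeric(table, lines):
    """Copy Unihan numeric values into field 8 of matching records."""
    for line in lines:
        if not line.startswith("U+"):
            continue
        code, tag, value = line.split(None, 3)[:3]
        if tag not in NUMERIC_TAGS:
            continue
        value = value.strip().replace(",", "")
        i = int(code[2:], 16)
        # Only the numeric field is patched; fields 6 and 7 stay untouched.
        if table.get(i) is not None:
            table[i][8] = value
    return table


# Hypothetical 15-field record for U+4E09 with empty decimal/digit/numeric fields.
record = ["4E09", "<CJK Ideograph>", "Lo", "0", "L",
          "", "", "", "", "N", "", "", "", "", ""]
table = patch_numeric({0x4E09: record}, SAMPLE_UNIHAN)

print(repr(table[0x4E09][6]), repr(table[0x4E09][7]), repr(table[0x4E09][8]))
```

After the loop only field 8 holds a value, matching the behaviour described in the message: numeric() can answer, decimal() cannot.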
Date | User | Action | Args
2010-11-29 15:15:40 | lemburg | set | recipients: + lemburg
2010-11-29 15:15:40 | lemburg | set | messageid: <1291043740.62.0.294053403052.issue10575@psf.upfronthosting.co.za>
2010-11-29 15:15:36 | lemburg | link | issue10575 messages
2010-11-29 15:15:36 | lemburg | create |