Issue 36417: unicode.isdecimal bug in online Python 2 documentation

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/80598

classification

Title:	unicode.isdecimal bug in online Python 2 documentation
Type:	behavior	Stage:	resolved
Components:	Documentation	Versions:	Python 2.7

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	docs@python, pewscorner, zach.ware, zheng
Priority:	normal	Keywords:	easy, patch

Created on 2019-03-24 14:47 by pewscorner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked
PR 12757	closed	zheng, 2019-04-10 07:27

Messages (3)
msg338736	Author: PEW's Corner (pewscorner)	Date: 2019-03-24 14:47
The online Python 2 documentation for unicode.isdecimal (https://docs.python.org/2/library/stdtypes.html#unicode.isdecimal) incorrectly states: "Decimal characters include digit characters". This is wrong (decimal characters are actually a subset of digit characters), and u'\u00b3' is an example of a character that is a digit but not a decimal. Issue 26483 (https://bugs.python.org/issue26483) corrected the same bug in the Python 3 documentation, and a similar correction should be applied to the Python 2 documentation.
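The distinction reported above can be checked directly. The following is a minimal sketch written in Python 3 syntax so that it runs on a current interpreter; that choice is an assumption made here, since the issue concerns Python 2, where the same methods live on `unicode` and the literal would be written `u'\u00b3'`.

```python
import unicodedata

# U+00B3 is the character cited in the report.
ch = "\u00b3"

print(unicodedata.name(ch))  # SUPERSCRIPT THREE
print(ch.isdecimal())        # False: not a decimal character
print(ch.isdigit())          # True: but it is a digit character

# An ordinary ASCII digit is both a decimal and a digit.
print("3".isdecimal(), "3".isdigit())  # True True
```

A character that is a digit but not a decimal is enough to show that decimals cannot "include" digits, which is the wording the report objects to.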
msg339832	Author: zheng (zheng)	Date: 2019-04-10 07:31
I propose we copy over the exact changes made to the Python 3 documentation. I looked through the code mentioned in the other thread, namely `Objects/unicodeobject.c` and `Tools/unicode/makeunicodedata.py`. The implementation is identical between Python 2 and Python 3; the only difference appears to be the Unicode version used.

    # decimal digit, integer digit
    decimal = 0
    if record[6]:
        flags |= DECIMAL_MASK
        decimal = int(record[6])
    digit = 0
    if record[7]:
        flags |= DIGIT_MASK
        digit = int(record[7])
    if record[8]:
        flags |= NUMERIC_MASK
        numeric.setdefault(record[8], []).append(char)

As another form of validation, I enumerated all the digits and decimals and compared them between versions. The general change is that a number of new Unicode characters were introduced in Python 3. The exception is NEW TAI LUE THAM DIGIT ONE, which gets recategorized as a digit.

Python 2, compiled with UCS4:

    import unicodedata

    for u in map(unichr, list(range(0x10FFFF))):
        if unicode.isdigit(u):
            print(unicodedata.name(u))

Python 3:

    from unicodedata import name

    for u in map(chr, range(0x10FFFF)):
        if str.isdigit(u):
            print(name(u))
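The enumeration described above can also be used to test the subset relationship from the original report. This is a rough sketch for Python 3 only; the helper name `chars_where` is hypothetical and not part of the patch or the thread. It collects every code point accepted by `str.isdecimal` and by `str.isdigit`, then compares the two sets.

```python
import sys


def chars_where(predicate):
    """Return the set of all characters for which predicate(char) is true."""
    return {chr(cp) for cp in range(sys.maxunicode + 1) if predicate(chr(cp))}


decimals = chars_where(str.isdecimal)
digits = chars_where(str.isdigit)

# Every decimal character should also be a digit character.
print(decimals <= digits)

# Characters that are digits but not decimals, such as U+00B3.
digit_only = digits - decimals
print("\u00b3" in digit_only)
print(len(decimals), len(digits))
```

The exact counts depend on the Unicode database bundled with the interpreter, so they are printed rather than asserted.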
msg360267	Author: Zachary Ware (zach.ware)	Date: 2020-01-19 19:00
As Python 2.7 has reached EOL and the branch is closed to regular maintenance, I'm closing the issue. Thanks for the report and patch anyway!

History
Date	User	Action	Args
2022-04-11 14:59:12	admin	set	github: 80598
2020-01-19 19:00:52	zach.ware	set	status: open -> closed nosy: + zach.ware messages: + msg360267 resolution: out of date stage: patch review -> resolved
2019-04-10 07:31:29	zheng	set	nosy: + zheng messages: + msg339832
2019-04-10 07:27:12	zheng	set	keywords: + patch stage: needs patch -> patch review pull_requests: + pull_request12684
2019-03-24 15:15:10	serhiy.storchaka	set	keywords: + easy stage: needs patch
2019-03-24 14:47:13	pewscorner	create