Correct the float(), int() and complex() documentation #54819
Comments
The Python3 documentation for these numeric constructors is wrong. Python has supported Unicode numerals specified as code points from the Unicode category "Nd" (decimal digit) since Python 1.6.0 when Unicode was first introduced in Python.
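A minimal sketch of the behaviour described above, assuming a current CPython 3 interpreter. The variable names are illustrative only; the digits are written as escapes so the snippet survives any encoding.

```python
# Arabic-Indic digits U+0661..U+0663 are in Unicode category "Nd".
arabic_123 = "\u0661\u0662\u0663"

# int() maps each Nd code point to its decimal digit value.
as_int = int(arabic_123)

# float() and complex() accept the same digits; the decimal point,
# the sign and the "j" suffix are still plain ASCII here.
as_float = float(arabic_123 + ".\u0665")
as_complex = complex("\u0661+\u0662j")

print(as_int, as_float, as_complex)
```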
The Python3 documentation adds a reference to the language spec which is not really ideal, since the language spec has different requirements than a number object constructor which has to deal with data input rather than program text:
The Python2 documentation does not have such an implication:
The Python3 documentation needs to be extended to either mention that all Unicode code points from the Unicode category "Nd" (decimal digit) are accepted as digits and used with their corresponding decimal digit value, or include a copy of the referenced language spec section with this definition of ''digit'':

digit ::= "0"..."9" and any Unicode code point with property "Nd"

Here's a complete list of the code point ranges that have this property: http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt (scroll to the end of the file)

It would also be worthwhile to add a note to the Python2 documentation.
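As a rough cross-check of the proposed definition, the "Nd" code points can also be enumerated from within Python using the `unicodedata` module. This is only a sketch: the number of code points found depends on the Unicode version bundled with the interpreter, so it will not match the 5.2.0 file linked above exactly. The names `nd_chars` and `mismatches` are illustrative.

```python
import sys
import unicodedata

# Collect every code point whose general category is "Nd".
nd_chars = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Nd"
]

# For each of them, int() should agree with the decimal value
# recorded in the Unicode database.
mismatches = [ch for ch in nd_chars if int(ch) != unicodedata.decimal(ch)]

print(len(nd_chars), "Nd code points;", len(mismatches), "mismatches")
```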
The reference to the language spec was really just a way to avoid spelling out all the details (again) about the precise form of a floating-point string; apart from the accepted set of digits, the forms are exactly the same (optional sign, numeric part, optional exponent, ...); spelling it all out twice gets a bit tiresome. Would it be acceptable to add a note to the current documentation describing the alternative digits that are accepted?
Marc, I don't want to further sprawl the python-dev thread, but it would be great if you could help with bpo-10587 as well. That is a documentation-only issue, but there is some disagreement about how specific the docs should be. Some of the relevant functions are documented in the header files, but some such as str.splitlines() are not. I am posting it here because the level of detail that we want to document is probably similar in the two issues. For example, we don't want to document things like int(3, -909) producing 3 in 2.6. On the other hand, the fact that Arabic numerals are accepted by int() but Chinese are not, should probably be included.
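The Arabic versus Chinese distinction can be illustrated with a short sketch. The helper `try_int` is hypothetical and exists only to turn the `ValueError` into `None` for display.

```python
import unicodedata

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, category "Nd"
cjk_three = "\u4e09"     # CJK ideograph meaning "three", category "Lo"


def try_int(text):
    """Return int(text), or None if int() rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None


for ch in (arabic_three, cjk_three):
    # The CJK character has a numeric value in the Unicode database,
    # but it is not a decimal digit, so int() does not accept it.
    print(unicodedata.category(ch), unicodedata.numeric(ch), try_int(ch))
```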
Should we also review the documentation for fractions and decimals? For example, fractions are documented as accepting "strings of decimal digits", but given that str.isdigit() and str.isdecimal() are presumably not identical, that wording raises the question of whether accepted strings must satisfy isdigit(), isdecimal(), or both.
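A small sketch of how the string predicates differ, and which of them lines up with what int() accepts. The helpers `classify` and `int_accepts` are hypothetical names introduced for this example.

```python
def classify(ch):
    """Return (isdecimal, isdigit, isnumeric) for a single character."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric())


def int_accepts(ch):
    """Return True if int() parses the one-character string."""
    try:
        int(ch)
    except ValueError:
        return False
    return True


samples = {
    "ASCII five": "5",
    "Arabic-Indic five": "\u0665",
    "superscript two": "\u00b2",
    "vulgar fraction one half": "\u00bd",
}

for name, ch in samples.items():
    print(name, classify(ch), int_accepts(ch))
```

In this sketch only the characters for which isdecimal() is true are accepted by int(); the superscript passes isdigit() but is still rejected.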
Try not to sprawl this all over the docs. Find the most common root and document it there. No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to 99.9% of users.
On Fri, Dec 3, 2010 at 12:10 AM, Raymond Hettinger
Decimal already has a big BNF display with digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

And a note that, by the way, "Other Unicode decimal digits are also permitted": http://docs.python.org/dev/library/decimal.html#decimal-objects

The builtin int() doc takes you on a link chase that ends at the language
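The note in the Decimal docs can be exercised with a short sketch. The Devanagari and fullwidth digits below are simply convenient examples of non-ASCII "Nd" code points; the variable names are illustrative.

```python
from decimal import Decimal

# "12.5" spelled with Devanagari digits (U+0967, U+0968, U+096B)
# and an ASCII decimal point.
devanagari_value = Decimal("\u0967\u0968.\u096b")

# "12" spelled with fullwidth digits (U+FF11, U+FF12).
fullwidth_value = Decimal("\uff11\uff12")

print(devanagari_value, fullwidth_value)
```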
Let me know when you have a proposed doc patch. Ideally, the details should just be in one place and we can refer to it elsewhere. We don't want to add extra info to every function or method in Python that uses int(s) and gets extra unicode digits as an unintended artifact.
Alexander Belopolsky wrote:
The term "decimal digit" is defined in the Unicode standard as those code

The methods .isdecimal(), .isdigit() and .isnumeric() check the

See http://www.unicode.org/reports/tr44/#Numeric_Type for details

The docs for those methods need to be updated as well. Doing this

The best option is to refer to the code point properties

The resp. numeric values are available via the unicodedata module.
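The numeric values mentioned above can be read back through `unicodedata.decimal()`, `unicodedata.digit()` and `unicodedata.numeric()`. A minimal sketch follows; `numeric_fields` is a hypothetical helper, and `None` is passed as the default so that a missing property does not raise `ValueError`.

```python
import unicodedata


def numeric_fields(ch):
    """Return the (decimal, digit, numeric) values of ch, None if absent."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


for ch in ("\u0663", "\u00b2", "\u00bd", "x"):
    print(unicodedata.name(ch), numeric_fields(ch))
```

A character with a decimal value has all three fields set, a character such as the superscript two has only the digit and numeric values, and a fraction has only the numeric value.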
Raymond Hettinger wrote:
That's a good idea. It may be enough to just add a new unicode_decimal_digit ::= ... to the language spec (even if it is not used there) and then reference

Same for unicode_digit and unicode_numeric.
New changeset 6853b480e388 by Raymond Hettinger in branch '3.1':
New changeset a1e685ceb3bd by Raymond Hettinger in branch '3.2':
New changeset 997271aebd69 by Raymond Hettinger in branch 'default':