classification
Title: Correct the float(), int() and complex() documentation
Type:
Stage:
Components: Documentation, Unicode
Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: belopolsky, docs@python, lemburg, mark.dickinson, python-dev, rhettinger, terry.reedy
Priority: normal
Keywords:

Created on 2010-12-02 22:31 by lemburg, last changed 2011-03-23 00:35 by rhettinger. This issue is now closed.

Messages (10)
msg123136 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-02 22:31
The Python3 documentation for these numeric constructors is wrong.

Python has supported Unicode numerals specified as code points from the Unicode category "Nd" (decimal digit) since Python 1.6.0 when Unicode was first introduced in Python.

    http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf
    (see Section 4.5: General Category)

The Python3 documentation adds a reference to the language spec which is not really ideal, since the language spec has different requirements than a number object constructor which has to deal with data input rather than program text:

    http://docs.python.org/dev/py3k/library/functions.html#float

The Python2 documentation does not have such an implication:

    http://docs.python.org/library/functions.html#float

The Python3 documentation needs to be extended to either mention that all Unicode code points from the Unicode category "Nd"  (decimal digit) are accepted as digits and used with their corresponding decimal digit value, or include a copy of the referenced language spec section with this definition of ''digit'':

digit ::=  "0"..."9" and any Unicode code point with property "Nd"

Here's a complete list of the code point ranges that have this property:

   http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt

(scroll to the end of the file)

It would also be worthwhile to add a note to the Python2 documentation.
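
For illustration, a minimal sketch of the behaviour described above, assuming a Python 3 interpreter. The variable names are made up for this example; the digits come from three scripts whose code points are all in category "Nd".

```python
# Digits from three scripts, all in Unicode category "Nd".
arabic_indic = "\u0661\u0662\u0663"   # ARABIC-INDIC DIGIT ONE, TWO, THREE
devanagari = "\u0967\u0968\u0969"     # DEVANAGARI DIGIT ONE, TWO, THREE
fullwidth = "\uff11\uff12\uff13"      # FULLWIDTH DIGIT ONE, TWO, THREE

# Each constructor maps every digit to its decimal value before parsing.
as_ints = [int(s) for s in (arabic_indic, devanagari, fullwidth)]

# The decimal point and the imaginary suffix are still the ASCII characters.
as_float = float(arabic_indic + "." + fullwidth)
as_complex = complex(devanagari + "j")

print(as_ints, as_float, as_complex)
```

All three integer conversions should yield 123. Mixing scripts within one string, as in the float() call, is accepted as well.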
msg123137 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-12-02 22:34
The reference to the language spec was really just a way to avoid spelling out all the details (again) about the precise form of a floating-point string;  apart from the accepted set of digits, the forms are exactly the same (optional sign, numeric part, optional exponent, ...);  spelling it all out twice gets a bit tiresome.

Would it be acceptable to add a note to the current documentation describing the alternative digits that are accepted?
msg123143 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-02 23:23
Marc,

I don't want to further sprawl the python-dev thread, but it would be great if you could help with issue10587 as well.  That is a documentation-only issue, but there is some disagreement about how specific the docs should be.  Some of the relevant functions are documented in the header files, but some such as str.splitlines() are not.   I am posting it here because the level of detail that we want to document is probably similar in the two issues.  For example, we don't want to document things like int(3, -909) producing 3 in 2.6.  On the other hand, the fact that Arabic numerals are accepted by int() but Chinese are not, should probably be included.
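
A short sketch of the last point, assuming Python 3. The helper try_int() is hypothetical and exists only for this example. The Arabic-Indic digit is in category "Nd", while the Chinese numeral is an ordinary letter ("Lo") that merely carries a numeric value, so int() rejects it.

```python
import unicodedata


def try_int(text):
    """Return int(text), or None if int() rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None


arabic_indic_three = "\u0663"   # ARABIC-INDIC DIGIT THREE
chinese_three = "\u4e09"        # CJK ideograph meaning "three"

# General category of each character, and what int() makes of it.
categories = {ch: unicodedata.category(ch)
              for ch in (arabic_indic_three, chinese_three)}
results = {ch: try_int(ch) for ch in (arabic_indic_three, chinese_three)}

print(categories)
print(results)
```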
msg123157 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-03 01:58
Should we also review the documentation for fractions and decimals?  For example, fractions are documented as accepting "strings of decimal digits", but given that we have presumably non-identical str.isdigit() and str.isdecimal() methods, the above definition begs a question whether accepted strings should be digits, decimals or both?
msg123182 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-12-03 05:10
Try not to sprawl this all over the docs.  Find the most common root and document it there.  No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to 99.9% of users.
msg123183 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-12-03 05:19
On Fri, Dec 3, 2010 at 12:10 AM, Raymond Hettinger
<report@bugs.python.org> wrote:
..
> Try not to sprawl this all over the docs.  Find the most common root and document it there.
> No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to
> 99.9% of users.

Decimal already has a big BNF display with

digit          ::=  '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

And a note that, btw, "Other Unicode decimal digits are also permitted
where digit appears above. These include decimal digits from various
other alphabets (for example, Arabic-Indic and Devanāgarī digits)
along with the fullwidth digits '\uff10' through '\uff19'."

http://docs.python.org/dev/library/decimal.html#decimal-objects

The builtin int() doc takes you on a link chase that ends at the language
reference int literal BNF.   Bringing these all to a common root was
exactly the reason I brought up these related modules.
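
The Decimal note quoted above can be checked with a few lines. This is only a sketch, assuming Python 3 and the standard decimal module; the variable names are illustrative.

```python
from decimal import Decimal

# Fullwidth digits '\uff10'..'\uff19' are permitted wherever the BNF says "digit".
fullwidth_value = Decimal("\uff11\uff12.\uff15")   # fullwidth "12.5"

# Arabic-Indic digits are accepted the same way.
arabic_indic_value = Decimal("\u0661\u0662")       # Arabic-Indic "12"

print(fullwidth_value, arabic_indic_value)
```

Both values compare equal to the Decimal built from the corresponding ASCII string.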
msg123196 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2010-12-03 07:39
Let me know when you have a proposed doc patch.  Ideally, the details should just be in one place and we can refer to it elsewhere.   We don't want to add extra info to every function or method in Python that uses int(s) and gets extra unicode digits as an unintended artifact.
msg123209 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-03 09:14
Alexander Belopolsky wrote:
> 
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
> 
> Should we also review the documentation for fractions and decimals?  For example, fractions are documented as accepting "strings of decimal digits", but given that we have presumably non-identical str.isdigit() and str.isdecimal() methods, the above definition begs a question whether accepted strings should be digits, decimals or both?

The term "decimal digit" is defined in the Unicode standard as those code
points having the category "Nd". See
http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf

The methods .isdecimal(), .isdigit() and .isnumeric() check the
availability of the resp. field entries 6, 7 and 8 in the UCD.

See http://www.unicode.org/reports/tr44/#Numeric_Type for details
and http://www.unicode.org/Public/6.0.0/ucd/extracted/DerivedNumericType.txt
for the full list of code points with these fields set.

The docs for those methods need to be updated as well. Doing this
for .isdigit() and .isnumeric() is a bit difficult, though, since
the code points don't fall into just a single category.

The best option is to refer to the code point properties
Numeric_Type=Decimal for .isdecimal(), Numeric_Type=Digit
for .isdigit() and Numeric_Type=Numeric for .isnumeric().

The resp. numeric values are available via the unicodedata module.
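
As a rough illustration, assuming Python 3: the sketch below picks one sample character for each Numeric_Type and records what the three str methods and the matching unicodedata lookups report. The names samples, describe and table are made up for this example.

```python
import unicodedata

samples = {
    "arabic_indic_three": "\u0663",   # Numeric_Type=Decimal (category Nd)
    "superscript_two": "\u00b2",      # Numeric_Type=Digit
    "one_half": "\u00bd",             # Numeric_Type=Numeric
}


def describe(ch):
    """Collect the str predicates and unicodedata values for one character."""
    return {
        "category": unicodedata.category(ch),
        "isdecimal": ch.isdecimal(),
        "isdigit": ch.isdigit(),
        "isnumeric": ch.isnumeric(),
        # The second argument is returned when the field is not set.
        "decimal": unicodedata.decimal(ch, None),
        "digit": unicodedata.digit(ch, None),
        "numeric": unicodedata.numeric(ch, None),
    }


table = {name: describe(ch) for name, ch in samples.items()}

for name, row in table.items():
    print(name, row)
```

The three properties nest: every decimal character is also a digit and numeric, but not the other way round, which is why only the first sample passes all three predicates.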
msg123210 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2010-12-03 09:18
Raymond Hettinger wrote:
> 
> Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:
> 
> Try not to sprawl this all over the docs.  Find the most common root and document it there.  No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to 99.9% of users.

That's a good idea. It may be enough to just add a new

unicode_decimal_digit ::= ...

to the language spec (even if it is not used there) and then reference
it from the other parts of the docs.

Same for unicode_digit and unicode_numeric.
msg131826 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-03-23 00:34
New changeset 6853b480e388 by Raymond Hettinger in branch '3.1':
Issue #10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/6853b480e388

New changeset a1e685ceb3bd by Raymond Hettinger in branch '3.2':
Issue #10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/a1e685ceb3bd

New changeset 997271aebd69 by Raymond Hettinger in branch 'default':
Issue #10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/997271aebd69
History
Date                 User            Action  Args
2011-03-23 00:35:23  rhettinger      set     status: open -> closed
                                             nosy: lemburg, rhettinger, terry.reedy, mark.dickinson, belopolsky, docs@python, python-dev
                                             resolution: fixed
                                             versions: + Python 3.1, Python 2.7, Python 3.3
2011-03-23 00:34:48  python-dev      set     nosy: + python-dev
                                             messages: + msg131826
2010-12-03 09:18:19  lemburg         set     messages: + msg123210
2010-12-03 09:14:04  lemburg         set     messages: + msg123209
2010-12-03 07:39:29  rhettinger      set     messages: + msg123196
2010-12-03 07:32:32  rhettinger      set     messages: - msg123190
2010-12-03 05:41:30  rhettinger      set     assignee: docs@python -> rhettinger
                                             messages: + msg123190
2010-12-03 05:19:51  belopolsky      set     messages: + msg123183
2010-12-03 05:10:25  rhettinger      set     nosy: + rhettinger
                                             messages: + msg123182
2010-12-03 01:58:50  belopolsky      set     messages: + msg123157
2010-12-02 23:23:11  belopolsky      set     nosy: + belopolsky
                                             messages: + msg123143
2010-12-02 23:22:47  terry.reedy     set     nosy: + terry.reedy
2010-12-02 22:34:23  mark.dickinson  set     messages: + msg123137
2010-12-02 22:31:55  mark.dickinson  set     nosy: + mark.dickinson
2010-12-02 22:31:13  lemburg         create