Correct the float(), int() and complex() documentation #54819

Closed
malemburg opened this issue Dec 2, 2010 · 10 comments
Labels
docs Documentation in the Doc dir topic-unicode

Comments

@malemburg
Member

BPO 10610
Nosy @malemburg, @rhettinger, @terryjreedy, @mdickinson, @abalkin

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = 'https://github.com/rhettinger'
closed_at = <Date 2011-03-23.00:35:23.759>
created_at = <Date 2010-12-02.22:31:13.399>
labels = ['expert-unicode', 'docs']
title = 'Correct the float(), int() and complex() documentation'
updated_at = <Date 2011-03-23.00:35:23.757>
user = 'https://github.com/malemburg'

bugs.python.org fields:

activity = <Date 2011-03-23.00:35:23.757>
actor = 'rhettinger'
assignee = 'rhettinger'
closed = True
closed_date = <Date 2011-03-23.00:35:23.759>
closer = 'rhettinger'
components = ['Documentation', 'Unicode']
creation = <Date 2010-12-02.22:31:13.399>
creator = 'lemburg'
dependencies = []
files = []
hgrepos = []
issue_num = 10610
keywords = []
message_count = 10.0
messages = ['123136', '123137', '123143', '123157', '123182', '123183', '123196', '123209', '123210', '131826']
nosy_count = 7.0
nosy_names = ['lemburg', 'rhettinger', 'terry.reedy', 'mark.dickinson', 'belopolsky', 'docs@python', 'python-dev']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue10610'
versions = ['Python 3.1', 'Python 2.7', 'Python 3.2', 'Python 3.3']

@malemburg
Member Author

The Python3 documentation for these numeric constructors is wrong.

Python has supported Unicode numerals specified as code points from the Unicode category "Nd" (decimal digit) since Python 1.6.0 when Unicode was first introduced in Python.

http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf
(see Section 4.5: General Category)

The Python3 documentation adds a reference to the language spec which is not really ideal, since the language spec has different requirements than a number object constructor which has to deal with data input rather than program text:

http://docs.python.org/dev/py3k/library/functions.html#float

The Python2 documentation does not have such an implication:

http://docs.python.org/library/functions.html#float

The Python3 documentation needs to be extended to either mention that all Unicode code points from the Unicode category "Nd" (decimal digit) are accepted as digits and used with their corresponding decimal digit value, or include a copy of the referenced language spec section with this definition of ''digit'':

digit ::= "0"..."9" and any Unicode code point with property "Nd"

Here's a complete list of the code point ranges that have this property:

http://www.unicode.org/Public/5.2.0/ucd/extracted/DerivedNumericType.txt

(scroll to the end of the file)

It would also be worthwhile to add a note to the Python2 documentation.
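
The behaviour described above can be checked with a short sketch. The sample strings and variable names are illustrative assumptions, not taken from the documentation; the sketch only relies on the builtin constructors and `unicodedata.category()`.

```python
import unicodedata

# Sample strings built from non-ASCII code points (chosen for illustration).
arabic_indic = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGIT ONE, TWO, THREE
fullwidth = "\uff14\uff12"           # FULLWIDTH DIGIT FOUR, TWO

# Confirm that every character used above is in Unicode category "Nd".
all_nd = all(unicodedata.category(ch) == "Nd" for ch in arabic_indic + fullwidth)

# The constructors map each Nd code point to its decimal digit value.
as_int = int(arabic_indic)
as_float = float(arabic_indic + ".\u0665")  # ASCII point, ARABIC-INDIC DIGIT FIVE
as_complex = complex(fullwidth)

print(all_nd, as_int, as_float, as_complex)
```

Only the digits may come from other scripts in this sketch; the sign, decimal point and exponent marker remain the ASCII characters.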

@malemburg malemburg added docs Documentation in the Doc dir topic-unicode labels Dec 2, 2010
@mdickinson
Member

The reference to the language spec was really just a way to avoid spelling out all the details (again) about the precise form of a floating-point string; apart from the accepted set of digits, the forms are exactly the same (optional sign, numeric part, optional exponent, ...); spelling it all out twice gets a bit tiresome.

Would it be acceptable to add a note to the current documentation describing the alternative digits that are accepted?

@abalkin
Member

abalkin commented Dec 2, 2010

Marc,

I don't want to further sprawl the python-dev thread, but it would be great if you could help with bpo-10587 as well. That is a documentation-only issue, but there is some disagreement about how specific the docs should be. Some of the relevant functions are documented in the header files, but some such as str.splitlines() are not. I am posting it here because the level of detail that we want to document is probably similar in the two issues. For example, we don't want to document things like int(3, -909) producing 3 in 2.6. On the other hand, the fact that Arabic numerals are accepted by int() but Chinese are not, should probably be included.
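
A minimal sketch of the last point, assuming "Arabic numerals" refers to the Arabic-Indic digits and taking the CJK ideograph for three as the Chinese example. The helper `try_int` is hypothetical and exists only for this illustration.

```python
import unicodedata

def try_int(text):
    """Return int(text), or None if int() rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None

arabic_three = "\u0663"   # ARABIC-INDIC DIGIT THREE, category Nd
chinese_three = "\u4e09"  # CJK ideograph for three, category Lo

for ch in (arabic_three, chinese_three):
    # Print the character name, its general category, and what int() makes of it.
    print(unicodedata.name(ch), unicodedata.category(ch), try_int(ch))
```

The ideograph still carries a numeric value in the Unicode database, so `unicodedata.numeric()` can read it even though `int()` rejects it.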

@abalkin
Member

abalkin commented Dec 3, 2010

Should we also review the documentation for fractions and decimals? For example, fractions are documented as accepting "strings of decimal digits", but given that we have presumably non-identical str.isdigit() and str.isdecimal() methods, the above definition begs a question whether accepted strings should be digits, decimals or both?

@rhettinger
Contributor

Try not to sprawl this all over the docs. Find the most common root and document it there. No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to 99.9% of users.

@abalkin
Member

abalkin commented Dec 3, 2010

On Fri, Dec 3, 2010 at 12:10 AM, Raymond Hettinger
<report@bugs.python.org> wrote:
..

Try not to sprawl this all over the docs.  Find the most common root and document it there.
 No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to
99.9% of users.

Decimal already has a big BNF display with

digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

And a note that, btw, "Other Unicode decimal digits are also permitted
where digit appears above. These include decimal digits from various
other alphabets (for example, Arabic-Indic and Devanāgarī digits)
along with the fullwidth digits '\uff10' through '\uff19'."

http://docs.python.org/dev/library/decimal.html#decimal-objects

The builtin int() doc takes you on a link chase that ends at the language
reference int literal BNF. Bringing these all to a common root was
exactly the reason I brought up these related modules.
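
The Decimal note quoted above can be tried out with a small sketch. The sample strings are illustrative assumptions; only the `Decimal` constructor and builtin `int()` are used.

```python
from decimal import Decimal

devanagari = "\u0967\u0968"  # DEVANAGARI DIGIT ONE, TWO
fullwidth = "\uff11\uff10"   # FULLWIDTH DIGIT ONE, ZERO

# Decimal accepts the same non-ASCII decimal digits wherever `digit` appears.
d_devanagari = Decimal(devanagari)
d_fullwidth = Decimal(fullwidth)

# int() agrees with Decimal on these inputs.
i_devanagari = int(devanagari)

print(d_devanagari, d_fullwidth, i_devanagari)
```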

@rhettinger rhettinger assigned rhettinger and unassigned docspython Dec 3, 2010
@rhettinger
Contributor

Let me know when you have a proposed doc patch. Ideally, the details should just be in one place and we can refer to it elsewhere. We don't want to add extra info to every function or method in Python that uses int(s) and gets extra unicode digits as an unintended artifact.

@malemburg
Member Author

Alexander Belopolsky wrote:

Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:

Should we also review the documentation for fractions and decimals? For example, fractions are documented as accepting "strings of decimal digits", but given that we have presumably non-identical str.isdigit() and str.isdecimal() methods, the above definition begs a question whether accepted strings should be digits, decimals or both?

The term "decimal digit" is defined in the Unicode standard as those code
points having the category "Nd". See
http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf

The methods .isdecimal(), .isdigit() and .isnumeric() check the
availability of the respective field entries 6, 7 and 8 in the UCD.

See http://www.unicode.org/reports/tr44/#Numeric_Type for details
and http://www.unicode.org/Public/6.0.0/ucd/extracted/DerivedNumericType.txt
for the full list of code points with these fields set.

The docs for those methods need to be updated as well. Doing this
for .isdigit() and .isnumeric() is a bit difficult, though, since
the code points don't fall into just a single category.

The best option is to refer to the code point properties
Numeric_Type=Decimal for .isdecimal(), Numeric_Type=Digit
for .isdigit() and Numeric_Type=Numeric for .isnumeric().

The resp. numeric values are available via the unicodedata module.
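
As a rough sketch of how the three methods and the unicodedata lookups line up, the following uses one sample character per Numeric_Type. The sample characters and the helper names `classify` and `values` are assumptions made for this illustration.

```python
import unicodedata

five = "5"             # DIGIT FIVE, Numeric_Type=Decimal
superscript = "\u00b2" # SUPERSCRIPT TWO, Numeric_Type=Digit
half = "\u00bd"        # VULGAR FRACTION ONE HALF, Numeric_Type=Numeric

def classify(ch):
    """Return the results of (isdecimal, isdigit, isnumeric) for one character."""
    return (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

def values(ch):
    """Return the (decimal, digit, numeric) values, with None where a field is unset."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

for ch in (five, superscript, half):
    print(unicodedata.name(ch), classify(ch), values(ch))
```

In CPython the checks nest: a character that passes .isdecimal() also passes .isdigit(), and one that passes .isdigit() also passes .isnumeric().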

@malemburg
Member Author

Raymond Hettinger wrote:

Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:

Try not to sprawl this all over the docs. Find the most common root and document it there. No need to garbage-up Fractions, Decimal etc. with something that is of zero interest to 99.9% of users.

That's a good idea. It may be enough to just add a new

unicode_decimal_digit ::= ...

to the language spec (even if it is not used there) and then reference
it from the other parts of the docs.

Same for unicode_digit and unicode_numeric.

@python-dev
Mannequin

python-dev mannequin commented Mar 23, 2011

New changeset 6853b480e388 by Raymond Hettinger in branch '3.1':
Issue bpo-10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/6853b480e388

New changeset a1e685ceb3bd by Raymond Hettinger in branch '3.2':
Issue bpo-10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/a1e685ceb3bd

New changeset 997271aebd69 by Raymond Hettinger in branch 'default':
Issue bpo-10610: Document that int(), float(), and complex() accept numeric literals with the Nd property.
http://hg.python.org/cpython/rev/997271aebd69

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022