Author shreevatsa
Recipients docs@python, ezio.melotti, shreevatsa, vstinner
Date 2015-09-30.05:19:44
Message-id <1443590385.97.0.810723896221.issue25275@psf.upfronthosting.co.za>
In-reply-to
Content
Summary: This is about int(u'१२३४') == 1234.

At https://docs.python.org/2/library/functions.html and https://docs.python.org/3/library/functions.html, the documentation for

     class int(x=0)
     class int(x, base=10)

says (respectively):

> If x is not a number or if base is given, then x must be a string or Unicode object representing an integer literal in radix base.

> If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in radix base.

If you follow the definition of "integer literal" into the reference (https://docs.python.org/2/reference/lexical_analysis.html#integers and https://docs.python.org/3/reference/lexical_analysis.html#integers respectively), the definitions ultimately involve

     nonzerodigit   ::=  "1"..."9"
     octdigit       ::=  "0"..."7"
     bindigit       ::=  "0" | "1"
     digit          ::=  "0"..."9"

So whether the behaviour of int() conforms to its documentation hinges on what "representing" means. Apparently it is used in some sense under which u'१२३४' represents the integer literal 1234, even though the grammar above admits only the ASCII digits "0"..."9". It would be great to either clarify the documentation of int() or change its behaviour.
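To make the discrepancy concrete, here is a minimal sketch, assuming CPython 3 (the helper name is_valid_literal is made up for this illustration and is not part of any library). It compares what int() accepts with what the compiler accepts as an integer literal in source code:

```python
import unicodedata

s = u'१२३४'  # DEVANAGARI DIGIT ONE .. DEVANAGARI DIGIT FOUR

# int() accepts these characters and converts them by their decimal values.
value = int(s)

# Each character has Unicode general category "Nd" (decimal digit)
# and a well-defined decimal value.
categories = [unicodedata.category(ch) for ch in s]
decimals = [unicodedata.decimal(ch) for ch in s]


def is_valid_literal(text):
    """Hypothetical helper: True if `text` compiles as a Python expression."""
    try:
        compile(text, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


print(value)                      # 1234
print(categories)                 # ['Nd', 'Nd', 'Nd', 'Nd']
print(decimals)                   # [1, 2, 3, 4]
print(is_valid_literal("1234"))   # True
print(is_valid_literal(s))        # False
```

So the same string is rejected by the lexical grammar for integer literals but accepted by int(), which is the gap between the documentation and the behaviour described above.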
History
Date User Action Args
2015-09-30 05:19:46 shreevatsa set recipients: + shreevatsa, vstinner, ezio.melotti, docs@python
2015-09-30 05:19:45 shreevatsa set messageid: <1443590385.97.0.810723896221.issue25275@psf.upfronthosting.co.za>
2015-09-30 05:19:45 shreevatsa link issue25275 messages
2015-09-30 05:19:44 shreevatsa create