
Author terry.reedy
Recipients belopolsky, terry.reedy
Date 2013-06-21.21:54:51
Message-id <1371851691.46.0.362394730024.issue18236@psf.upfronthosting.co.za>
Content
You stated facts: what is your proposal?

The fact that Unicode calls characters 'space' does not make them whitespace as commonly understood, or as defined by C, or even as defined by the Unicode database. Unicode apparently has a WSpace property. According to the table in
https://en.wikipedia.org/wiki/Whitespace_%28computer_science%29
1C - 1F are not included by that definition either. For ASCII characters, that table matches the C definition, with \r included.

So I think your implied proposal to treat them as whitespace (in strings but not bytes) should be rejected as invalid. For 3.x, the manual should specify that it follows the C definition of 'whitespace' (\r included) for bytes and the extended Unicode definition for strings.

>>> int('3\r')
3
>>> int('3\u00a0')
3
>>> int('3\u2000')
3
>>> int(b'3\r')
3
>>> int(b'3\u00a0')
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    int(b'3\u00a0')
ValueError: invalid literal for int() with base 10: '3\\u00a0'
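
The characters discussed above can be inspected directly. The following is a minimal sketch, not part of the original message: `describe` is a hypothetical helper that reports how `str.isspace()` and `bytes.isspace()` classify a code point, along with the bidirectional class and general category that the `unicodedata` module records for it. Results depend on the Unicode database shipped with the interpreter in use.

```python
import unicodedata

# Hypothetical helper for illustration only (not from the original message).
# It compares the str and bytes notions of whitespace for one code point.
def describe(cp):
    ch = chr(cp)
    return {
        "str_isspace": ch.isspace(),
        # bytes.isspace() only knows ASCII whitespace; a single byte cannot
        # represent code points above 0xFF, so report None for those.
        "bytes_isspace": bytes([cp]).isspace() if cp < 256 else None,
        "bidirectional": unicodedata.bidirectional(ch),
        "category": unicodedata.category(ch),
    }

# \r, the 1C-1F control characters, NO-BREAK SPACE, and EN QUAD
for cp in (0x0D, 0x1C, 0x1D, 0x1E, 0x1F, 0xA0, 0x2000):
    print(f"U+{cp:04X}", describe(cp))
```

For 1C - 1F the two `isspace` columns disagree, which is the str/bytes difference at issue here.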
History
Date                 User         Action  Args
2013-06-21 21:54:51  terry.reedy  set     recipients: + terry.reedy, belopolsky
2013-06-21 21:54:51  terry.reedy  set     messageid: <1371851691.46.0.362394730024.issue18236@psf.upfronthosting.co.za>
2013-06-21 21:54:51  terry.reedy  link    issue18236 messages
2013-06-21 21:54:51  terry.reedy  create