Author terry.reedy
Recipients akuchling, belopolsky, eric.araujo, georg.brandl, terry.reedy
Date 2010-11-18.19:41:53
Message-id <1290109315.27.0.672337666392.issue4153@psf.upfronthosting.co.za>
In-reply-to
Content
Thanks for persisting with this. Looking at the patch:

@@ -65,7 +63,7 @@
 goal was to have Unicode contain the alphabets for every single human language.
 It turns out that even 16 bits isn't enough to meet that goal, and the modern
 Unicode specification uses a wider range of codes, 0-1,114,111 (0x10ffff in
-base-16).
+base 16).

I visually parse 0-1,114,111 as 0-1, 114, 111. So I think either the commas should be removed or extra spaces are needed: 0-1114111 or 0 - 1,114,111. In your recent (and excellent) chr/ord doc patch, you used (or stayed with) 'hexadecimal' rather than 'base 16'. Do we have a standard? I *think* I prefer the former.
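
For comparison, here is a small sketch of the renderings under discussion. The variable names are illustrative only and do not come from the patch.

```python
# Upper bound of the Unicode code space, written as in the patched text.
MAX_CODE_POINT = 0x10FFFF

plain = str(MAX_CODE_POINT)             # digits only, no separators
grouped = format(MAX_CODE_POINT, ",")   # thousands separators
hexadecimal = hex(MAX_CODE_POINT)       # hexadecimal form

print(f"0-{plain}")        # 0-1114111
print(f"0 - {grouped}")    # 0 - 1,114,111
print(hexadecimal)         # 0x10ffff
```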

-character with value 0x12ca (4810 decimal).  The Unicode standard contains a lot
+character with value 0x12ca (4,810 decimal).  The Unicode standard contains a lot

I prefer it without the added comma.

     >>> b'\x80abc'.decode("utf-8", "replace")
-    '\ufffdabc'
+    'ï¿½abc'

Three replacement characters (i with diaeresis, upside-down ?, 1/2) for one bad byte looks wrong. With IDLE I get '�abc' (? in hexagon, code point 65533). Perhaps something just went wrong on the way from your file, via the patch, to my browser window.
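
One plausible explanation, offered only as a guess and not confirmed anywhere in this thread, is that the UTF-8 bytes of U+FFFD were displayed as Latin-1 somewhere along the way. A minimal sketch of that assumption:

```python
# The invalid UTF-8 byte 0x80 is replaced by U+FFFD REPLACEMENT CHARACTER.
decoded = b'\x80abc'.decode("utf-8", "replace")
replacement = decoded[0]

# Assumed failure mode: U+FFFD is written out as UTF-8 (three bytes),
# then those bytes are read back as Latin-1, one character per byte.
utf8_bytes = replacement.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

print(ascii(decoded))      # '\ufffdabc'
print(ord(replacement))    # 65533
print(utf8_bytes)          # b'\xef\xbf\xbd'
print(len(mojibake))       # 3
```

The three Latin-1 characters for bytes 0xEF, 0xBF and 0xBD are i with diaeresis, inverted question mark and one half, which matches what showed up in the browser.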

@@ -281,10 +279,10 @@
 built-in :func:`ord` function that takes a one-character Unicode string and
 returns the code point value::

You fixed the chr/ord doc; the references to it in this doc need to be fixed too.
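
As a quick reminder of the behavior those references describe, here is a short sketch using the code point quoted earlier in the patch (0x12ca). The names are illustrative.

```python
code_point = 0x12CA               # 4810 in decimal
char = chr(code_point)            # code point -> one-character string
round_trip = ord(char)            # one-character string -> code point

print(code_point, hex(code_point), ascii(char))   # 4810 0x12ca '\u12ca'
print(round_trip == code_point)                   # True
```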

-point.  The ``\U`` escape sequence is similar, but expects 8 hex digits, not 4::
+point.  The ``\U`` escape sequence is similar, but expects eight base 16
+digits, not four::

I really think of them as hex or hexadecimal digits, just as 0-9 are decimal, not base 10 digits.


 
     >>> s = "a\xac\u1234\u20ac\U00008000"
               ^^^^ two-digit hex escape
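
To see the three escape forms side by side, the string from the patch can be inspected as follows. This is only an illustrative sketch; `code_points` is a name introduced here.

```python
# \x takes two hexadecimal digits, \u takes four, \U takes eight.
s = "a\xac\u1234\u20ac\U00008000"

code_points = [ord(c) for c in s]

print(len(s))                            # 5
print([hex(cp) for cp in code_points])   # ['0x61', '0xac', '0x1234', '0x20ac', '0x8000']
print("\U00008000" == "\u8000")          # True
```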