Author ezio.melotti
Recipients Rhamphoryncus, amaury.forgeotdarc, bupjae, ezio.melotti, lemburg, vstinner
Date 2010-07-08.08:16:36
Message-id <1278577000.33.0.00795425454255.issue5127@psf.upfronthosting.co.za>
In-reply-to
Content
[This should probably be discussed on python-dev or in another issue, so feel free to move the conversation there.]

The current implementation defines printable characters as follows: """All characters except those characters defined in the Unicode character database as following categories are considered printable.
  * Cc (Other, Control)
  * Cf (Other, Format)
  * Cs (Other, Surrogate)
  * Co (Other, Private Use)
  * Cn (Other, Not Assigned)
  * Zl Separator, Line ('\u2028', LINE SEPARATOR)
  * Zp Separator, Paragraph ('\u2029', PARAGRAPH SEPARATOR)
  * Zs (Separator, Space) other than ASCII space('\x20')."""
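
For illustration, the quoted rule can be approximated in pure Python with the unicodedata module. This is only a sketch of the rule as stated above, not the actual C implementation; the names NON_PRINTABLE_CATEGORIES and is_printable are made up for this example.

```python
import unicodedata

# General categories that the quoted rule treats as non-printable.
# (Hypothetical constant name, used only in this sketch.)
NON_PRINTABLE_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp", "Zs"}


def is_printable(ch):
    """Return True if the single character ch is printable under the quoted rule."""
    if ch == "\x20":
        # ASCII space is the one Zs character that stays printable.
        return True
    return unicodedata.category(ch) not in NON_PRINTABLE_CATEGORIES
```

For single characters this should agree with str.isprintable(), e.g. is_printable("\u2028") and "\u2028".isprintable() are both False.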

We could also arbitrarily exclude all the non-BMP chars, but that shouldn't be based on the availability of fonts IMHO.

> Note that Python3 will send printable code points as-is to the
> console, so whether or not a code point is considered printable
> should take the common availability of fonts being able to display
> the code point into account. Otherwise, a user would just see a
> square box instead of the much more useful escape sequence

If the concern is about the usefulness of repr() in the console, note that on the Windows terminal, trying to display most of these characters results in an error (see #5110), and that makes repr() barely usable there.
ascii() might be an alternative if the user wants to see the escape sequence instead of a square box.
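
A small sketch of the difference, using an arbitrary sample string chosen for this example (one Latin-1 character and one non-BMP character):

```python
# Sample string: U+00E9 (Latin-1) and U+1D11E (non-BMP, category So).
s = "caf\xe9 \U0001d11e"

# repr() keeps printable code points as-is, so the result contains non-ASCII text.
as_repr = repr(s)

# ascii() escapes every non-ASCII code point, so the result is pure ASCII.
as_ascii = ascii(s)

# Only the ASCII form is printed here, since it is safe on any console encoding.
print(as_ascii)
```

The ascii() result can be written to any terminal, whatever its encoding or fonts, at the cost of readability for non-ASCII text.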