Author akira
Recipients BreamoreBoy, akira, akuchling, kevinpt
Date 2014-06-19.17:35:10
Message-id <1403199312.05.0.275887186833.issue9770@psf.upfronthosting.co.za>
Content
I've fixed isblank to accept tab instead of backspace and added tests
for the character classification functions in the curses.ascii module
that have analogs in ctype.h. The tests uncovered issues in the
isblank, iscntrl, and ispunct functions.
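
For reference, here is a minimal sketch of what the C11 definitions (in
the "C" locale) imply for these predicates. The ref_* names and the
_to_ord helper are hypothetical, written only for illustration. They
are not the code from the patch and not part of curses.ascii.

```python
def _to_ord(c):
    # Hypothetical helper: accept an int code or a one-character str.
    return ord(c) if isinstance(c, str) else c


def ref_isblank(c):
    # C11: space and horizontal tab (not backspace).
    return _to_ord(c) in (0x09, 0x20)


def ref_iscntrl(c):
    # C11 "C" locale: 0x00-0x1f plus DEL (0x7f).
    n = _to_ord(c)
    return 0 <= n <= 0x1f or n == 0x7f


def ref_isprint(c):
    # Printing characters: 0x20 (space) through 0x7e.
    return 0x20 <= _to_ord(c) <= 0x7e


def ref_isalnum(c):
    # ASCII digits, uppercase letters, and lowercase letters.
    n = _to_ord(c)
    return 0x30 <= n <= 0x39 or 0x41 <= n <= 0x5a or 0x61 <= n <= 0x7a


def ref_ispunct(c):
    # Printing character that is neither space nor alphanumeric.
    n = _to_ord(c)
    return 0x21 <= n <= 0x7e and not ref_isalnum(n)
```

Under these definitions ref_isblank('\b') is False, ref_iscntrl(0x7f)
is True, and ref_ispunct is False for both '\n' and 0xff.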

Open questions:

- is it a security bug (backspace is treated as tab in isblank())?
  If it is then 3.1, 3.2, 3.3 branches should also be updated
  [devguide]. If not then only 2.7, 3.4, and default branches should
  be changed.

  [devguide]: http://hg.python.org/devguide/file/9794412fa62d/devcycle.rst#l105

- iscntrl() mistakenly returns false for 0x7f but C11 defines it as
  a control character. Should iscntrl behavior (and its docs) be
  changed to conform? Should another issue be opened?

- ispunct() mistakenly returns true for control characters such as
  '\n'. The documentation says (paraphrasing) 'any printable except
  space and alnum'. string.printable includes '\n', but 'printing
  character' in C11 does not include the newline. Moreover,
  curses.ascii.isprint follows C behavior and excludes control
  characters. Should another issue be opened to return false from
  ispunct() for control characters such as '\n'?

- ispunct() mistakenly returns true for non-ASCII characters such as
  0xff.

- negative integer values: the C functions are defined for the EOF
  macro (some negative value), and the behavior is undefined for any
  other negative integer value. What should the behavior of the
  curses.ascii.is* predicates be? Should Python guarantee that False
  is returned?

- curses.ascii.isspace and curses.ascii.isblank don't raise TypeError
  for bytes or None on Python 3.

- should constants from string module be used? What is more
  fundamental: string.digits or curses.ascii.isdigit?

- no tests for isascii, isctrl, ismeta (they are not defined in
  ctype.h). It is unclear what the behavior should be: e.g., isascii
  mistakenly returns True for negative ints, and ismeta returns True
  for any non-ASCII character, including Unicode letters. It is also
  unclear how isctrl differs from iscntrl.
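
To make the questions about negative ints and TypeError concrete, here
is one possible answer, sketched under the assumption that the
predicates should accept only ints and one-character strings and
should return False for anything outside the ASCII range. The names
as_code, strict_isascii, and strict_isspace are hypothetical. This is
a proposal for discussion, not the current curses.ascii behavior.

```python
def as_code(c):
    # Hypothetical helper: map the argument to an integer code.
    # Ints pass through unchanged; a one-character str is converted
    # with ord(). bytes, None, and longer strings raise TypeError.
    if isinstance(c, int):
        return c
    if isinstance(c, str) and len(c) == 1:
        return ord(c)
    raise TypeError("expected int or one-character str, got %r" % (c,))


def strict_isascii(c):
    # False for negative ints and for codes above 0x7f.
    return 0 <= as_code(c) <= 0x7f


def strict_isspace(c):
    # C "C" locale whitespace: \t, \n, \v, \f, \r, and space.
    # Any other code, including negative ints, gives False.
    return as_code(c) in (0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20)
```

With this sketch strict_isascii(-1) is False, while
strict_isspace(b' ') and strict_isspace(None) raise TypeError.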