Message 141016 - Python tracker

Author	petri.lehtinen
Recipients	eric.araujo, ezio.melotti, jcea, max-alleged, petri.lehtinen, rhettinger, terry.reedy, xuanji
Date	2011-07-23.20:25:06
SpamBayes Score	1.5637491e-13
Marked as misclassified	No
Message-id	<1311452708.7.0.884544953438.issue12170@psf.upfronthosting.co.za>
In-reply-to

Content

Attached a patch with the following changes:

Allow an integer argument in range(0, 256) for the following bytes and
bytearray methods: count, find, index, rfind, rindex. Initially, only
count and index were targeted, but as index is implemented in a helper
function that is also used to implement find, rfind and rindex, these
functions were affected too.
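
As a minimal illustration (not taken from the patch itself), the five
methods can be called with an integer byte value on an interpreter that
includes this change. The sample data and variable names are made up
for the example:

```python
# Sample data chosen for illustration only.
data = b"abcabc"
byte_b = ord("b")  # 98, an integer in range(0, 256)

# Each method accepts the integer in place of a one-byte bytes object.
count_b = data.count(byte_b)    # occurrences of the byte
first_b = data.find(byte_b)     # lowest index, or -1 if absent
last_b = data.rfind(byte_b)     # highest index, or -1 if absent
index_b = data.index(byte_b)    # like find, but raises ValueError if absent
rindex_b = data.rindex(byte_b)  # like rfind, but raises ValueError if absent

# find returns -1 for a byte that does not occur.
missing = data.find(ord("z"))

# bytearray is expected to behave the same way.
ba_count = bytearray(data).count(byte_b)

print(count_b, first_b, last_b, index_b, rindex_b, missing, ba_count)
```

Passing ord("b") gives the same results as passing b"b".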

The bytes methods were changed to use the new buffer protocol instead
of the deprecated PyObject_AsCharBuffer, for consistency with the
bytearray code.

Tests for all the modified functions were expanded to cover the new
functionality. While at it, the tests for count, index and rindex were
also further expanded (to test for slices, for example), as they were
initially quite minimal.
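
A rough sketch of what slice-aware tests for an integer argument could
look like is given below. It is not the test code from the patch: the
class names, the type2test attribute and the sample data are all
hypothetical.

```python
import unittest


class IntegerArgumentSliceTest(unittest.TestCase):
    # Hypothetical attribute naming the type under test.
    type2test = bytes

    def setUp(self):
        # Bytes "i" sit at indices 1, 4, 7 and 10.
        self.b = self.type2test(b"mississippi")
        self.i = ord("i")

    def test_count(self):
        self.assertEqual(self.b.count(self.i), 4)
        # Only start given: search b[6:].
        self.assertEqual(self.b.count(self.i, 6), 2)
        # Start and end given: search b[1:3].
        self.assertEqual(self.b.count(self.i, 1, 3), 1)

    def test_index(self):
        self.assertEqual(self.b.index(self.i, 2), 4)
        # b[5:7] contains no "i", so index must raise.
        self.assertRaises(ValueError, self.b.index, self.i, 5, 7)

    def test_rindex(self):
        self.assertEqual(self.b.rindex(self.i), 10)
        # Excluding the last byte moves the match back to index 7.
        self.assertEqual(self.b.rindex(self.i, 0, 10), 7)


class ByteArrayIntegerArgumentSliceTest(IntegerArgumentSliceTest):
    # Run the same checks against bytearray.
    type2test = bytearray
```

Sharing one test body between the two types reflects that both are
meant to have the same semantics here.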

A paragraph describing the additional semantics of the five methods
was added to the documentation.

The error messages of index and rindex were left untouched
("substring not found" and "subsection not found"). In a case where
the first argument is an integer, the error messages could talk about
a byte instead of substring/subsection. This would not have been
straightforward to implement, so I left the messages as they are.
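
To make the point concrete, the following sketch compares the message
raised for a missing bytes argument with the one raised for a missing
integer argument. The helper name missing_message is made up for the
example, and the exact wording of the message depends on the type and
the interpreter version, so it is printed rather than assumed:

```python
def missing_message(haystack, needle):
    """Return the ValueError message from index(), or None if found."""
    try:
        haystack.index(needle)
    except ValueError as exc:
        return str(exc)
    return None


# Same haystack, once with a bytes needle and once with an integer one.
msg_sub = missing_message(b"abc", b"x")
msg_int = missing_message(b"abc", ord("x"))

# Both lookups fail, and the message does not mention a byte
# even when an integer was passed.
print(repr(msg_sub))
print(repr(msg_int))
```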

The docstrings were also left unchanged, as I couldn't find a good
wording for them. The problem is not that the first argument may now
be an integer. Rather, since it can be more than a substring or
subsection, we might have to specify what a substring or subsection
really means. That explanation would be lengthy, because it involves
the buffer protocol, a concept that a regular Python programmer is not,
and does not need to be, familiar with.
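
For readers unfamiliar with the buffer protocol, a small sketch of what
"more than a substring" means in practice: the first argument can be any
object that exposes its bytes through the protocol, such as a bytearray
or a memoryview. The sample data is made up for the example:

```python
data = b"abcabc"

# A plain bytes object is the obvious kind of "substring".
n_bytes = data.count(b"bc")

# A bytearray also supports the buffer protocol and is accepted as-is.
n_bytearray = data.count(bytearray(b"bc"))

# So does a memoryview over some other bytes-like object.
pos_memoryview = data.find(memoryview(b"ca"))

print(n_bytes, n_bytearray, pos_memoryview)
```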

And finally, there's one thing that I'm unsure of:

When an integer outside range(0, 256) is passed as the first argument,
should we raise a ValueError or a TypeError? Currently, a ValueError
is raised, but this may be bad for index and rindex, as they also raise
a ValueError when the substring or byte is not found. I decided to
raise a ValueError because __contains__ of both bytes and bytearray
raises a ValueError when passed an integer not in range(0, 256).
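
The ambiguity can be seen in a short sketch, assuming an interpreter
that raises ValueError for out-of-range integers as described above. The
helper raises_value_error is made up for the example:

```python
def raises_value_error(func):
    """Return True if calling func() raises ValueError."""
    try:
        func()
    except ValueError:
        return True
    return False


data = b"abc"

# Valid byte value that simply does not occur in data.
not_found = raises_value_error(lambda: data.index(ord("x")))

# Integers that can never be a byte value.
out_of_range = raises_value_error(lambda: data.index(256))
negative = raises_value_error(lambda: data.count(-1))

# The existing __contains__ behaviour that the decision follows.
contains_out_of_range = raises_value_error(lambda: 256 in data)

print(not_found, out_of_range, negative, contains_out_of_range)
```

A caller that catches ValueError around index or rindex cannot tell
"byte not present" from "not a valid byte" without inspecting the
message or checking the range beforehand.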

History
Date	User	Action	Args
2011-07-23 20:25:09	petri.lehtinen	set	recipients: + petri.lehtinen, rhettinger, terry.reedy, jcea, ezio.melotti, eric.araujo, xuanji, max-alleged
2011-07-23 20:25:08	petri.lehtinen	set	messageid: <1311452708.7.0.884544953438.issue12170@psf.upfronthosting.co.za>
2011-07-23 20:25:08	petri.lehtinen	link	issue12170 messages
2011-07-23 20:25:07	petri.lehtinen	create