classification
Title: startswith and endswith don't accept None as slice index
Type: behavior Stage: resolved
Components: None Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jcea Nosy List: ezio.melotti, gvanrossum, hkBst, jcea, python-dev, r.david.murray, rhettinger, santoso.wijaya, torsten.becker
Priority: normal Keywords: patch

Created on 2011-04-11 15:09 by hkBst, last changed 2011-11-08 08:43 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description Edit
issue-11828-v1.patch torsten.becker, 2011-04-11 22:39 Use _ParseTupleFinds() for count, startswith and endswith + tests review
issue-11828-v2.patch torsten.becker, 2011-04-12 09:23 Curly brackets adhere PEP 7 review
issue-11828-error-msg-tests.patch torsten.becker, 2011-04-12 09:24 Tests for wrong error messages in rfind(), etc. review
issue-11828-v3.patch torsten.becker, 2011-04-12 13:04 Complete patch for {unicode,bytes,bytearray}object review
issue-11828-v4-3.1.patch torsten.becker, 2011-04-13 21:19 Refined patch against 3.1 branch
issue-11828-v4-2.7.patch torsten.becker, 2011-04-13 21:20 Refined patch against 2.7 branch
87be2a5e22ed.diff jcea, 2011-04-20 13:37 review
Repositories containing patches
https://bitbucket.org/t0rsten/cpython#startswith-slices-issue11828-2.7
https://bitbucket.org/t0rsten/cpython#startswith-slices-issue11828-3.1
Messages (20)
msg133527 - (view) Author: Marijn Schouten (hkBst) Date: 2011-04-11 15:09
startswith and endswith don't accept None as slice index, as shown by below interaction. Same behavior for python-3.1.3(with print()) and python-2.7.1. If instead this is intended behavior then the error message is wrong.

$ python -c "print 'abc'[-1:None]"
c
$ python -c "print 'abc'.endswith('c',-1,None)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: slice indices must be integers or None or have an __index__ method
msg133528 - (view) Author: Marijn Schouten (hkBst) Date: 2011-04-11 15:12
I remark that `find' and `index' do accept None:

$ python3 -c "print('abc'.find('c',None,None))"
2
msg133530 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-11 15:27
Since this is not a regression, this would be solved in 3.3 only. I guess.
msg133531 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-04-11 15:39
Well, if it is judged a bug (and it seems to me that it is), then it can get fixed in 2.7 and 3.2 (and yes I did confirm that the same bug is present in 2.7).

It appears to be the result of startswith/endswith applying naive parsing to their arguments rather than the more sophisticated stringlib calls used by find and index.
msg133532 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-04-11 16:01
Letting None be used as an index has always been treated as a feature request; however, in this case, it can be deemed a bug because the error message is making a promise that isn't kept.

It would be nice if once and for all this got solved by making argument parsing codes for indices and slices; otherwise, we're doomed to get some variant of this feature request over and over again.

FWIW, Google's code search shows that this slicing feature for startswith/endswith was of dubious worth -- no one seems to uses it.  The feature should probably have not been added in the first place.

Now that it is in, we still need to handle this report because people will need to make method wrappers for string-like classes.
msg133538 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-11 18:19
Anybody taking care of this, should check also "byte" and "bytearray".
msg133547 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-11 22:39
Hi, I started working on a first patch for this.  A function _ParseTupleFinds() exists which does the proper parsing for this kind of arguments in unicodeobject.c, I adapted it to be usable for startswith() and endswith() besides find() and friends.

In issue-8282-v1.patch I fixed this for startswith() and endswith().  count() suffered from the same behavior and I updated it there as well.
msg133551 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-11 22:48
While working on this, I discovered anther problem.  find(), etc. all use the same parsing function (_ParseTupleFinds()).  So when an error occurs, the exception message will always start with "find()" even though index() or rfind() might have caused the error:

>>> "asd".index("x", None, None, None)
TypeError: find() takes at most 3 arguments (4 given)

I attached a patch (issue-8282-error-message-tests.patch) which adds test cases for the wrong error messages.

I was thinking about fixing this as well but wanted make sure my approach is correct first:

  - I would like to add another argument to _ParseTupleFinds(): const char * function_name
  - in _ParseTupleFinds(): allocate a buffer of 50 chars on the stack to hold "O|OO:" + function name
  - copy "O|OO:" into buffer
  - copy max(strlen(function_name), 44) chars from function_name into buffer
  - use buffer as format argument of PyArg_ParseTuple()
  - change all calls of _ParseTupleFinds to include the function name as first argument

Would that approach work with Python's C style or are there any Python-specific helper functions I could use?
msg133570 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-12 09:23
Just realized that part of my v1 patch did not conform to PEP 7, I hope, I fixed that in v2.

Please also excuse for the wrong name of the error message patch, it was supposed to be named "issue-11828-error-msg-tests.patch".
msg133578 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-12 13:04
Hi, since nobody stopped me by complaining about the approach or the first patch, I now fixed this for bytes and bytearray as well. :)

I renamed the old _ParseTupleFinds function to stringlib_parse_tuple_finds, added a parameter for function name, and another if it shall do unicode conversion.  I used this helper function throughout all 3 files now.

I am new to writing C code for Python, so any comments on how to improve the patch are welcome.
msg133669 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-13 15:18
Some comments posted in the review.

Could you possibly post a patch for 2.7 too?.

Thanks.

If any other reviewer agree, I will commit the patch.
msg133679 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-13 17:23
> Some comments posted in the review.

I'm not sure if my review reply got mailed as I did not get a copy and nothing showed up here.  I added some responses/follow up questions in the review.

> Could you possibly post a patch for 2.7 too?.

Sure, I'll write the next version against 3.3 and 2.7
msg133680 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-13 17:33
I got your comments, Torsten. I finds funny too that the tracker is not notified.

I wrote new comments too, but not using "the right way", so now I am the one not sure you got them... :-)

Go for the 3.3/2.7 patch. Thanks.
msg133681 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-13 17:43
Better to have a 3.1/2.7 patch. The current workflow requires to patch the old version first (3.1), and up-port the change to 3.2 and 3.3.

So, 2.7 and 3.1 would be more useful. Al least if the patch applies to 3.2 and 3.3 easily. If major surgery is needed, let me know.

PS: If you use mercurial, try to upload the patch directly from it. See the "Remote hg repo" box.
msg133695 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-13 21:19
> I got your comments, Torsten. I finds funny too that the tracker is
> not notified.
> I wrote new comments too, but not using "the right way", so now I am
> the one not sure you got them... :-)

That time I actually got a separate mail. :)

> Better to have a 3.1/2.7 patch. The current workflow requires to
> patch the old version first (3.1), and up-port the change to 3.2 and
> 3.3.
> So, 2.7 and 3.1 would be more useful. Al least if the patch applies
> to 3.2 and 3.3 easily. If major surgery is needed, let me know.

I uploaded an improved v4 patch against 2.7 and 3.1.  patch does not apply it cleanly in the 3.2 and 3.3 branches, though.

This is mostly because Objects/stringlib/find.h has changed too much and the #define STRINGLIB_IS_UNICODE (3.3, 3.2) is called FROM_UNICODE in 3.1.  The other files work fine.

It should be no problem to merge this up by hand, though.

> PS: If you use mercurial, try to upload the patch directly from it.
> See the "Remote hg repo" box.

I'm using Mercurial, but unfortunately hg hangs forever when trying to push to bitbucket, so I am just sticking with patches for now.
msg133976 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-18 15:23
I pushed my changes to a hg repository, they are in the two branches "startswith-slices-issue11828-2.7" and "startswith-slices-issue11828-3.1".
msg134000 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2011-04-18 23:12
Torsten, could you possibly publish branches too for 3.2 and default (3.3)?
msg134086 - (view) Author: Torsten Becker (torsten.becker) Date: 2011-04-19 18:17
Hi, Jesús, I merged the patch up in the branches "startswith-slices-issue11828-3.2" [1] and "startswith-slices-issue11828-3.3" [2] in my hg repository.

[1]: https://bitbucket.org/t0rsten/cpython/changeset/49028581e43a
[2]: https://bitbucket.org/t0rsten/cpython/changeset/eafafe258362
msg134160 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-20 15:59
New changeset f4da64529e8c by Jesus Cea in branch '2.7':
startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)
http://hg.python.org/cpython/rev/f4da64529e8c

New changeset 77c657e47b5c by Jesus Cea in branch '3.1':
startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)
http://hg.python.org/cpython/rev/77c657e47b5c

New changeset e5f11efe89a6 by Jesus Cea in branch '3.2':
MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)
http://hg.python.org/cpython/rev/e5f11efe89a6

New changeset b8d1cf9fde04 by Jesus Cea in branch 'default':
MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828)
http://hg.python.org/cpython/rev/b8d1cf9fde04
msg147278 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2011-11-08 08:43
As I wrote in python-dev:

I agree with Raymond that this should be treated as a feature request and not "fixed" in 2.7 / 3.2. (However the mention of 'find' in the error message for 'index' is a bug and should be fixed.)

As for the feature request, I think that allowing None in more places is more regular and consistent across interfaces. I note that the slice() object also represents "missing" or "default" values as None, so it is not just a carryover from the old string.py.

So, +1 on the feature for 3.3; -1 on the "fix" in 3.2 or 2.7.
History
Date User Action Args
2011-11-08 08:43:09gvanrossumsetnosy: + gvanrossum
messages: + msg147278
2011-04-20 16:00:56jceasetassignee: jcea
2011-04-20 15:59:51python-devsetstatus: open -> closed

nosy: + python-dev
messages: + msg134160

resolution: fixed
stage: resolved
2011-04-20 13:37:59jceasetfiles: + 87be2a5e22ed.diff
2011-04-19 18:17:02torsten.beckersetmessages: + msg134086
2011-04-18 23:12:35jceasetmessages: + msg134000
2011-04-18 15:29:01torsten.beckersetfiles: - 2b48fd451c85.diff
2011-04-18 15:27:05torsten.beckersethgrepos: + hgrepo22
2011-04-18 15:26:06torsten.beckersetfiles: + 2b48fd451c85.diff
2011-04-18 15:23:45torsten.beckersethgrepos: + hgrepo21
messages: + msg133976
2011-04-18 13:44:50r.david.murraysetpriority: normal
2011-04-13 21:20:11torsten.beckersetfiles: + issue-11828-v4-2.7.patch
2011-04-13 21:19:51torsten.beckersetfiles: + issue-11828-v4-3.1.patch

messages: + msg133695
2011-04-13 17:43:41jceasetmessages: + msg133681
2011-04-13 17:33:54jceasetmessages: + msg133680
2011-04-13 17:23:50torsten.beckersetmessages: + msg133679
2011-04-13 15:18:10jceasetmessages: + msg133669
2011-04-12 13:04:08torsten.beckersetfiles: + issue-11828-v3.patch

messages: + msg133578
2011-04-12 09:24:58torsten.beckersetfiles: - issue-8282-error-message-tests.patch
2011-04-12 09:24:43torsten.beckersetfiles: + issue-11828-error-msg-tests.patch
2011-04-12 09:23:34torsten.beckersetfiles: + issue-11828-v2.patch

messages: + msg133570
2011-04-12 01:08:05ezio.melottisetnosy: + ezio.melotti
2011-04-11 22:48:04torsten.beckersetfiles: + issue-8282-error-message-tests.patch

messages: + msg133551
2011-04-11 22:39:35torsten.beckersetfiles: + issue-11828-v1.patch

nosy: + torsten.becker
messages: + msg133547

keywords: + patch
2011-04-11 18:41:44santoso.wijayasetnosy: + santoso.wijaya
2011-04-11 18:19:46jceasetmessages: + msg133538
2011-04-11 16:01:27rhettingersetpriority: normal -> (no value)
nosy: + rhettinger
messages: + msg133532

2011-04-11 15:39:53r.david.murraysetnosy: + r.david.murray

messages: + msg133531
versions: + Python 3.1, Python 2.7, Python 3.3
2011-04-11 15:27:54jceasetnosy: + jcea
messages: + msg133530
2011-04-11 15:26:34jceasetversions: + Python 3.2, - Python 3.1, Python 2.7
2011-04-11 15:12:55hkBstsetmessages: + msg133528
2011-04-11 15:09:08hkBstcreate