Message 226564 - Python tracker


Author	cwr
Recipients	cwr
Date	2014-09-08.09:29:01
Message-id	<1410168542.37.0.66219638693.issue22360@psf.upfronthosting.co.za>
In-reply-to

Content

Currently we have a "split" function which splits a str/bytes object into
chunks of its underlying data. This works great for the most trivial jobs.
But there is no way to pass an offset parameter to the split function
that sets a user-defined starting index for the next separator search.

Currently the next starting position is built from the last starting
position (of found sep.) + separator length.

It should be possible to manipulate the next starting index by changing this
behavior to:

last starting position (of found sep.) + separator length + OFFSET.

NOTE: The slicing start index (for the substring) stays untouched.

This will help us split sequences with one or more consecutive
separators. The following demonstrates the current behavior.

>>> s = 'abc;;def;hij'
>>> s.split(';')
['abc', '', 'def', 'hij']

This works fine for both str and bytes values.
The following demonstrates an "offset variant" of the split function.

>>> s = 'abc;;def;hij'
>>> s.split(';', offset=1)
['abc', ';def', 'hij']
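
To make the proposed semantics concrete, here is a minimal pure-Python sketch of the idea. The helper name split_with_offset and its maxsplit argument are assumptions made for illustration only; they are not an existing API and not the suggested C implementation. The sketch assumes that an offset of 0 reproduces today's behavior and that only the search position is shifted, never the slicing start index.

```python
def split_with_offset(s, sep, offset=0, maxsplit=-1):
    """Emulate the proposed split(sep, offset=...) for str or bytes.

    Hypothetical helper: after a separator is found at index i, the next
    search starts at i + len(sep) + offset, while the next chunk is still
    sliced starting at i + len(sep).
    """
    if not sep:
        raise ValueError("empty separator")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    parts = []
    slice_start = 0   # where the next chunk begins
    search_start = 0  # where the next search for sep begins
    while maxsplit < 0 or len(parts) < maxsplit:
        i = s.find(sep, search_start)
        if i < 0:
            break
        parts.append(s[slice_start:i])
        slice_start = i + len(sep)           # slicing index stays untouched
        search_start = slice_start + offset  # only the search is shifted
    parts.append(s[slice_start:])
    return parts
```

With this sketch, split_with_offset('abc;;def;hij', ';') gives the same list as the built-in split, and passing offset=1 gives the list shown in the example above.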

The behavior of the maxcount and None sep. parameters should produce the same
output as before.

The change would affect (as far as I can see):
- split.h
    - split_char/rsplit_char
    - split/rsplit

History
Date	User	Action	Args
2014-09-08 09:29:02	cwr	set	recipients: + cwr
2014-09-08 09:29:02	cwr	set	messageid: <1410168542.37.0.66219638693.issue22360@psf.upfronthosting.co.za>
2014-09-08 09:29:02	cwr	link	issue22360 messages
2014-09-08 09:29:01	cwr	create