Message 218879 - Python tracker

Author	Wellington.Fan
Recipients	Wellington.Fan, ezio.melotti, mrabarnett
Date	2014-05-21.15:38:10
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1400686692.12.0.566986679321.issue21551@psf.upfronthosting.co.za>
In-reply-to

Content
Hello,

It seems that the word-boundary sequence, r'\b', does not behave as expected with re.split(). The regex docs say:

  \b       Matches the empty string, but only at the start or end of a word.

My (failing) test:

>>> import re
>>> re.split(r'\b', 'A funky string')
['A funky string']


This returns a one-element list; I would expect a seven-element list:
['', 'A', ' ', 'funky', ' ', 'string', '']

I have equivalent code in PHP that *does* work:
 php > print_r( preg_split('/\b/', 'A funny string') );
 Array
 (
     [0] =>
     [1] => A
     [2] =>
     [3] => funny
     [4] =>
     [5] => string
     [6] =>
 )
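
For comparison, the splitting behaviour expected above can be sketched in Python without relying on re.split() honouring empty matches. This is only a minimal illustration, assuming the pattern is zero-width (such as r'\b'); the helper name split_on_matches is hypothetical and not part of the re module. It relies on re.finditer(), which does report empty matches:

```python
import re

def split_on_matches(pattern, text):
    """Split text at every match of pattern, including empty matches.

    Hypothetical helper for illustration; intended for zero-width
    patterns such as r'\b'.
    """
    pieces = []
    last = 0
    for m in re.finditer(pattern, text):
        # Keep the text between the previous match and this one.
        pieces.append(text[last:m.start()])
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(text[last:])
    return pieces

result = split_on_matches(r'\b', 'A funky string')
print(result)
```

With the string from the test above, this yields the seven-element list given as the expected result. Python 3.7 and later also split on empty matches in re.split() itself, so the helper should only be needed on older versions.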
