re.split emptyok flag (fix for #852532) #40540
Comments
This patch addresses bug bpo-852532. The underlying My preference would be to just change the behavior of (Linux 2.6.3 i686) |
Logged In: YES Practical example where the current behaviour produces
>>> import re
>>> re.split(r'(?<=[A-Z])(?=[a-z])', 'SOMEstring')
['SOMEstring']  # desired is ['SOME', 'string'] |
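The kind of workaround this thread alludes to can be sketched with `re.finditer`, which reports zero-width matches regardless of how `re.split` treats them. The helper name `split_on_matches` is hypothetical, and the pattern assumes the intent is to split where an uppercase letter is followed by a lowercase one (lookahead `(?=[a-z])`). This is a minimal sketch, not the patch under review.

```python
import re

def split_on_matches(pattern, string):
    """Split `string` at every match of `pattern`, zero-width ones included.

    Hypothetical helper for illustration; not part of the `re` module.
    """
    pieces = []
    last = 0
    for match in re.finditer(pattern, string):
        # Keep the text between the previous match and this one.
        pieces.append(string[last:match.start()])
        last = match.end()
    # Keep whatever follows the final match.
    pieces.append(string[last:])
    return pieces

print(split_on_matches(r'(?<=[A-Z])(?=[a-z])', 'SOMEstring'))
# ['SOME', 'string']
```

Unlike the proposals discussed in this thread, the sketch makes no attempt to drop leading or trailing empty strings; it splits at every match `re.finditer` yields.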
Logged In: YES Overall I like the patch and wouldn't mind seeing the change |
Logged In: YES I picked through CVS, python-dev and google and came up with The python-dev archive doesn't seem to go back far enough to As far as I can tell, the current behavior was a design (I didn't notice that re.findall doc when I originally wrote |
Logged In: YES Apparently this patch is stalled, but I'd like to get it in, If I make a patch with the "doesn't count" behavior, could |
Logged In: YES Fred, what do you think of the proposal. Are the backwards |
Logged In: YES This patch seems to have been stalled now for over a year. |
Logged In: YES I agree completely that splitting on non-zero matches should
>>> re.split('x*', 'abxxxcdefxxx', emptyok=True)
['', 'a', 'b', '', 'c', 'd', 'e', 'f', '', '']
To me, this means there's an empty string, beginning and
>>> re.split('x*', 'abxxxcdefxxx')
['a', 'b', 'c', 'd', 'e', 'f', '']
That is, empty matches cause a split when they are not
>>> re.split('(x*)', 'abxxxcdefxxx')
['', 'a', '', 'b', 'xxx', '', 'c', '', 'd', '', 'e', '', 'f', 'xxx', '']
Using the same approach, these results would also seem
>>> re.split('(?m)$', 'foo\nbar\nbaz')
['foo', '\nbar', '\nbaz']
>>> re.split('(?m)^', 'foo\nbar\nbaz')
['foo\n', 'bar\n', 'baz']
Splitting a one-character string should be possible only if
>>> re.split('\w*', 'a')
['', '']
>>> re.split('\d*', 'a')
['a'] |
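For comparison with the `emptyok=True` output quoted in this comment: Python 3.7 and later split on empty matches in `re.split` by default, so that result can be checked on a current interpreter. The sketch below assumes such an interpreter (the version guard is an addition for illustration). The join check relies only on the fact that a pattern wrapped in a single capturing group returns the separators along with the pieces.

```python
import re
import sys

text = 'abxxxcdefxxx'

# With the whole pattern captured, the separators are returned too,
# so joining the result gives back the input.
with_separators = re.split('(x*)', text)
print(''.join(with_separators) == text)  # True

if sys.version_info >= (3, 7):
    # Assumes Python 3.7+, where empty matches split the string.
    parts = re.split('x*', text)
    print(parts)
    # ['', 'a', 'b', '', 'c', 'd', 'e', 'f', '', '']
```

On 3.7+ this matches the list shown for `emptyok=True`, although the `emptyok` keyword itself was only ever part of the proposed patch and is not an argument `re.split` accepts.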
Logged In: YES I think I still agree with my original answer on this (see I'm completely worn down on this, though, so I'd happily |
Hello from 2004! This is your long-lost bug in re.split--how's it going? I'm still alive and well. I think everyone pretty much agrees that I really am a bug, and at least one guy still writes code just to work around me every few weeks or so. My attempt to keep a low profile is doing well--I'm not even documented in the library reference. This allows me to meet new Python users on a regular basis (whether they like it or not). Well, that's it for now. If I don't hear from you until then, I'll drop you another line in 2009. (Hey I'm a poet, too!) Regards, |
Take a look at the patch being worked on in issue bpo-3262. |
Closing as a duplicate of bpo-3262, which seems to be active. |