
Author Rick Otten
Recipients Rick Otten, ezio.melotti, mrabarnett
Date 2015-02-26.23:00:23
Message-id <1424991623.26.0.907615110662.issue23532@psf.upfronthosting.co.za>
Content
The documentation states that alternatives separated by "|" are tried from left to right.  This doesn't seem to be true when spaces (or \s) are involved.

Example:

In [40]: mystring
Out[40]: 'rwo incorporated'

In [41]: re.sub('incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[41]: 'rwoorporated'

In this case " inc" was processed before "incorporated".
If I take the space out:

In [42]: re.sub('incorporated|inc|llc|corporation|corp| co', '', mystring)
Out[42]: 'rwo '

Here "incorporated" is processed first.

If I put a leading space on both, then " incorporated" is processed first:

In [43]: re.sub(' incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[43]: 'rwo'

And if I use \s instead of a space, "\sinc" is processed first:

In [44]: re.sub('incorporated|\sinc|llc|corporation|corp| co', '', mystring)
Out[44]: 'rwoorporated'
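
To look at the four cases side by side, here is a minimal sketch that prints where the first match starts for each pattern, along with the re.sub result. The helper name first_match and the patterns list are only for illustration and are not part of the re module.

```python
import re

mystring = 'rwo incorporated'

def first_match(pattern, text):
    """Return (start_index, matched_text) for the first match, or None.

    Hypothetical helper for illustration only.
    """
    m = re.search(pattern, text)
    return (m.start(), m.group()) if m else None

# The four patterns from the examples above, written as raw strings.
patterns = [
    r'incorporated| inc|llc|corporation|corp| co',
    r'incorporated|inc|llc|corporation|corp| co',
    r' incorporated| inc|llc|corporation|corp| co',
    r'incorporated|\sinc|llc|corporation|corp| co',
]

for p in patterns:
    # Show the start position and text of the first match, then the sub result.
    print(first_match(p, mystring), repr(re.sub(p, '', mystring)))
```

In the cases where " inc" or "\sinc" wins, the reported match starts at index 3 (the space), one position before "incorporated" could start at index 4. That suggests the match that starts earliest in the string is taken, and the left-to-right order of the alternatives only decides between alternatives that can match at the same starting position.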