
Author ralph.corderoy
Recipients docs@python, eric.araujo, ezio.melotti, ralph.corderoy
Date 2011-05-08.14:27:08
Message-id <1304864830.64.0.333911175024.issue10713@psf.upfronthosting.co.za>
Content
Examining the source of Ubuntu's python2.6 2.6.6-5ubuntu1 package
suggests that positions beyond the limits of the string are treated as
\W, as in Perl.

    Modules/_sre.c:
       336  LOCAL(int)
       337  SRE_AT(SRE_STATE* state, SRE_CHAR* ptr, SRE_CODE at)
       338  {
       339      /* check if pointer is at given position */
       340
       341      Py_ssize_t thisp, thatp;
       ...
       365      case SRE_AT_BOUNDARY:
       366          if (state->beginning == state->end)
       367              return 0;
       368          thatp = ((void*) ptr > state->beginning) ?
       369              SRE_IS_WORD((int) ptr[-1]) : 0;
       370          thisp = ((void*) ptr < state->end) ?
       371              SRE_IS_WORD((int) ptr[0]) : 0;
       372          return thisp != thatp;

SRE_IS_WORD() returns 16 for the 63 \w characters, 0 otherwise.
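
To make the logic above easier to follow, here is a rough Python model of
the SRE_AT_BOUNDARY case. It is only a sketch: at_boundary() and WORD_CHARS
are names made up for illustration, and it assumes the ASCII-only set of 63
\w characters (letters, digits, underscore) rather than locale or Unicode
matching.

```python
import string

# Assumption: the 63 ASCII \w characters (52 letters, 10 digits, underscore).
WORD_CHARS = set(string.ascii_letters + string.digits + "_")


def at_boundary(s, pos):
    """Hypothetical model of the SRE_AT_BOUNDARY check at index pos of s."""
    # Mirrors line 366: an empty string never has a word boundary.
    if not s:
        return False
    # Positions outside the string count as non-word, i.e. like \W.
    prev_is_word = pos > 0 and s[pos - 1] in WORD_CHARS
    this_is_word = pos < len(s) and s[pos] in WORD_CHARS
    return prev_is_word != this_is_word


if __name__ == "__main__":
    for pos in range(3):
        print(pos, at_boundary("ab", pos))
```

Under this model both ends of "ab" are boundaries, because the missing
neighbour is treated as a non-word character.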

This is borne out by tests.

Note that line 366 above confirms \b is never true for an empty string.
The documentation states that \B "is just the opposite of \b", yet
re.match(r'\b', '') returns None and so does re.match(r'\B', ''), so \B
isn't the opposite of \b in all cases.
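
A small script along these lines can be used to check the behaviour on a
given interpreter. It is a sketch, not a test suite: the \B result on an
empty string is what I see with 2.6 and may differ in other releases, so it
is printed rather than asserted.

```python
import re

# The ends of the string act like \W, so \b matches at both ends of "ab".
starts = [m.start() for m in re.finditer(r"\b", "ab")]

# \b never matches in an empty string (line 366 of the C source above).
empty_b = re.match(r"\b", "")

# Reported as None for Python 2.6; treat this as version-dependent.
empty_B = re.match(r"\B", "")

print(starts)
print(empty_b)
print(empty_B)
```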