Author mrabarnett
Recipients Arfrever, ezio.melotti, mrabarnett, pitrou, pyos, serhiy.storchaka, vstinner
Date 2012-12-15.00:41:26
Message-id <1355532086.96.0.798449802893.issue16688@psf.upfronthosting.co.za>
In-reply-to
Content
In function SRE_MATCH, the code for SRE_OP_GROUPREF (line 1290) contains this:

    while (p < e) {
        if (ctx->ptr >= end ||
            SRE_CHARGET(state, ctx->ptr, 0) != SRE_CHARGET(state, p, 0))
            RETURN_FAILURE;
        p += state->charsize;
        ctx->ptr += state->charsize;
    }

However, the code for SRE_OP_GROUPREF_IGNORE (line 1316) contains this:

    while (p < e) {
        if (ctx->ptr >= end ||
            state->lower(SRE_CHARGET(state, ctx->ptr, 0)) != state->lower(*p))
            RETURN_FAILURE;
        p++;
        ctx->ptr += state->charsize;
    }

(In both cases 'p' is of type 'char*'.)

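The problem appears to be that the latter is still using '*p' and 'p++', and is thus always treating the referenced group as 1-byte chars: it reads and advances 1 byte at a time instead of state->charsize bytes (1, 2 or 4, depending on the width of the Unicode string). The SRE_OP_GROUPREF code, by contrast, reads with SRE_CHARGET(state, p, 0) and advances with 'p += state->charsize'.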
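To illustrate the difference outside of _sre.c, here is a minimal standalone sketch. It is not the actual module code: char_get, lower_ascii and the two groupref_ignore_* functions are hypothetical stand-ins for SRE_CHARGET, state->lower and the SRE_OP_GROUPREF_IGNORE loop, and the lowering is assumed to be ASCII-only. The second function shows what the loop might look like if it mirrored the SRE_OP_GROUPREF code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SRE_CHARGET: read one character that is
   charsize bytes wide (1, 2 or 4) starting at p. */
static uint32_t char_get(const char *p, int charsize)
{
    if (charsize == 1)
        return (unsigned char)*p;
    if (charsize == 2) {
        uint16_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical stand-in for state->lower (ASCII-only for simplicity). */
static uint32_t lower_ascii(uint32_t ch)
{
    return (ch >= 'A' && ch <= 'Z') ? ch + ('a' - 'A') : ch;
}

/* Mirrors the reported loop: the group side is read with *p and advanced
   with p++, i.e. always 1 byte at a time. Returns 1 on match, 0 on failure. */
static int groupref_ignore_bytewise(const char *ptr, const char *end,
                                    const char *p, const char *e,
                                    int charsize)
{
    while (p < e) {
        if (ptr >= end ||
            lower_ascii(char_get(ptr, charsize)) !=
            lower_ascii((unsigned char)*p))
            return 0;
        p++;
        ptr += charsize;
    }
    return 1;
}

/* Assumed correction: both sides are read and advanced charsize bytes
   at a time, as in the SRE_OP_GROUPREF loop. */
static int groupref_ignore_charwise(const char *ptr, const char *end,
                                    const char *p, const char *e,
                                    int charsize)
{
    while (p < e) {
        if (ptr >= end ||
            lower_ascii(char_get(ptr, charsize)) !=
            lower_ascii(char_get(p, charsize)))
            return 0;
        p += charsize;
        ptr += charsize;
    }
    return 1;
}
```

With 2-byte characters, a group that captured "Ab" and text "aB" at the current position should match case-insensitively. The character-wise loop reports a match, while the byte-wise loop fails as soon as 'p' lands on a zero byte of a 2-byte character. With 1-byte characters the two loops behave the same, which would explain why the problem only shows up for wider strings.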
History
Date                 User        Action  Args
2012-12-15 00:41:27  mrabarnett  set     recipients: + mrabarnett, pitrou, vstinner, ezio.melotti, Arfrever, serhiy.storchaka, pyos
2012-12-15 00:41:26  mrabarnett  set     messageid: <1355532086.96.0.798449802893.issue16688@psf.upfronthosting.co.za>
2012-12-15 00:41:26  mrabarnett  link    issue16688 messages
2012-12-15 00:41:26  mrabarnett  create