classification
Title: sre_compile._optimize_unicode() needs a cleanup
Type: behavior Stage: resolved
Components: Library (Lib), Regular Expressions, Unicode Versions: Python 3.3
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2011-10-04 17:11 by vstinner, last changed 2013-10-21 11:46 by serhiy.storchaka. This issue is now closed.

Messages (3)
msg144905 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-10-04 17:11
The following comment is wrong:

    except IndexError:
        # non-BMP characters; XXX now they should work
        return charset

sys.maxunicode != 65535 is now always true in Python 3.3, so this branch is always taken:

        if sys.maxunicode != 65535:
            # XXX: negation does not work with big charsets
            # XXX2: now they should work, but removing this will make the
            # charmap 17 times bigger
            return charset

See the related commit: f39b26ca7f3d (from issue #13054).
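
The two observations above can be checked with a short sketch, assuming CPython 3.3 or later (where PEP 393 removed narrow builds). The names is_narrow_build, BMP_SIZE and plane_count are illustrative only and do not appear in sre_compile:

```python
import sys

# On Python 3.3+ sys.maxunicode is always 0x10FFFF, so the narrow-build
# value 65535 (0xFFFF) can no longer occur.
is_narrow_build = (sys.maxunicode == 65535)

# The full code point range spans 17 planes of 65536 code points each,
# which is where the "17 times bigger" figure in the XXX2 comment comes from.
BMP_SIZE = 0x10000
plane_count = (sys.maxunicode + 1) // BMP_SIZE

print(is_narrow_build, plane_count)  # False 17
```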
msg178896 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-01-03 02:02
I don't know what to do with this issue. The code seems to work anyway, so I guess that it's safer not to touch it :-)
msg200753 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-10-21 11:46
There is a lot of dead or suboptimal code in the re module. For example, _sre.CODESIZE can no longer be 2. We could clean up the code as a side effect of optimization.
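
The CODESIZE remark can be inspected directly. This is a minimal sketch that assumes CPython 3.3 or later; _sre is a private accelerator module, not a public API, and may be absent on other implementations. The names code_size and codesize_is_two are illustrative:

```python
import _sre  # private CPython module backing the re engine (assumption: CPython)

# CODESIZE is the size in bytes of one compiled regex code unit.
code_size = _sre.CODESIZE

# Any branch in the re module guarded by CODESIZE == 2 is dead code
# if this is False.
codesize_is_two = (code_size == 2)

print(code_size, codesize_is_two)
```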
History
Date User Action Args
2013-10-21 11:46:00 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg200753
2013-01-03 04:40:30 ezio.melotti set nosy: + mrabarnett; type: behavior; stage: resolved
2013-01-03 02:02:38 vstinner set status: open -> closed; resolution: wont fix; messages: + msg178896
2011-10-04 17:11:58 vstinner create