Author tchrist
Recipients Arfrever, ezio.melotti, jkloth, mrabarnett, pitrou, r.david.murray, tchrist, terry.reedy
Date 2011-08-14.17:09:27
Message-id <26111.1313341755@chthon>
Ezio Melotti wrote on Sun, 14 Aug 2011 07:15:09 -0000:

> For example I don't think removing the 0x10FFFF upper limit is going to
> happen -- even if it might be useful for other things. 

I agree entirely.  That's why I appended a triple exclamation point to where I
said I certainly do not expect this.  It can only work fully on UTF-8ish systems
and up to 32 bits on UTF-32, and it is most emphatically *not* Unicode.  Yes,
there are things you can do with it, but it risks serious misunderstanding and
even nonconformance if not done very carefully.  The Standard does not forbid
such things internally, but you are not allowed to pass them around in
noninternal streams claiming they are real UTF streams.
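
To make the limit concrete, here is a minimal sketch of how the 0x10FFFF
ceiling shows up in practice.  It assumes Python 3.3 or later, where chr()
covers the whole Unicode range on every build; the helper names are mine,
purely for illustration.

```python
def is_valid_code_point(cp: int) -> bool:
    """Return True if chr() accepts cp as a code point."""
    try:
        chr(cp)
    except (ValueError, OverflowError):
        return False
    return True


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the strict UTF-8 codec accepts these bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# U+10FFFF is the last code point; its UTF-8 form is F4 8F BF BF.
LAST_VALID = b"\xf4\x8f\xbf\xbf"

# F4 90 80 80 is what 0x110000 would look like if the UTF-8 bit layout
# were extended past the limit.  It is "UTF-8ish", not UTF-8.
FIRST_BEYOND = b"\xf4\x90\x80\x80"

if __name__ == "__main__":
    print(is_valid_code_point(0x10FFFF), is_valid_code_point(0x110000))
    print(decodes_as_utf8(LAST_VALID), decodes_as_utf8(FIRST_BEYOND))
```

The second byte sequence follows the same bit pattern as real UTF-8, which
is exactly why passing such data around as a genuine UTF stream invites
misunderstanding.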

> Also regular expressions are not part of the core and are not used
> that often, so I consider problems with narrow/wide builds, codecs and
> the unicode type much more important than problems with the re/regex
> module (they should be fixed too, but have lower priority IMHO).

One advantage of having an external library is the ability to update
it asynchronously.  Another is the possibility to swap it out altogether.
Perl only gained that ability, which Python has always had, some four
years ago with its 5.10 release.  To my knowledge, the only thing people
tend to use this for is to get Russ Cox's re2 library, which has very
different performance characteristics and guarantees that allow it to 
be used in potential starvation denial-of-service situations that the
normal Perl, Python, Java, etc. regex engines cannot safely be used for.
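
As a rough illustration of the starvation problem, the sketch below times
a backtracking match in Python's standard re module against a well-known
pathological pattern.  The pattern, the input shape, and the helper name
are my own choices for illustration, and the timings will vary by machine;
the point is only that the time tends to grow very quickly as the input
gets longer.

```python
import re
import time

# Nested quantifiers force a backtracking engine to try a huge number
# of ways to split the run of "a"s before it can report failure.
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> tuple[bool, float]:
    """Match n "a"s followed by a "b"; return (matched, seconds taken)."""
    subject = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    matched = PATHOLOGICAL.match(subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in range(12, 21, 2):
        matched, seconds = time_failed_match(n)
        print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

An engine built on automata, such as re2, avoids this blowup by design, at
the cost of not supporting features like backreferences.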

Date                 User     Action  Args
2011-08-14 17:09:28  tchrist  set     recipients: + tchrist, terry.reedy, pitrou, jkloth, ezio.melotti, mrabarnett, Arfrever, r.david.murray
2011-08-14 17:09:27  tchrist  link    issue12729 messages
2011-08-14 17:09:27  tchrist  create