Reopening and attaching a more ambitious patch, based on the
optimization of runs of ASCII characters. This time the speedup is much
more impressive, up to 75% faster on pure ASCII input -- actually faster
than latin1.

The worst case (tight interleaving of ASCII and non-ASCII chars) shows an
8% slowdown.
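
For readers who want the general idea without opening the patch, here is a
minimal sketch of an ASCII-run fast path. It is not the code from the attached
patch: the name ascii_run_length and the HIGH_BITS macro are hypothetical,
chosen only for illustration. The assumption is that a decoder can test one
machine word at a time for bytes with the high bit set and copy the whole run
directly, falling back to the byte-by-byte path when it hits a non-ASCII byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Word with the high bit of every byte set (0x8080...80).
   (size_t)-1 / 0xFF yields 0x0101...01; multiplying by 0x80 shifts
   each of those bits to the top of its byte. */
#define HIGH_BITS ((size_t)-1 / 0xFF * 0x80)

/* Hypothetical helper: return the length of the leading run of pure
   ASCII bytes in [start, end). */
static size_t
ascii_run_length(const unsigned char *start, const unsigned char *end)
{
    const unsigned char *p = start;

    assert(start <= end);

    /* Fast path: examine sizeof(size_t) bytes per iteration.
       memcpy avoids unaligned reads; compilers usually turn it
       into a single load. */
    while ((size_t)(end - p) >= sizeof(size_t)) {
        size_t word;
        memcpy(&word, p, sizeof word);
        if (word & HIGH_BITS)
            break;              /* some byte in this word is >= 0x80 */
        p += sizeof word;
    }

    /* Slow path: finish the tail, or locate the exact non-ASCII byte
       inside the word that failed the test above. */
    while (p < end && *p < 0x80)
        p++;

    return (size_t)(p - start);
}
```

A sketch like this also suggests where the worst case comes from: when ASCII
and non-ASCII characters alternate tightly, the word-sized test fails almost
every time, so the extra check is paid without ever skipping a long run.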

(performance measured with gcc and MSVC)
