I added the cutoff for strings >= 10 characters and converted the PR from a draft to "Ready for review."

Running the benchmarks before and after the PR, I get these results:

    Unicode Before: 81.82   Bytes Before: 92.62
    Unicode After:  64.70   Bytes After:  62.41

    Full results here:

And on the random Zipf benchmarks:

    14 cases slower (median 1.16x slower, at most 1.52x slower)
    601 cases faster (median 2.15x faster, at most 21.94x faster)
    Full results here:

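As a quick sanity check on the Unicode/Bytes before-and-after figures, here is a small sketch of the ratios they imply. It assumes the numbers are timings where lower is better (the units aren't stated in the table), and `speedup` is just a hypothetical helper name for this illustration, not something from the PR.

```python
# Figures copied from the Unicode/Bytes before-and-after table.
# Assumption: they are timings, so a lower number is better.
results = {
    "Unicode": (81.82, 64.70),
    "Bytes": (92.62, 62.41),
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster 'after' is relative to 'before'."""
    return before / after


for name, (before, after) in results.items():
    print(f"{name}: {speedup(before, after):.2f}x faster")
```

Under that assumption, this works out to roughly 1.26x for Unicode and 1.48x for Bytes.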