I suppose this takes advantage of libc's optimized memchr(). Do you have any benchmarks?
(patch looks fine, by the way)
