Here is a new patch without any dispatch shortcut in ceval.c, just
optimizations in unicodeobject.c and longobject.c. Net result on pybench:

Test                             minimum run-time        average  run-time
                                 this    other   diff    this    other 
                 CompareFloats:   166ms   170ms   -2.3%   169ms   174ms
         CompareFloatsIntegers:   230ms   231ms   -0.7%   233ms   231ms
               CompareIntegers:   247ms   270ms   -8.7%   248ms   272ms
        CompareInternedStrings:   196ms   254ms  -22.7%   197ms   255ms
                  CompareLongs:   143ms   158ms   -9.0%   143ms   158ms
                CompareStrings:   156ms   168ms   -7.4%   157ms   169ms
Totals:                          1139ms  1252ms   -9.1%  1148ms  1260ms

The patch seems fairly uncontroversial to me. I'll commit it soon if
there's no opposition.
