The second version of the patch uses the fast path only when the builtin
__import__ is not overridden. It is slightly slower (due to the lookup of __import__).

>>> import timeit
>>> def f():
...     import locale
>>> min(timeit.repeat(f, number=100000, repeat=10))

The code is simpler, but still somewhat cumbersome. It would be good to
also optimize "from locale import getlocale".
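
A minimal sketch for comparing the two forms, in case it is useful. The function names are made up for the example, and the counts are smaller than in the session above only to keep it quick. No timings are given because they depend on the build and the machine.

```python
import timeit


def import_module():
    import locale


def import_from():
    from locale import getlocale


# Best of several runs for each form, as in the measurement above.
t_import = min(timeit.repeat(import_module, number=10000, repeat=5))
t_from = min(timeit.repeat(import_from, number=10000, repeat=5))
print(f"import locale:                {t_import:.6f} s")
print(f"from locale import getlocale: {t_from:.6f} s")
```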
