Performance numbers:

With patch:

$ ./python.exe -m timeit -s 'import thread; l = thread.allocate_lock()' 'with l: pass'
1000000 loops, best of 3: 1.99 usec per loop

Without patch:

100000 loops, best of 3: 2.15 usec per loop
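
For anyone who wants to repeat the measurement from a script rather than the command line, here is a minimal sketch using the timeit module directly. It assumes Python 3, where the old `thread` module no longer exists, so `threading.Lock()` stands in for `thread.allocate_lock()`. The helper name `best_usec_per_loop` and the fixed loop count are my own choices for illustration, not part of the patch, and the absolute numbers will differ from those quoted above.

```python
import timeit

# Assumption: Python 3, so threading.Lock() replaces thread.allocate_lock().
SETUP = "import threading; l = threading.Lock()"
STMT = "with l: pass"


def best_usec_per_loop(number=100000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    # timeit.repeat returns the total time in seconds for each run of
    # `number` loops; the minimum is the least noisy estimate.
    timings = timeit.repeat(stmt=STMT, setup=SETUP, repeat=repeat, number=number)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    print("best of 3: %.2f usec per loop" % best_usec_per_loop())
```

Run it once with the patched interpreter and once with the unpatched one, then compare the two printed figures. Unlike the command-line tool, this sketch does not pick the loop count automatically, so keep `number` the same for both runs.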
