I am still not sure that it is worth adding 50 lines of C code to optimize 0.7% of calls.

You have found 70000 examples of function calls with 3 or 4 constant arguments. Does that count include tests? Could you please show several (or several hundred) non-test examples? I am wondering how many of them are in tight loops rather than executed just once at module initialization time.
