New version (4) of the patch:

 - make the opaque pointer (now called "void *ctx", for "context") the first parameter instead of the last, as done in the zlib, lzma and Oracle OCI APIs; ctx is also now the first parameter of Py*_GetFunctions() and Py*_SetFunctions() instead of the last
 - rename public functions:

   * Py_GetAllocators() -> PyMem_GetAllocators(), PyObject_GetAllocators()
   * Py_SetAllocators() -> PyMem_SetAllocators(), PyObject_SetAllocators()
   * Py_GetBlockAllocators() -> PyObject_GetArenaAllocators()
   * Py_SetBlockAllocators() -> PyObject_SetArenaAllocators()

 - move the declarations of the PyObject_*() functions from pymem.h to objimpl.h
 - split the big _PyMem structure into smaller structures: _PyMem, _PyObject, _PyObject_Arena
 - move "if (size == 0) size = 1;" from PyMem_Malloc() to _PyMem_Malloc(), so that a custom allocator can decide how to implement PyMem_Malloc(0) (maybe with something more efficient)
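
To illustrate the ctx-first convention and the zero-size change, here is a minimal standalone sketch. It is not code from the patch: demo_allocators, demo_stats, demo_malloc(), demo_free() and demo_public_malloc() are hypothetical names, and the structure layout is only an assumption about what such an allocator table could look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table (not the patch's real structure):
   the opaque context is stored once and passed FIRST to every hook. */
typedef struct {
    void *ctx;
    void *(*malloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} demo_allocators;

/* Hypothetical per-allocator state reached through ctx. */
typedef struct {
    size_t malloc_calls;
    size_t zero_size_calls;
} demo_stats;

/* Custom hook: ctx comes first, as in zlib and lzma. Because the
   public wrapper no longer rewrites size 0 to 1, the hook sees the
   original request and chooses its own policy for it. */
static void *demo_malloc(void *ctx, size_t size)
{
    demo_stats *stats = (demo_stats *)ctx;
    assert(stats != NULL);
    stats->malloc_calls++;
    if (size == 0) {
        stats->zero_size_calls++;
        size = 1;   /* this hook simply falls back to a 1-byte block */
    }
    return malloc(size);
}

static void demo_free(void *ctx, void *ptr)
{
    (void)ctx;      /* unused here, but kept for a uniform signature */
    free(ptr);
}

/* Stand-in for the public entry point: it forwards the size
   unchanged and leaves the size == 0 decision to the hook. */
static void *demo_public_malloc(const demo_allocators *a, size_t size)
{
    return a->malloc_fn(a->ctx, size);
}
```

In this sketch the hook can count (or special-case) zero-size requests only because the wrapper forwards the size untouched; if the wrapper forced size to 1, the hook could never tell the two cases apart.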

Does the new API look better? py_setallocators-4.patch is ready for a final review. If nobody complains, I'm going to commit it.
