This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author carljm
Recipients carljm, dino.viehland, itamaro
Date 2022-03-01.22:19:04
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1646173145.14.0.810360820171.issue46896@roundup.psfhosted.org>
In-reply-to
Content
CPython extensions providing optimized execution of Python bytecode (e.g. the Cinder JIT), or even CPython itself (e.g. the faster-cpython project) may wish to inline-cache access to frequently-read and rarely-changed namespaces, e.g. module globals. Rather than requiring a dict version guard on every cached read, the best-performing way to do this is is to mark the dictionary as “watched” and set a callback on writes to watched dictionaries. This optimizes the cached-read fast-path at a small cost to the (relatively infrequent and usually less perf sensitive) write path.

We have an implementation of this in Cinder ( https://docs.google.com/document/d/1l8I-FDE1xrIShm9eSNJqsGmY_VanMDX5-aK_gujhYBI/edit#heading=h.n2fcxgq6ypwl ), used already by the Cinder JIT and its specializing interpreter. We would like to make the Cinder JIT available as a third-party extension to CPython ( https://docs.google.com/document/d/1l8I-FDE1xrIShm9eSNJqsGmY_VanMDX5-aK_gujhYBI/ ), and so we are interested in adding dict watchers to core CPython.

The intention in this issue is not to add any specific optimization or cache (yet); just the ability to mark a dictionary as “watched” and set a write callback.

The callback will be global, not per-dictionary (no extra function pointer stored in every dict). CPython will track only one global callback; it is a well-behaved client’s responsibility to check if a callback is already set when setting a new one, and daisy-chain to the previous callback if so. Given that multiple clients may mark dictionaries as watched, a dict watcher callback may receive events for dictionaries that were marked as watched by other clients, and should handle this gracefully.

There is no provision in the API for “un-watching” a watched dictionary; such an API could not be used safely in the face of potentially multiple dict-watching clients.

The Cinder implementation marks dictionaries as watched using the least bit of the dictionary version (so version increments by 2); this also avoids any additional memory usage for marking a dict as watched.

Initial proposed API, comments welcome:

// Mark given dictionary as "watched" (global callback will be called if it is modified)
void PyDict_Watch(PyObject* dict);

// Check if given dictionary is already watched
int PyDict_IsWatched(PyObject* dict);

typedef enum {
  PYDICT_EVENT_CLEARED,
  PYDICT_EVENT_DEALLOCED,
  PYDICT_EVENT_MODIFIED
} PyDict_WatchEvent;

// Callback to be invoked when a watched dict is cleared, dealloced, or modified.
// In clear/dealloc case, key and new_value will be NULL. Otherwise, new_value will be the
// new value for key, NULL if key is being deleted.
typedef void(*PyDict_WatchCallback)(PyDict_WatchEvent event, PyObject* dict, PyObject* key, PyObject* new_value);

// Set new global watch callback; supply NULL to clear callback
void PyDict_SetWatchCallback(PyDict_WatchCallback callback);

// Get existing global watch callback
PyDict_WatchCallback PyDict_GetWatchCallback();

The callback will be called immediately before the modification to the dict takes effect, thus the callback will also have access to the prior state of the dict.
History
Date User Action Args
2022-03-01 22:19:05carljmsetrecipients: + carljm, dino.viehland, itamaro
2022-03-01 22:19:05carljmsetmessageid: <1646173145.14.0.810360820171.issue46896@roundup.psfhosted.org>
2022-03-01 22:19:05carljmlinkissue46896 messages
2022-03-01 22:19:04carljmcreate