This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author nnorwitz
Recipients arkanes, christian.heimes, ggenellina, gvanrossum, lpd, nnorwitz, ntoronto
Date 2008-02-25.05:39:40
SpamBayes Score 4.6279005e-05
Marked as misclassified No
Message-id <1203917999.42.0.652889462532.issue1518@psf.upfronthosting.co.za>
In-reply-to
Content
I've gone over this at a high-level and looked for various outstanding
issues.  Below are my comments.  I didn't delve into this in detail. 
There are a lot of questions too.

I think this is very interesting and hope that we can get it working,
100% robust, and verify the improved perf.

In Include/fastglobals.h, num_entries says it is len(names).  But
there is also an entries field.  This is confusing.  The name should
be num_names if it is the length of names.

isbuiltin is an array of integers that hold boolean values?  So most
of the memory is wasted?  How many entries do you expect in isbuiltin?
This field should probably be packed using each bit.  Its type should then
be unsigned.  If there were a small number of PyFastGlobalsObjects it
wouldn't be a big deal.  But from the comment it seems there will be
one of these objects for each function.

Did you measure the difference in memory size?  For example the size
at startup with and without the patch.  Did you measure the startup
time?  How big is python at the end of a regression test run with and
without this patch?

Is there ever an access to the old globals in the old code?  I wonder
if it's worthwhile to change the names in ceval.c, etc.

Does PyFastGlobals_GET_ITEM_INPLACE really need to be a macro?  It's a
pain to debug macros.  If it were a static function in the header
file, it would almost definitely get inlined.  
PyFastGlobals_ENTRY(op, index) should be captured in a local variable.

I'm thinking that entryversion should be unsigned.  Signed integer
overflow is not defined (we had some problems recently with this).
PyFastGlobals_VERSION_SET(op) can overflow.

I like the removal of one pointer from the frame object.

How does this interact with multiple interpreters?  I see the static
of __builtins__ in fastglobal.c.  See PyInterpreterState in
Include/pystate.h.  (I see that __builtins__ is a string, so this
access should be fine, but the question is still valid.  I just want
to make sure there aren't any bad interactions.)

Regarding:
/* XXX A clever adversary could prevent this from terminating */

How about the loop is limited to 10 runs or something reasonable?
That way there cannot be a DoS attack.  You can raise a RuntimeError
if the max iterations is reached.

Since PyFastGlobals_Update() is public, you should not assert that the
op is a fast globals, but rather check and set
PyErr_BadInternalCall().  That's true of all public APIs.  Internal
APIs are fine to have asserts.

In fg_check_builtins() you may be able to speed this up by checking
the most common paths first.  I know this code is taken from
frameobject.  See
http://coverage.livinglogic.de/Objects/frameobject.c.html .  Hmmm,
that isn't showing any data currently.  I'll ping Walter and ask him
if he can get that working again.

I don't understand this comment in PyFastGlobals_CheckBuiltins:
  /* Make sure fg->builtins_entry is updated first */

It says what the code does, but I don't understand *why* it does it.

Instead of:
		PyErr_NoMemory();
		Py_DECREF(newnames);
		return NULL;

You can do:
		Py_DECREF(newnames);
		return PyErr_NoMemory();

In PyFastGlobals_New(), you should verify that num_entries doesn't
overflow before creating a new tuple by increasing the size by 1.

In this code:
	isbuiltin = PyMem_NEW(int, num_entries);
	if (isbuiltin == NULL) {
		PyErr_NoMemory();
		Py_DECREF(newnames);
		PyMem_FREE(isbuiltin);
		return NULL;
	}

The free of isbuiltin should be entries.

If these arrays will be small (< 256 bytes), it would be better (faster)
to use
pymalloc rather than the PyMem interface.

PyDict_GetEntry should be prefixed with an underscore.  I don't think
this should become part of the public API.  How about passing an int
pointer to GetEntry that will be set if the key is found or clear if not
found?

This needs to be addressed:
+	/* This is a bad idea now: if gc breaks a cycle by clearing
+	f_fastglobals, when a generator is finally collected it does this
+	sequence of calls: gen_del->gen_close->gen_send_ex->
+	PyEval_EvalFrameEx. But PyEval_EvalFrameEx references
+	f_fastglobals->globals, which will be cleared by then, resuling
+	in a segfault.
+	??? Is there a way to preserve this traversal? */
+	/* Py_VISIT(f->f_fastglobals); */
History
Date User Action Args
2008-02-25 05:39:59nnorwitzsetspambayes_score: 4.6279e-05 -> 4.6279005e-05
recipients: + nnorwitz, gvanrossum, lpd, ggenellina, christian.heimes, arkanes, ntoronto
2008-02-25 05:39:59nnorwitzsetspambayes_score: 4.6279e-05 -> 4.6279e-05
messageid: <1203917999.42.0.652889462532.issue1518@psf.upfronthosting.co.za>
2008-02-25 05:39:42nnorwitzlinkissue1518 messages
2008-02-25 05:39:40nnorwitzcreate