Issue593627
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2002-08-11 10:41 by orenti, last changed 2022-04-10 16:05 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
staticnames.patch | orenti, 2002-08-11 10:41 | |||
staticnames-2.patch | gvanrossum, 2002-08-15 15:58 | Forward patch for CVS of 8/15 |
Messages (11) | |||
---|---|---|---|
msg40899 - (view) | Author: Oren Tirosh (orenti) | Date: 2002-08-11 10:41 | |
This patch creates static string objects for all built-in names and interns then on initialization. The macro PyNAME is be used to access static names. PyNAME(__spam__) is equivalent to PyString_InternFromString("__spam__") but is a constant expression. It requires the name to be one of the built-in names. A linker error will be generated if it isn't. Most conversions of C strings into temporary string objects can be eliminated (PyString_FromString, PyString_InternFromString). Most string comparisons at runtime can also be eliminated. Instead of : if (strcmp(PyString_AsString(name), "__spam__")) ... This code can be used: PyString_INTERN(name) if (name == PyNAME(__spam__)) ... Where PyString_INTERN is a fast inline check if the string is already interned (and it usually is). To prevent unbounded accumulation of interned strings the mortal interned string patch should also be applied. The patch converts most of the builtin module to this new mode as an example. |
|||
msg40900 - (view) | Author: Neal Norwitz (nnorwitz) * ![]() |
Date: 2002-08-11 14:46 | |
Logged In: YES user_id=33168 Couple of initial comments: * this is a reverse patch * it seems like there are other changes in here - int ob_shash -> long - releasing interned strings? * dictobject.c is removed? * including python headers should use "" not <> Oren, could you generate a new patch with only the changes to support PyNAME? Thanks! |
|||
msg40901 - (view) | Author: Oren Tirosh (orenti) | Date: 2002-08-11 15:44 | |
Logged In: YES user_id=562624 Ok, I'll fix the problems with the patch. What's the best way to produce a patch that adds new files? Static string objects cannot be released, of course. This patch will eventually depend on on the mortal interned strings patch to fix it but in the meantime I just disabled releasing interned strings because I want to keep the two patches independent. The next version will add a new PyArg_ParseTuple format for an interned string to make it easier to use. |
|||
msg40902 - (view) | Author: Neal Norwitz (nnorwitz) * ![]() |
Date: 2002-08-11 15:54 | |
Logged In: YES user_id=33168 I'd like to see the releasing interned string patch applied. I think it's almost ready, isn't it? It would make patching easier and seems to be a good idea. For me, the easiest way to produce patches is to use cvs. You can keep multiple cvs trees around easy enough (for having multiple overlapping/independant patches). To create patches with cvs: cvs diff -C 5 [file1] [file2] .... |
|||
msg40903 - (view) | Author: Martin v. Löwis (loewis) * ![]() |
Date: 2002-08-12 07:43 | |
Logged In: YES user_id=21627 What is the rationale for this patch? If it is for performance, what real speed improvements can you report? |
|||
msg40904 - (view) | Author: Oren Tirosh (orenti) | Date: 2002-08-12 12:10 | |
Logged In: YES user_id=562624 If these changes are applied throughout the interpreter I expect a significant speedup but that is not my immediate goal. I am looking for redability, reduction of code size (both source and binary) and reliability (less things to check or forget to check). I am trying to get rid of code like if (docstr == NULL) { docstr= PyString_InternFromString("__doc__"); if (docstr == NULL) return NULL; And replace it with just PyNAME(__doc__) |
|||
msg40905 - (view) | Author: Martin v. Löwis (loewis) * ![]() |
Date: 2002-08-12 12:48 | |
Logged In: YES user_id=21627 It is difficult to see how this patch achieves the goal of readability: - many variables are not used at all, e.g. ArithmethicError; it is not clear how using them would improve readability. - the change from SETBUILTIN("None", Py_None); to SETBUILTIN(PyNAME(None), Py_None); makes it more difficult to read, not easier. Furthermore, the name "None" isn't used anywhere except this initialisation. - likewise, the changes from {"abs", builtin_abs, METH_O, abs_doc}, to {PyNAMEC(abs), builtin_abs, METH_O, abs_doc}, make the code harder to read. |
|||
msg40906 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2002-08-15 15:22 | |
Logged In: YES user_id=6380 Just for kicks I produced a forward diff, also adding the necessary changes to Makefile.pre.in that were mysteriously missing from the original, and fixing this for the very latest CVS (up to and including Michael Hudson's set_lineno patch). |
|||
msg40907 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2002-08-15 15:58 | |
Logged In: YES user_id=6380 Strangely, I measured a code size *increase*. Strangely, because most object files didn't increase in text size, but the resulting binary did, adding about 20K text and 17K data. The only object file that changed sizes at all (according to "size */*.o" on Linux) was Python/bltinmodule.o, which grew less than 500 bytes in text size. The only new object file, Python/staticnames.o, has 48 bytes text and 17K bytes data. Maybe the added text size could be because of more cross-file references added by the linker??? (Files that referenced a local static char string constant now reference a static object in Python/staticnames.o.) I do see about a 1% speed increase for pystone. But I agree with Martin's comments on the "readability" issue. There's also a localization property that's lost: whenever a new name is added, you must update staticnames.h, staticnames.c, *and* the file where it is used. That's not nice (and not just because it forces a recompilation of the world because a header file was touched thst everybody includes). *If* this were ever accepted, the mechanism to (re)generate staticnames.h automatically should be checked in as well. In general, I've found that string literals hidden inside macros using stringification (#) are a detriment to code maintainability -- I've often had the situation where I *knew* there had to be a string literal for some name somewhere, but I couldn't find it because of this. Same for name concatenation (##); it often means that you know there's a function name somewhere but a grep through the sources won't find it. Very painful when tracking down problems. I deployed a bunch of tricks like this in early versions of typeobject.c, and ended up expanding almost all of them: a little more typing perhaps, but explicit is better than implicit, and a search for slot_nb_add will at least find the macro that defines it; ditto a search for "__add__" (with the quotes) will find where it is used. I guess that's about a -0.5 from me. Unless someone else steps up to champion this soon, it's dead. :-) PS. I used cvs add for new files and then cvs diff -c -N to create diffs that include new files; cvs produces output looking like a diff against /dev/null, and patch understands those (see the fixed patch I uploaded). But maybe if you only have anonymous CVS the cvs add won't work, and cvs diff won't give you diffs for files it knows nothing about. IOW YMMV. :-) |
|||
msg40908 - (view) | Author: Oren Tirosh (orenti) | Date: 2002-08-15 17:45 | |
Logged In: YES user_id=562624 The code size increase is not surprising - all names appear twice in the executable: once as C strings and again as static PyStringObjects. This duplication can be eliminated. I'm surprised that there is *any* speed increase because I barely changed any code to make use this. This is very encouraging. The localization and forced recomplication issues you raise are not really relevant because this MUST NOT be used for anything but builtin names and builtins are not added so frequently. Even standard modules should not declare static names. The interning of static strings must be done before the interpreter is initialized to ensure that the static name is the interned name. If you intern a static name after the same name has already been interned elsewhere the static object will not be the one true interned version and static references to it will be incorrect. Actually, the macro PyNAME is not required any more and the actual symbol name can be used. I used the macro to do typecasting but it's no longer necessary because I found a way to make the static names real PyObjects (probably the only place where something is actually defined as a PyObject!) |
|||
msg40909 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2002-08-15 18:09 | |
Logged In: YES user_id=6380 > I'm surprised that there is *any* speed increase > because I barely changed any code to make use > this. This is very encouraging. Don't get too excited. Speedups and slowdowns in the order of 1% are usually random cache effects having to do with common portions of the VM main loop having a cache line conflict; I've seen a case where adding an *unreachable* printf() call predictably changed the pystone speed by 1%. > The localization and forced recomplication > issues you raise are not really relevant because > this MUST NOT be used for anything but builtin > names and builtins are not added so > frequently. Even standard modules should not > declare static names. Then why do I see all signal names in your list? And all exception names? > Actually, the macro PyNAME is not required any > more and the actual symbol name can be used. I > used the macro to do typecasting but it's no > longer necessary because I found a way to make > the static names real PyObjects (probably the > only place where something is actually defined > as a PyObject!) But the string is still more helpful in the code than the symbol name. Sorry, but none of this changes my position; you'll hve to find another champion. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:05:34 | admin | set | github: 37014 |
2002-08-11 10:41:22 | orenti | create |