Message201573
Let's try to identify some use cases in the Python test suite using gdb:
(gdb) b unicode_compare_eq
(gdb) condition 1 ((PyASCIIObject*)str1)->hash != -1 && ((PyASCIIObject*)str2)->hash != -1 && ((PyASCIIObject*)str1)->hash != ((PyASCIIObject*)str2)->hash
(gdb) run
I didn't dig into why the hashes of these strings are already computed. Tell me if you need more examples.
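To make the breakpoint condition easier to read, here is a minimal Python sketch of what it tests. `CachedHashStr` and `hashes_prove_unequal` are hypothetical names used only for illustration; they model the cached `hash` field of the C struct (with -1 meaning "not computed yet") and are not part of CPython.

```python
class CachedHashStr:
    """Toy stand-in for a str object with a cached hash field."""

    def __init__(self, value):
        self.value = value
        self.hash = -1  # -1 models "hash not computed yet"

    def compute_hash(self):
        # hash() never returns -1 in CPython, so -1 stays free as a sentinel.
        if self.hash == -1:
            self.hash = hash(self.value)
        return self.hash


def hashes_prove_unequal(s1, s2):
    """Same test as the gdb condition: both hashes cached, and different."""
    return s1.hash != -1 and s2.hash != -1 and s1.hash != s2.hash


a = CachedHashStr("nt")
b = CachedHashStr("nt")
a.compute_hash()
b.compute_hash()
equal_strings_flagged = hashes_prove_unequal(a, b)  # equal strings share a hash
```

Equal strings always have equal hashes, so the condition can only fire for strings that really differ. It cannot prove equality: two different strings may still collide.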
Random examples:
(1) compare "constant" strings (strings from co_consts of code objects)
importlib._bootstrap: _setup():
    os_details = ('posix', ['/']), ('nt', ['\\', '/'])
    for builtin_os, path_separators in os_details:
        ...
    ...
    if builtin_os == 'nt': <== HERE
        ...
(2) importlib._bootstrap: _LoaderBasics.is_package()
    def is_package(self, fullname):
        filename = _path_split(self.get_filename(fullname))[1]
        filename_base = filename.rsplit('.', 1)[0]
        tail_name = fullname.rpartition('.')[2]
        return filename_base == '__init__' and ... <== HERE
It's surprising that filename_base has its hash computed. I suppose that all these functions (_path_split, .rsplit, .rpartition) return the string unmodified.
(3) importlib._bootstrap: PathFinder._path_importer_cache():
    @classmethod
    def _path_importer_cache(cls, path):
        ...
        if path == '': <== HERE
path is an entry of sys.path.
(4) str in __all__ (list of str):
os.py:
    if "putenv" not in __all__:
        __all__.append("putenv")
__all__ is a list of strings.
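The membership test on a list is a linear scan that compares the needle with each element (identity first, then ==), which is why it reaches the string comparison once per non-identical element. A small sketch that counts those comparisons follows; `CountingStr` and the contents of `all_names` are made up for the demonstration.

```python
class CountingStr(str):
    """str subclass that counts how often == is evaluated against it."""

    calls = 0

    def __eq__(self, other):
        CountingStr.calls += 1
        return str.__eq__(self, other)

    # Overriding __eq__ clears __hash__, so restore the str one.
    __hash__ = str.__hash__


all_names = ["getenv", "environ", "fsencode"]  # hypothetical contents
needle = CountingStr("putenv")

found = needle in all_names       # scans the whole list, no match
comparisons = CountingStr.calls   # one == per element that was scanned
```

With no match in the list, every element is compared once, so the number of comparisons grows with the length of `__all__`.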
(5) site.py:
if __name__ == '__main__': <== HERE
__name__ is 'site'.
(6) Python/ceval.c: PyEval_EvalCodeEx() called with arbitrary keywords
    for (i = 0; i < kwcount; i++) {
        PyObject **co_varnames;
        PyObject *keyword = kws[2*i];
        PyObject *value = kws[2*i + 1];
        int j;
        ...
        /* Speed hack: do raw pointer compares. As names are
           normally interned this should almost always hit. */
        co_varnames = ((PyTupleObject *)(co->co_varnames))->ob_item;
        for (j = 0; j < total_args; j++) {
            PyObject *nm = co_varnames[j];
            if (nm == keyword)
                goto kw_found;
        }
        /* Slow fallback, just in case */
        for (j = 0; j < total_args; j++) {
            PyObject *nm = co_varnames[j];
            int cmp = PyObject_RichCompareBool( <== HERE
                keyword, nm, Py_EQ);
            if (cmp > 0)
                goto kw_found;
            else if (cmp < 0)
                goto fail;
        }
It looks like the "just in case" path is taken.
(gdb) where
#0 unicode_compare_eq (str1='isTest', str2='func') at Objects/unicodeobject.c:10532
#1 0x000000000052dd41 in PyUnicode_RichCompare (left='isTest', right='func', op=2) at Objects/unicodeobject.c:10609
#2 0x00000000004be4db in do_richcompare (v='isTest', w='func', op=2) at Objects/object.c:647
#3 0x00000000004be790 in PyObject_RichCompare (v='isTest', w='func', op=2) at Objects/object.c:696
#4 0x00000000004be832 in PyObject_RichCompareBool (v='isTest', w='func', op=2) at Objects/object.c:718
#5 0x00000000005a0f68 in PyEval_EvalCodeEx (...) at Python/ceval.c:3450
...
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/test/test_xml_etree.py", line 1669, in test_get_keyword_args
e1 = ET.Element('foo' , x=1, y=2, z=3)
ElementTree.Element() accepts arbitrary keywords.
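The two loops in the C excerpt can be modelled in Python as follows. This is only an illustrative sketch: `find_keyword` is a hypothetical helper, not a CPython function, and it returns which pass found the name so the two paths can be told apart.

```python
def find_keyword(co_varnames, keyword):
    """Return (index, path) for keyword, or None if no parameter matches."""
    # Fast pass: pointer comparison, hits when both names are the same object.
    for j, name in enumerate(co_varnames):
        if name is keyword:
            return j, "identity"
    # Slow fallback: full string comparison against every parameter name.
    for j, name in enumerate(co_varnames):
        if name == keyword:
            return j, "equality"
    return None


varnames = ("tag", "attrib")
hit = find_keyword(varnames, varnames[1])   # same object: fast pass
miss = find_keyword(varnames, "x")          # unknown keyword: both passes fail
```

When a function accepts arbitrary keywords, an unknown keyword such as `x` misses the fast pass and is then compared by value against every parameter name, so all of those comparisons are between different strings.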
(7) letter==letter singletons:
xml.etree.ElementPath: iterfind()
    def iterfind(elem, path, namespaces=None):
        ...
        if path[-1:] == "/": <== HERE
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/xml/etree/ElementPath.py", line 254, in iterfind
if path[-1:] == "/":
path is ".//grandchild", so path[-1:] is 'd', which is a singleton: Python has already computed the hash of 'd'.
Similar example in the same file:
    def xpath_tokenizer(pattern, namespaces=None):
        for token in xpath_tokenizer_re.findall(pattern):
            tag = token[1]
            if tag and tag[0] != "{" and ":" in tag: <== HERE
                ...
tag[0] != "{" <= tag is 'grandchild', tag[0] is a singleton.
Another example:
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/sre_parse.py", line 194, in __next
if char == "\\":
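The singleton behaviour behind these cases can be observed with a short sketch. It assumes CPython, where one-character Latin-1 strings are shared objects; that is an implementation detail, so the identity result is only recorded here and the code does not rely on it.

```python
path = ".//grandchild"
tag = "grandchild"

last = path[-1:]      # 'd', obtained by slicing
also_d = tag[-1]      # 'd', obtained by indexing a different string

# On CPython both usually refer to the same cached one-character object,
# so a hash computed through one reference is visible through the other.
same_object = last is also_d

# The comparison from iterfind(): 'd' against "/" is a one-character mismatch.
ends_with_slash = last == "/"
```

Because the object is shared, any earlier use of 'd' as a dict key or set member leaves its hash cached for every later comparison.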
(8) str not in (list of str), test_descr.py: test_dir():
File "/home/haypo/prog/python/default/Lib/test/test_descr.py", line 2255, in <listcomp>
names = [x for x in dir(minstance) if x not in default_attributes]
    minstance = M("m")
    minstance.b = 2
    minstance.a = 1
    default_attributes = ['__name__', '__doc__', '__package__',
                          '__loader__']
    names = [x for x in dir(minstance) if x not in default_attributes]

Date | User | Action | Args
2013-10-28 20:18:37 | vstinner | set | recipients: + vstinner, rhettinger, gregory.p.smith, pitrou, christian.heimes, djc, ezio.melotti, meador.inge, serhiy.storchaka
2013-10-28 20:18:37 | vstinner | set | messageid: <1382991517.7.0.585939755161.issue16286@psf.upfronthosting.co.za>
2013-10-28 20:18:37 | vstinner | link | issue16286 messages
2013-10-28 20:18:36 | vstinner | create |