Title: code_hash() can be the same for different code objects
Type: behavior
Stage:
Components: Interpreter Core
Versions: Python 2.4, Python 2.3, Python 2.2.3, Python 2.2.2, Python 2.5, Python 2.2.1, Python 2.2, Python 2.1.2, Python 2.1.1
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: _doublep, belopolsky, jyasskin, sdeibel
Priority: normal
Keywords:

Created on 2008-02-27 21:47 by sdeibel, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
code_hash_bug.tgz, uploaded by sdeibel on 2008-02-27 21:47: Run to see the bug happen
Messages (5)
msg63083 - Author: Stephan R.A. Deibel (sdeibel) Date: 2008-02-27 21:47
The algorithm in code_hash() in codeobject.c can return the same hash
value for different code objects.  

Presumably distinct code objects should be very unlikely to have the
same hash value.  This bug affected our debugger before we worked around
it, and it could affect other things like profilers.

Adding co_filename to the hash would be one way to fix this, but I'm
not sure whether it was purposely left out of this code.
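
A minimal sketch of the collision described above, assuming CPython and two code objects built with compile() from identical source under different file names (the names "a.py" and "b.py" are made up for illustration; the attached tarball is not reproduced here):

```python
# Two code objects compiled from identical source but different file names.
code_a = compile("x + 1", "a.py", "eval")
code_b = compile("x + 1", "b.py", "eval")

# They are distinct objects that record different origins...
distinct = code_a is not code_b
different_files = code_a.co_filename != code_b.co_filename

# ...yet their hashes agree, because co_filename does not feed the hash.
same_hash = hash(code_a) == hash(code_b)

print(distinct, different_files, same_hash)
```

Anything that keys a table directly on code objects would see these two as candidates for the same slot.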
msg63084 - Author: Stephan R.A. Deibel (sdeibel) Date: 2008-02-27 21:51
I should have noted that adding co_firstlineno to the hash as well would
be necessary to distinguish otherwise identical code objects within the
same file.  The attached example has them on the same lines in different
files, but changing the first line of the defs makes no difference.
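
One possible workaround for a tool such as a debugger, sketched under the assumption that a source location is enough to tell code objects apart. The helper name location_key and the file names are hypothetical and not taken from the report:

```python
def location_key(code):
    """Hypothetical helper: identify a code object by where it was defined."""
    return (code.co_filename, code.co_firstlineno, code.co_name)


source = "def f():\n    return 1\n"
module_a = compile(source, "a.py", "exec")
module_b = compile(source, "b.py", "exec")

# Key per-code-object records on the location tuple, not on the code object.
records = {
    location_key(module_a): "record for a.py",
    location_key(module_b): "record for b.py",
}

print(len(records))
```

This key is not guaranteed unique either: two lambdas defined on the same line of the same file would share it, so a real tool might need a finer-grained key.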
msg63230 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2008-03-03 20:55
I would say filename/lineno are excluded from hash on purpose because
they are ignored in comparisons:

>>> compile("0", "a", "eval") == compile("0", "b", "eval")
True

Include/code.h has the following comment:

    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;      /* string (where it was loaded from) */
    PyObject *co_name;          /* string (name, for reference) */
    int co_firstlineno;         /* first source line number */

Can you describe your specific problem in more detail?  Why does your
debugger need to hash/compare code objects?
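
To illustrate why a tool that hashes or compares code objects could be affected, here is a small sketch assuming CPython. It contrasts a set keyed on the code objects themselves with one keyed on their identities; whether the reporter's debugger worked this way is not stated in the thread.

```python
code_a = compile("0", "a", "eval")
code_b = compile("0", "b", "eval")

# Equality ignores co_filename, so a value-based set keeps only one of them.
by_value = {code_a, code_b}

# id() is unique among live objects, so an identity-based set keeps both.
by_identity = {id(code_a), id(code_b)}

print(len(by_value), len(by_identity))
```

Keying on id() is only safe while the code objects stay alive, since an id can be reused after an object is collected.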
msg63235 - Author: Paul Pogonyshev (_doublep) Date: 2008-03-03 21:34
Hashes being equal for different objects cannot be a bug.  At most an
enhancement request...
msg63865 - Author: Jeffrey Yasskin (jyasskin) (Python committer) Date: 2008-03-18 03:15
Given Alexander's comment, and the fact that x==y must imply
hash(x)==hash(y) but the reverse need not be true, this seems like
intentional behavior.
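
The hash contract mentioned here can be sketched with a small, purely illustrative class (Point is not from the thread). Its hash deliberately ignores one field, so unequal objects collide, and dictionaries still behave correctly because they fall back on equality:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Ignores y: equal points still hash equally, which is all that is
        # required, but unequal points may collide.
        return hash(self.x)


p = Point(1, 2)
q = Point(1, 3)

# Same hash, different values: a collision, not a contract violation.
collide = hash(p) == hash(q) and p != q

# The dict resolves the collision with ==, so both keys survive.
table = {p: "p", q: "q"}

print(collide, len(table), table[Point(1, 3)])
```

Collisions of this kind cost lookup speed but not correctness, which matches the "at most an enhancement request" reading given earlier in the thread.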
