Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make hash values the same width as a pointer (or Py_ssize_t) #53987

Closed
pitrou opened this issue Sep 4, 2010 · 52 comments
Closed

Make hash values the same width as a pointer (or Py_ssize_t) #53987

pitrou opened this issue Sep 4, 2010 · 52 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@pitrou
Copy link
Member

pitrou commented Sep 4, 2010

BPO 9778
Nosy @malemburg, @tim-one, @loewis, @smontanaro, @birkenfeld, @rhettinger, @mdickinson, @abalkin, @pitrou, @benjaminp, @skrah
Files
  • hash_t.diff
  • hash_t.diff
  • deunsignify.diff
  • py_ssize_t_hash_patch.diff
  • Py_uhash_t_patch.diff
  • hashw64.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-10-23.19:09:16.110>
    created_at = <Date 2010-09-04.22:11:29.637>
    labels = ['interpreter-core', 'type-feature']
    title = 'Make hash values the same width as a pointer (or Py_ssize_t)'
    updated_at = <Date 2010-10-23.19:09:16.109>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2010-10-23.19:09:16.109>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-10-23.19:09:16.110>
    closer = 'loewis'
    components = ['Interpreter Core']
    creation = <Date 2010-09-04.22:11:29.637>
    creator = 'pitrou'
    dependencies = []
    files = ['19257', '19258', '19260', '19317', '19338', '19343']
    hgrepos = []
    issue_num = 9778
    keywords = ['patch']
    message_count = 52.0
    messages = ['115617', '115618', '115619', '115626', '115639', '115656', '115657', '115695', '118789', '118820', '118823', '118953', '118954', '118955', '118959', '118968', '118972', '118980', '118982', '118983', '118987', '118994', '118995', '119000', '119001', '119011', '119015', '119017', '119019', '119020', '119025', '119029', '119030', '119034', '119043', '119044', '119058', '119077', '119079', '119080', '119124', '119265', '119411', '119415', '119442', '119443', '119446', '119453', '119456', '119464', '119465', '119468']
    nosy_count = 14.0
    nosy_names = ['lemburg', 'tim.peters', 'loewis', 'skip.montanaro', 'georg.brandl', 'rhettinger', 'jimjjewett', 'mark.dickinson', 'belopolsky', 'pitrou', 'casevh', 'ked-tao', 'benjamin.peterson', 'skrah']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue9778'
    versions = ['Python 3.2', 'Python 3.3']

    @pitrou
    Copy link
    Member Author

    pitrou commented Sep 4, 2010

    Currently, Python produces hash values with fit in a C "long". This is fine at first sight, but in the context of dict and set implementations, it means that

    1. holding hashes and indices in the same field of a structure requires some care (see bpo-1646068)
    2. on platforms where a long is smaller than a Py_ssize_t (e.g. Win64), very big hash tables could suffer from lots of artificial collisions (the hash table being bigger than the range of possible hash values)
    3. when a long is smaller than Py_ssize_t, we don't save any size anyway, since having some pointers follow a C "long" in a structure implies some padding to keep all fields naturally aligned

    A future-proof option would be to change all hash values to be of Py_ssize_t values rather than C longs. Either directly, or by defining a new dedicated alias Py_hash_t. This would also impact the ABI, I suppose.

    @pitrou pitrou added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Sep 4, 2010
    @pitrou
    Copy link
    Member Author

    pitrou commented Sep 4, 2010

    As an example of padding behaviour (under Win64 with 32-bit longs and 64-bit pointers):

    Python 3.2a1+ (py3k, Sep 4 2010, 22:50:10) [MSC v.1500 64 bit (AMD64)] on win32

    Type "help", "copyright", "credits" or "license" for more information.
    >>> import struct
    >>> struct.calcsize("l")
    4
    >>> struct.calcsize("lP")
    16
    >>> struct.calcsize("lPP")
    24

    @rhettinger
    Copy link
    Contributor

    +1 for PyObject_Hash returning a Py_ssize_t.

    Nothing good can come from having a hash table larger than the range of a hash value. Tests may pass but performance would degrade catastrophically.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 5, 2010

    This would also impact the ABI, I suppose.

    Correct. So it either needs to happen before 3.2, or wait until 4.0,
    or the introduction of "wide" hashes needs to be done in a compatible
    manner, likely requiring two parallel hashing infrastructures.

    @loewis loewis mannequin changed the title Make hash values the same width as a pointer (or Py_ssize_t) Make hash values the same width as a pointer (or Py_ssize_t) Sep 5, 2010
    @pitrou
    Copy link
    Member Author

    pitrou commented Sep 5, 2010

    Correct. So it either needs to happen before 3.2, or wait until 4.0,

    Shouldn't there be a provision for ABI versioning?
    Or do you suggest bumping to the next major number (4.0, 5.0...) be done on the basis of ABI changes?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 5, 2010

    Am 05.09.2010 13:12, schrieb Antoine Pitrou:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > Correct. So it either needs to happen before 3.2, or wait until 4.0,

    Shouldn't there be a provision for ABI versioning?

    There is certainly support for ABI versioning; the ABI version defined
    in the PEP is called "python3.dll".

    Or do you suggest bumping to the next major
    number (4.0, 5.0...) be done on the basis of ABI changes?

    No, vice versa. The PEP promises that the ABI won't change until Python
    4. For any change that might break the ABI, either a
    backwards-compatible solution needs to be found, or the change be
    deferred to Python 4.

    @loewis loewis mannequin changed the title Make hash values the same width as a pointer (or Py_ssize_t) Make hash values the same width as a pointer (or Py_ssize_t) Sep 5, 2010
    @birkenfeld
    Copy link
    Member

    I would suggest making this change before the ABI freeze starts, then.

    @mdickinson
    Copy link
    Member

    ... change all hash values to be of Py_ssize_t values rather than C longs ...

    Yes, please! (Provided this change can go in before 3.2.) For the numeric types at least, this should be a straightforward adjustment.

    And while we're at it, we could clean up some of the undefined behaviour from integer overflow that's in the current hash functions (tuple.__hash__, for example).

    @birkenfeld
    Copy link
    Member

    Ping?

    @smontanaro
    Copy link
    Contributor

    Not that anybody needs my input on this, but...

    Given the range of people advocating for this change, this looks to me
    like it should be a release blocker for 3.2. Raymond's comment about
    performance seems especially important, and since the world seems to
    be moving toward 64-bit operating systems (certainly should happen in
    a big way during the lifetime of Python 3) it seems worthwhile to hold
    up further 3.2 releases until this is solved.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 15, 2010

    Given the range of people advocating for this change, this looks to me
    like it should be a release blocker for 3.2. Raymond's comment about
    performance seems especially important, and since the world seems to
    be moving toward 64-bit operating systems (certainly should happen in
    a big way during the lifetime of Python 3) it seems worthwhile to hold
    up further 3.2 releases until this is solved.

    I think this is a bit exagerated. The performance issues will only
    appear if you have huge dicts and sets.
    The issue Raymond raised is the potential impossibility of making the
    change /after/ we settle on a stable ABI. The question is whether the
    ABI will be enforced starting from 3.2, or from a later date.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 17, 2010

    The issue Raymond raised is the potential impossibility of making the
    change /after/ we settle on a stable ABI. The question is whether the
    ABI will be enforced starting from 3.2, or from a later date.

    I'd like to repeat that it will not be impossible to fix this after the
    ABI is in place.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 17, 2010

    Le dimanche 17 octobre 2010 à 17:40 +0000, Martin v. Löwis a écrit :

    Martin v. Löwis <martin@v.loewis.de> added the comment:

    > The issue Raymond raised is the potential impossibility of making the
    > change /after/ we settle on a stable ABI. The question is whether the
    > ABI will be enforced starting from 3.2, or from a later date.

    I'd like to repeat that it will not be impossible to fix this after the
    ABI is in place.

    At the cost of very annoying complications, though.

    @birkenfeld
    Copy link
    Member

    Can't we just do it now, and be done with it regardless of the stable ABI?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 17, 2010

    Can't we just do it now, and be done with it regardless of the stable ABI?

    Sure. Somebody needs to implement it (and consider what consequences
    this has on third-party modules - I'm uncertain).

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 17, 2010

    I'm the maintainer of a third-party library (gmpy) that would be impacted by this and I'm definately in favor of this change. With ever increasing amounts of memory becoming standard in computers, more users will encounter performance issues with large dictionaries. I see 3.2 as the version of Python 3.x that gains wide-spread use. Regardless of the timing of a stable ABI, it's popularity will make it more difficult change the hash size in the future.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 17, 2010

    I'm the maintainer of a third-party library (gmpy) that would be
    impacted by this and I'm definitely in favor of this change.

    Assume Python would make such a change, and users would build released
    gmpy versions for such a Python release. What would happen (in terms
    of observable errors)?

    @benjaminp
    Copy link
    Contributor

    Here's a patch. Please review.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 17, 2010

    A few comments:

    • in complex_hash, why did you drop the cast?
    • in object.rst, add versionchanged. It should also explain that Py_hash_t is a signed integer with the same width as size_t.

    Otherwise, it looks fine.

    @benjaminp
    Copy link
    Contributor

    Dropping the cast was inadvertent. I've now added versionchanged and committed it in r85664.

    @smontanaro
    Copy link
    Contributor

    I added a placeholder to the What's New document. Since it will affect extension module authors it should be mentioned at a higher level than just Misc/NEWS.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 18, 2010

    The patch does not address that "unsigned long" is still used to calculate the hash values. This breaks numeric hashing and leads to incorrect values.

    Python 3.2a3+ (py3k, Oct 17 2010, 19:03:38) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.hash_info
    sys.hash_info(width=64, modulus=-1, inf=314159, nan=0, imag=1000003)
    >>>

    Would "size_t" be the appropriate type to use?

    @benjaminp
    Copy link
    Contributor

    Good point. Here's another patch:

    @malemburg
    Copy link
    Member

    Wouldn't it be better to use Py_hash_t instead of size_t for calculating the hash values in those hash functions ?

    @malemburg
    Copy link
    Member

    And related to my previous comment: shouldn't Py_hash_t map to size_t instead of Py_ssize_t ?

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 18, 2010

    Some quick comments on the latest patch.

    1. I don't think you can remove the type cast used when comparing the hash value against -1 and -2. IIRC, GCC considers that undefined behavior.

    2. In sysmodule.c, we need to use PyLong_FromSsize_t when storing _PyHASH_MODULUS.

    3. In pyport.h, we need a better way to define _PyHASH_MODULUS. The existing "((1UL << _PyHASH_BITS) - 1)" fails when unsigned long is smaller than Py_ssize_t. "((1ULL << _PyHASH_BITS) - 1)" works on 64-bit Windows but is not a good solution for systems that don't have an unsigned long long type. I haven't thought of a better solution for one.

    @abalkin
    Copy link
    Member

    abalkin commented Oct 18, 2010

    On Mon, Oct 18, 2010 at 8:45 AM, Benjamin Peterson
    <report@bugs.python.org> wrote:
    ..

    No, negative values have to be allowed.

    Why? As far as I can tell, negative values are only used as sentinels
    and we can use say (size_t)-1 instead of -1L. Are there cases where
    hash values' ordering is important?

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 18, 2010

    Le lundi 18 octobre 2010 à 13:56 +0000, Alexander Belopolsky a écrit :

    Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:

    On Mon, Oct 18, 2010 at 8:45 AM, Benjamin Peterson
    <report@bugs.python.org> wrote:
    ..
    > No, negative values have to be allowed.
    >

    Why? As far as I can tell, negative values are only used as sentinels
    and we can use say (size_t)-1 instead of -1L.

    You can, except that changing the sentinel value will probably break a
    lot of code out there (and there's no benefit AFAICT).

    @abalkin
    Copy link
    Member

    abalkin commented Oct 18, 2010

    On Mon, Oct 18, 2010 at 9:59 AM, Antoine Pitrou <report@bugs.python.org> wrote:
    ..

    > Why?  As far as I can tell, negative values are only used as sentinels
    > and we can use say (size_t)-1 instead of -1L.

    You can, except that changing the sentinel value will probably break a
    lot of code out there (and there's no benefit AFAICT).

    AFAICT, a change from (Py_ssize_t)-1 to (size_t)-1 is less likely to
    break code than a change from -1L to (Py_ssize_t)-1. (Assuming a
    sizeof(long) != sizeof(void*) platform.) The benefit, though is that
    hash computations can be performed natively on the hash values without
    casting to an unrelated type. If Py_hash_t needs to be signed, I
    would like to see a typedef Py_uhash_t size_t next to that for
    Py_hash_t and Py_uhash_t used in hash computations rather than
    explicit size_t.

    @malemburg
    Copy link
    Member

    Benjamin Peterson wrote:

    Benjamin Peterson <benjamin@python.org> added the comment:
    > Marc-Andre Lemburg <mal@egenix.com> added the comment:
    >
    > And related to my previous comment: shouldn't Py_hash_t map to size_t instead of Py_ssize_t ?

    No, negative values have to be allowed.

    Ah, right... we use -1 as error indicator.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 18, 2010

    Also, note that hash(-12) is -12.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 18, 2010

    AFAICT, a change from (Py_ssize_t)-1 to (size_t)-1 is less likely to
    break code than a change from -1L to (Py_ssize_t)-1. (Assuming a
    sizeof(long) != sizeof(void*) platform.)

    That's true.

    The benefit, though is that
    hash computations can be performed natively on the hash values without
    casting to an unrelated type.

    I don't understand what you mean by "native" and "unrelated". Signed
    integers are not less native than unsigned ones.

    @abalkin
    Copy link
    Member

    abalkin commented Oct 18, 2010

    On Mon, Oct 18, 2010 at 11:27 AM, Antoine Pitrou <report@bugs.python.org> wrote:
    ..

    > The benefit, though is that
    > hash computations can be performed natively on the hash values without
    > casting to an unrelated type.

    I don't understand what you mean by "native" and "unrelated". Signed
    integers are not less native than unsigned ones.

    Sorry, I could have been clearer indeed. Consider the following code:

    static Py_hash_t
    long_hash(PyLongObject *v)
    {
        unsigned long x;
    ...
        x = x * sign;
        if (x == (unsigned long)-1)
            x = (unsigned long)-2;
        return (Py_hash_t)x;
    }

    Wouldn't it be cleaner if x was the same type as hash? Note that
    unsigned long is now wrong. What is needed is "unsigned integer type
    of the same size as Py_hash_t." If Py_hash_t has to stay signed, I
    think we should at least not rely of sizeof(Py_hash_t) to always
    remain the same as sizeof(size_t).

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 18, 2010

    Sorry, I could have been clearer indeed. Consider the following code:

    static Py_hash_t
    long_hash(PyLongObject *v)
    {
    unsigned long x;
    ...
    x = x * sign;
    if (x == (unsigned long)-1)
    x = (unsigned long)-2;
    return (Py_hash_t)x;
    }

    Wouldn't it be cleaner if x was the same type as hash? Note that
    unsigned long is now wrong. What is needed is "unsigned integer type
    of the same size as Py_hash_t." If Py_hash_t has to stay signed, I
    think we should at least not rely of sizeof(Py_hash_t) to always
    remain the same as sizeof(size_t).

    The calculation of long_hash assumes an unsigned temporary type to get correct results for the bit shifting and masking. The calculation is done on the absolute value of the long and then the sign is applied. We either needed to (1) add an unsigned Py_hash_t type or (2) just use size_t and Py_ssize_t.

    @abalkin
    Copy link
    Member

    abalkin commented Oct 18, 2010

    On Mon, Oct 18, 2010 at 1:25 PM, Case Van Horsen <report@bugs.python.org> wrote:
    ..

    We either needed to (1) add an unsigned Py_hash_t type or (2) just use size_t and Py_ssize_t.

    Option (2) may actually be preferable because dict and set
    implementations rely on Py_hash_t being compatible with Py_ssize_t.
    See bpo-1646068.

    @rhettinger
    Copy link
    Contributor

    My expectation is that Py_hash_t is the same as Py_ssize_t. The goal is to make hashes match the range of possible table sizes.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 18, 2010

    Am 18.10.2010 17:27, schrieb Antoine Pitrou:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > AFAICT, a change from (Py_ssize_t)-1 to (size_t)-1 is less likely to
    > break code than a change from -1L to (Py_ssize_t)-1. (Assuming a
    > sizeof(long) != sizeof(void*) platform.)

    That's true.

    I don't think it is. Code that is changed to use the new return type
    will not break in a change from long to Py_ssize_t, since all values
    will sign-expand correctly, in all cases; the same is true if the new
    return type was size_t. So for code that gets adjusted in its return
    value, no breakage in either case.

    For code that *doesn't* get adjusted, I'm still uncertain what will
    happen. However, ISTM that if there is a cast to the "correct" function
    pointer type, you get an incorrect detection of the guard value in
    either case: e.g. on AMD64, the return value is in RAX, yet the hash
    function may only fill out EAX. I'm not sure what values the upper
    32 bits will have, but I doubt it gets sign-expanded - which it should
    regardless of whether the expected value is (size_t)-1 or
    (Py_ssize_t)-1. So you get breakage in either case.

    Please correct me if I'm wrong.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 18, 2010

    Wouldn't it be cleaner if x was the same type as hash? Note that
    unsigned long is now wrong. What is needed is "unsigned integer type
    of the same size as Py_hash_t." If Py_hash_t has to stay signed, I
    think we should at least not rely of sizeof(Py_hash_t) to always
    remain the same as sizeof(size_t).

    But this is an absolute requirement, a guarantee that we provide
    forever, and the whole point of this patch. Py_hash_t *will* be
    a signed version of size_t, just as Py_ssize_t. Not by chance, but
    by careful, inherent design.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 18, 2010

    My expectation is that Py_hash_t is the same as Py_ssize_t. The goal
    is to make hashes match the range of possible table sizes.

    Please read the patch:

    typedef Py_ssize_t Py_hash_t;

    @mdickinson
    Copy link
    Member

    The calculation of long_hash assumes an unsigned temporary type to get
    correct results for the bit shifting and masking.

    Yes, exactly.

    The calculation is
    done on the absolute value of the long and then the sign is applied. We
    either needed to (1) add an unsigned Py_hash_t type or (2) just use
    size_t and Py_ssize_t.

    I like (2); the use of Py_hash_t suggests to me that the type used for the hash is configurable independently of Py_ssize_t, which isn't true.

    Also, with Py_hash_t it's no longer clear that printing a hash value (e.g., using PyErr_Format and friends) should use the '%zd' modifier.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 21, 2010

    I've attached a patch that fixes hashing for numerical types, sys.hash_info is now correct, fixes typeobject.c/wrap_hashfunc and tupleobject.c/tuplehash to use Py_ssize_t instead of long, and uses Py_ssize_t instead of Py_hash_t.

    I think it is clearer to use Py_ssize_t instead of Py_hash_t. I found two occurances where PyLong_FromLong needed to be replaced by PyLong_FromSsize_t and I think bugs like that would be easier to catch if Py_ssize_t is used.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 23, 2010

    I maintain gmpy and it needs to calculate hash values for integers, floats, and rationals. I converted my hash calculations to use Py_ssize_t in a 64-bit Windows enviroment. All my tests pass when I build Python with my previous patch.

    In hindsight, I think I made a mistake in my previous patch by eliminating Py_hash_t and using Py_ssize_t/size_t. I ended up defining Py_hash_t and Py_uhash_t in gmpy to simplify the code and to easily support older versions of Python.

    I will work on a patch that defines Py_hash_t and Py_uhash_t and upload it later this evening.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 23, 2010

    I've uploaded a patch against the current svn trunk that:

    1. Defines a Py_uhash_t as equivalent to size_t.
    2. Correctly defines _PyHASH_MODULUS on Win64.
    3. Replaces several PyLong_FromLong with PyLong_FromSsize_t.
    4. Change typeobject/wrap_hashfunc to use Py_hash_t instead of long.
    5. Change tupleobject/tuplehash to use Py_hast_t instead of long.
    6. Change long/double/complex hash functions to use Py_uhash_t instead of unsigned long.

    @benjaminp
    Copy link
    Contributor

    Thank you. Applied in r85803.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 23, 2010

    Case's patch fixes test_builtin and test_complex failures on Windows 7 64-bit. But there's still a failure in test_dictviews:

    ======================================================================
    FAIL: test_items_set_operations (test.test_dictviews.DictSetTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Z:\__svn__\lib\test\test_dictviews.py", line 153, in test_items_set_operations
        {('a', 1), ('b', 2)})
    AssertionError: Items in the first set but not the second:
    ('a', 1)
    ('b', 2)

    Which boils down to the following issue:

    >>> d1 = {'a': 1, 'b': 2}
    >>> d1.items()
    dict_items([('a', 1), ('b', 2)])
    >>> set(d1.items())
    {('a', 1), ('b', 2)}
    >>> d1.items() | set(d1.items())
    {('a', 1), ('a', 1), ('b', 2), ('b', 2)}

    There are also a bunch of possibly related failures in test_weakset:

    ======================================================================
    FAIL: test_inplace_on_self (test.test_weakset.TestWeakSet)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_weakset.py", line 293, in test_inplace_on_
    self
        self.assertEqual(t, self.s)
    AssertionError: <_weakrefset.WeakSet object at 0x000000000283CDC0> != <_weakrefs
    et.WeakSet object at 0x000000000283B2C8>

    ======================================================================
    FAIL: test_or (test.test_weakset.TestWeakSet)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_weakset.py", line 72, in test_or
        self.assertEqual(self.s | set(self.items2), i)
    AssertionError: <_weakrefset.WeakSet object at 0x000000000285B400> != <_weakrefs
    et.WeakSet object at 0x000000000285B260>

    ======================================================================
    FAIL: test_symmetric_difference (test.test_weakset.TestWeakSet)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_weakset.py", line 110, in test_symmetric_d
    ifference
        self.assertEqual(c in i, (c in self.d) ^ (c in self.items2))
    AssertionError: False != True

    ======================================================================
    FAIL: test_union (test.test_weakset.TestWeakSet)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_weakset.py", line 61, in test_union
        self.assertEqual(c in u, c in self.d or c in self.items2)
    AssertionError: False != True

    ======================================================================
    FAIL: test_xor (test.test_weakset.TestWeakSet)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_weakset.py", line 117, in test_xor
        self.assertEqual(self.s ^ set(self.items2), i)
    AssertionError: <_weakrefset.WeakSet object at 0x000000000292CA18> != <_weakrefs
    et.WeakSet object at 0x000000000292C878>

    Another bunch of test_pyclbr failures:

    ======================================================================
    FAIL: test_decorators (test.test_pyclbr.PyclbrTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 152, in test_decorators
        self.checkModule('test.pyclbr_input', ignore=['om'])
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 101, in checkModule
        self.assertListEq(real_bases, pyclbr_bases, ignore)
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 28, in assertListEq
        self.fail("%r missing" % missing.pop())
    AssertionError: 'object' missing

    ======================================================================
    FAIL: test_easy (test.test_pyclbr.PyclbrTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 142, in test_easy
        self.checkModule('pyclbr')
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 101, in checkModule
        self.assertListEq(real_bases, pyclbr_bases, ignore)
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 28, in assertListEq
        self.fail("%r missing" % missing.pop())
    AssertionError: 'object' missing

    ======================================================================
    FAIL: test_others (test.test_pyclbr.PyclbrTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 158, in test_others
        cm('random', ignore=('Random',))  # from _random import Random as CoreGenera
    tor
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 101, in checkModule
        self.assertListEq(real_bases, pyclbr_bases, ignore)
      File "Y:\py3k\__svn__\lib\test\test_pyclbr.py", line 28, in assertListEq
        self.fail("%r missing" % missing.pop())
    AssertionError: 'Random' missing

    And a test_sys failure which is probably easy to fix:

    File "Z:\svn\lib\test\test_sys.py", line 831, in test_objecttypes
    check(s, basicsize)
    File "Z:\svn\lib\test\test_sys.py", line 601, in check_sizeof
    self.assertEqual(result, size, msg)
    AssertionError: wrong size for <class 'str'>: got 74, expected 66

    All these tests pass in 32-bit mode.
    Case, you can run the regression test suite with:
    python.exe -m test.regrtest
    and invidual tests with e.g.:
    python.exe -m test.regrtest -v test_dictviews

    @pitrou pitrou reopened this Oct 23, 2010
    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 23, 2010

    This patch seems to fix all aforementioned failures.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 23, 2010

    These changes also look all reasonable to me.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 23, 2010

    Ok, I've committed them in r85808.

    @casevh
    Copy link
    Mannequin

    casevh mannequin commented Oct 23, 2010

    On Win64, I get two unexpected failures:

    I terminated test_capi after a Windows exception box popped up.

    [ 37/349] test_capi
    test test_capi failed -- Traceback (most recent call last):
      File "C:\svn\py3k\lib\test\test_capi.py", line 50, in test_no_FatalError_infinite_loop
        b'Fatal Python error:'
    AssertionError: b"Fatal Python error: PyThreadState_Get: no current thread\r\n\r\nThis application has requested the Runtime to term
    inate it in an unusual way.\nPlease contact the application's support team for more information." != b'Fatal Python error: PyThreadS
    tate_Get: no current thread'
    
    [ 67/349] test_concurrent_futures
    test test_concurrent_futures failed -- Traceback (most recent call last):
      File "C:\svn\py3k\lib\test\test_concurrent_futures.py", line 442, in test_timeout
        future1]), finished)
    AssertionError: Items in the second set but not the first:
    <Future at 0x5510fd0 state=finished raised AssertionError>

    The dictviews test passes successfully.

    @pitrou
    Copy link
    Member Author

    pitrou commented Oct 23, 2010

    The test_capi problem is not 64-bit-specific (see bpo-9116).

    As for test_concurrent_futures, I also have a failure (not the same one) in both 32-bit and 64-bit builds here. I'm gonna open a separate issue.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Oct 23, 2010

    I recommend to declare this issue closed, and keep it closed unless somebody wants to propose to revert the change (widening the hash type) completely. Any remaining issues that people want to attribute to this change (correctly or incorrectly) should be reported as separate issues.

    @loewis loewis mannequin closed this as completed Oct 23, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants