Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dict_keys purports to implement the Set ABC, but is missing the isdisjoint method #53458

Closed
stutzbach mannequin opened this issue Jul 9, 2010 · 14 comments
Closed

dict_keys purports to implement the Set ABC, but is missing the isdisjoint method #53458

stutzbach mannequin opened this issue Jul 9, 2010 · 14 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@stutzbach
Copy link
Mannequin

stutzbach mannequin commented Jul 9, 2010

BPO 9212
Nosy @rhettinger, @terryjreedy, @merwok, @durban
Files
  • issue9212.diff: Patch (py3k branch)
  • issue9212a.diff: Patch (iterate over the smaller) (py3k branch)
  • issue9212b.diff: Patch (with corrections)
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-09-14.16:54:23.895>
    created_at = <Date 2010-07-09.19:03:35.203>
    labels = ['interpreter-core', 'type-bug']
    title = 'dict_keys purports to implement the Set ABC, but is missing the isdisjoint method'
    updated_at = <Date 2010-09-14.16:54:23.893>
    user = 'https://bugs.python.org/stutzbach'

    bugs.python.org fields:

    activity = <Date 2010-09-14.16:54:23.893>
    actor = 'stutzbach'
    assignee = 'stutzbach'
    closed = True
    closed_date = <Date 2010-09-14.16:54:23.895>
    closer = 'stutzbach'
    components = ['Interpreter Core']
    creation = <Date 2010-07-09.19:03:35.203>
    creator = 'stutzbach'
    dependencies = []
    files = ['18158', '18172', '18702']
    hgrepos = []
    issue_num = 9212
    keywords = ['patch']
    message_count = 14.0
    messages = ['109783', '109941', '109944', '109952', '111362', '111376', '111431', '114458', '115301', '115320', '115322', '115384', '115385', '115386']
    nosy_count = 5.0
    nosy_names = ['rhettinger', 'terry.reedy', 'stutzbach', 'eric.araujo', 'daniel.urban']
    pr_nums = []
    priority = 'high'
    resolution = 'accepted'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue9212'
    versions = ['Python 3.2']

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Jul 9, 2010

    >>> isinstance({}.keys(), collections.Set)
    True
    >>> [method for method in set(dir(collections.Set)) - set(dir({}.keys()))
    ...  if not method.startswith('_')]
    ['isdisjoint']

    (in Python 2.7, use "viewkeys()" instead of "keys")

    dict_items has the same problem.

    @stutzbach stutzbach mannequin self-assigned this Jul 9, 2010
    @stutzbach stutzbach mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Jul 9, 2010
    @rhettinger
    Copy link
    Contributor

    Concrete classes are allowed to have more features than the corresponding ABC. The ABCs are not intended to be full-featured; instead, they specify the part of the API that will be guaranteed. For example, the union() method for built-in sets allows two or more set arguments, but the corresponding method for the ABC is limited to two arguments. This was an intentional difference, designed to make it easier for people to implement a Set class.

    That being said, the omission of isdisjoint() was an oversight and I see no reason that this shouldn't be fixed.

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Jul 10, 2010

    In this case, the concrete class is the one missing a method.

    Concrete classes are allowed to provide more features than the corresponding ABC, but the converse is not true to the best of my knowledge.

    dict_keys .register()s as supporting the Set ABC, so it does not automatically pick up the method through inheritance. Put another way:

    >>> # dict_keys provides the Set ABC API
    >>> isinstance({}.keys(), collections.Set)
    True
    
    >>> # The Set ABC provides isdisjoint
    >>> hasattr(collections.Set, 'isdisjoint') 
    True
    
    >>> # Ergo, dict_keys should provide isdisjoint ... but it doesn't
    >>> hasattr({}.keys(), 'isdisjoint')       
    False

    See also bpo-9213 for another case where a concrete class is missing a method provided by an ABC it claims to support.

    I sort of wonder if .register() should verify that the concrete class provides all of the methods of the ABC.

    @rhettinger
    Copy link
    Contributor

    I had misread you original post. Thought you we saying that the Set ABC was missing disjoint. Disregard my last post.

    @durban
    Copy link
    Mannequin

    durban mannequin commented Jul 23, 2010

    The attached patch adds the isdisjoint method to dict_keys and dict_items.
    Pseudocode for the method:

    def isdisjoint(self, other):
        if self is other:
            if len(self) == 0:
                return True
            else:
                return False
        else:
            for item in other:
                if item in self:
                    return False
            return True

    @terryjreedy
    Copy link
    Member

    Titles that fit in the box, like 'Dict.keys lacks .isjoint method.' are easier to read and keep track of.

    Unless there is a reason I have missed, I would iterate through the smaller set, which might even be empty or nearly so, rather than either in particular.

    @durban
    Copy link
    Mannequin

    durban mannequin commented Jul 24, 2010

    Unless there is a reason I have missed, I would iterate through the
    smaller set, which might even be empty or nearly so, rather than
    either in particular.

    You're right, here is a new patch. Pseudocode:

    def isdisjoint(self, other):
        if self is other:
            if len(self) == 0:
                return True
            else:
                return False
        else:
            if len(other) > len(self):
                self, other = other, self
            for item in other:
                if item in self:
                    return False
            return True

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Aug 20, 2010

    Thank you for the patch.

    We should only iterate over the shorter set if the longer set is really a set and not just a sequence. PySequence_Contains may take O(n) time on a list, making the algorithm an expensive O(n**2) overall. I note that set_isdisjoint doesn't try to examine the lengths.

    Also, since PyIter_Next returns NULL to indicate the end of the iteration OR to indicate an exception, the end of the function should look like this:

        Py_DECREF(it);
        if (PyErr_Occurred())
            return NULL;
        Py_RETURN_TRUE;

    Other than those two issues, the patch looks good to me.

    @durban
    Copy link
    Mannequin

    durban mannequin commented Sep 1, 2010

    Thanks for the corrections. I'm attaching the new patch as issue9212b.diff. I'm using PyAnySet_Check to determine if the other argument is really a set, but I'm not entirely sure, that this is correct. Please let me know if other corrections are needed.

    @rhettinger
    Copy link
    Contributor

    It would be nice to get this fixed before the next release.

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Sep 1, 2010

    I will aim to spend some time with this (and the similar Issue bpo-9213) today and/or tomorrow, so that it can be committed in time for 3.2a2.

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Sep 2, 2010

    Committed to py3k in r84435. Raymond, do you want to look the commit over before I merge it into 3.1 and 2.7?

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Sep 2, 2010

    Meant to add:

    I made some relatively minor changes to Daniel Urban's patch. Mostly, I rearranged the order of a few things to avoid unnecessary work (e.g., only compute "len_other" if we've already checked that "other" is a set).

    Thank you Daniel for the patch! :-)

    @stutzbach
    Copy link
    Mannequin Author

    stutzbach mannequin commented Sep 2, 2010

    Also, credited Daniel Urban for the patch in r84436 (forgot that the first time around -- sorry!).

    @stutzbach stutzbach mannequin closed this as completed Sep 14, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants