This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author larry
Recipients Dennis Sweeney, eric.smith, larry, pablogsal, tim.peters
Date 2022-04-02.21:50:29
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1648936230.08.0.18343575057.issue47145@roundup.psfhosted.org>
In-reply-to
Content
I agree that the API should have as few surprises as possible.  AFAICT you haven't made any terminal pronouncements like "it's impossible to add this feature without too many unacceptable surprises", so I'll proceed assuming we can find an API (and semantics) that we're all happy with.

I've come around on the idea that forgetting "done" nodes is too surprising.  The least surprising behavior is: once we've added a node to the graph, the graph remembers it.

Now I'll propose a second simple rule, which you've already touched on: you can't add a dependency after a node has been returned by get_ready().  Attempting this always throws an exception; which exception specifically TBD.  "Tough beans".

I think those two rules are easy enough to remember and to reason about.  It meaningfully expands the utility of the library with minimal surprise.  The library behaves identically for users of the existing API but permits new use cases too.  Adding this functionality would therefore mean fewer users would discover too late their use case isn't supported.


As far as my "long-lived graph objects will consume more and more memory over time" caveat, there's a better solution than "forget": graph.remove(*nodes).  (My version already has a "remove" method, and I forgot that graphlib's doesn't.)  Allowing the user to remove a node from the graph gives them explicit control, and the semantics should be obvious and unsurprising; if you--the user--remove a node, and later you--the user--re-add that node to the graph, it behaves identically to any other node the graph has never seen before.  Removing a node intuitively removes all edges to that node.

Two notes on "remove" if we decide to go that route.  First, I'd ensure you can remove a node at any time. Nodes have three externally visible states wrt TopologicalSort:

1) added but not published by get_ready,
2) published by get_ready but not returned using done, and
3) done.  You should be able to remove a node in any of those three states.

Removing a node in 2) should be equivalent to calling done before calling remove; that is, if you're removing the node anyway, you don't need to call done.

Second, the current underlying implementation would make remove really slow.  Nodes don't remember their predecessors, only their successors, so removing a node would be O(n).  If we expected remove to get a lot of use, we'd probably want to change how the graph is stored to speed up this operation.
History
Date User Action Args
2022-04-02 21:50:30larrysetrecipients: + larry, tim.peters, eric.smith, pablogsal, Dennis Sweeney
2022-04-02 21:50:30larrysetmessageid: <1648936230.08.0.18343575057.issue47145@roundup.psfhosted.org>
2022-04-02 21:50:30larrylinkissue47145 messages
2022-04-02 21:50:29larrycreate