Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better repr for threading objects #68579

Closed
serhiy-storchaka opened this issue Jun 6, 2015 · 19 comments
Closed

Better repr for threading objects #68579

serhiy-storchaka opened this issue Jun 6, 2015 · 19 comments
Assignees
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 24391
Nosy @rhettinger, @terryjreedy, @pitrou, @vstinner, @larryhastings, @rbtcollins, @berkerpeksag, @serhiy-storchaka, @applio
PRs
  • bpo-24391: Better reprs for threading objects. #20534
  • Files
  • threading_objects_reprs.patch
  • threading_objects_reprs_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = None
    created_at = <Date 2015-06-06.10:21:42.594>
    labels = ['type-feature', 'library', '3.11']
    title = 'Better repr for threading objects'
    updated_at = <Date 2021-09-29.10:08:23.064>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-09-29.10:08:23.064>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-06-06.10:21:42.594>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['39637', '39642']
    hgrepos = []
    issue_num = 24391
    keywords = ['patch']
    message_count = 18.0
    messages = ['244896', '244900', '244902', '244903', '244905', '244907', '244910', '244911', '244912', '244917', '244958', '245267', '248084', '250446', '275649', '370327', '402699', '402843']
    nosy_count = 9.0
    nosy_names = ['rhettinger', 'terry.reedy', 'pitrou', 'vstinner', 'larry', 'rbcollins', 'berker.peksag', 'serhiy.storchaka', 'davin']
    pr_nums = ['20534']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue24391'
    versions = ['Python 3.11']

    @serhiy-storchaka
    Copy link
    Member Author

    Proposed patch adds makes reprs of threading objects Semaphore, BoundedSemaphore, Event, and Barrier expose their public states. This will help for debugging.

    Examples:

    <Semaphore: 10 at 0xb710ec8c>
    <BoundedSemaphore: 7/10 at 0xb6ff1d6c>
    <unset Event at 0xb710ec8c>
    <set Event at 0xb710ec8c>
    <Barrier: 0/10 at 0xb6ff1d6c>
    <Barrier: 3/10 at 0xb6ff1d6c>
    <broken Barrier: 3/10 at 0xb6ff1d6c>

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jun 6, 2015
    @brettcannon
    Copy link
    Member

    While I like the concept, I don't like the reprs where the type is not the first thing listed, e.g., "set Event" not having "Event" first. The default repr and typical practice of having the type come first makes my brain initially read that repr as a "set of Events" instead of "an Event that is set".

    @serhiy-storchaka
    Copy link
    Member Author

    This is for consistency with reprs of lock objects (bpo-21137):

    <unlocked _thread.lock object at 0xb6f2b188>
    <locked _thread.lock object at 0xb6f2b188>
    <unlocked _thread.RLock object owner=0 count=0 at 0xb6f2b0b0>
    <locked _thread.RLock object owner=-1219680512 count=2 at 0xb6f2b0b0>

    @pitrou
    Copy link
    Member

    pitrou commented Jun 6, 2015

    +1. The patch looks good to me.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 6, 2015

    I'd like to see this in 3.5. This is a no-brainer improvement and has very little chance of breaking anything compared to a new OrderedDict implementation.

    @larryhastings
    Copy link
    Contributor

    I'll allow it. But I agree with Brett, I really would prefer that the type be the first thing in the repr. I'd rather fix the lock objects than compound our error.

    @serhiy-storchaka
    Copy link
    Member Author

    Would following reprs good?

    <Event: set at 0xb710ec8c>
    <Event: unset at 0xb710ec8c>

    (what is better: "unset" or "clear"?)

    Should the repr contain the module name (<threading.Event: set at 0xb710ec8c>)?

    @larryhastings
    Copy link
    Contributor

    FWIW I find "unset" more obvious. I don't think it needs the module name.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 6, 2015

    I think the module name is useful, since there are also multiprocessing events and semaphores. Also, Event is a fairly generic name which may be used in other contexts (e.g. GUI libraries).

    "set" and "unset" sound good to me.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is updated patch. Now the repr always starts with the qualified class name.

    <threading.Semaphore: 10 at 0xb710ec8c>
    <threading.BoundedSemaphore: 7/10 at 0xb6ff1d6c>
    <threading.Event: unset at 0xb710ec8c>
    <threading.Event: set at 0xb710ec8c>
    <threading.Barrier: 0/10 at 0xb6ff1d6c>
    <threading.Barrier: 3/10 at 0xb6ff1d6c>
    <threading.Barrier: 0/10, broken at 0xb6ff1d6c>

    But there is other question. Should reprs be consistent with reprs of corresponding classes in multiprocessing and asyncio modules?

    In multiprocessing:

    <Semaphore(value=10)>
    <BoundedSemaphore(value=10, maxvalue=10)>

    In asyncio:

    <asyncio.locks.Event object at 0xb6fa180c [unset]>
    <asyncio.locks.Event object at 0xb6fa180c [set]>
    <asyncio.locks.Event object at 0xb6fa180c [unset,waiters:3]>
    <asyncio.locks.Semaphore object at 0xb6cec26c [unlocked,value:10]>
    <asyncio.locks.Semaphore object at 0xb6cec26c [locked]>
    <asyncio.locks.Semaphore object at 0xb6cec26c [locked, waiters:3]>

    @larryhastings
    Copy link
    Contributor

    The world of reprs already isn't particularly consistent. If you make your reprs consistent with module X, it'll be *inconsistent* with module Y, and vice versa. I say let's just worry about making it nice and readable for humans.

    That said, I like the asyncio reprs. Specifically, I like how the interesting things (class, interesting values) are at the front and rear, and the address is in the middle. While it's occasionally helpful (or even crucial!) to show the address, I *almost* never need to look at it. Arranging the data in this way makes it easy to skip over the address when reading.

    However, I don't think the parentheses or square brackets are necessary. It's not like this is a computer-readable syntax, and for human beings it's just visual clutter. I suggest it's sufficient to have comma separated values with a space after each comma. Also: I definitely don't like the colon after the class name.

    So here's how *I* might reformat all your examples:

    <threading.Semaphore at 0xb710ec8c 10>
    <threading.BoundedSemaphore at 0xb6ff1d6c 7/10>
    <threading.Event at 0xb710ec8c unset>
    <threading.Event at 0xb710ec8c set>
    <threading.Barrier at 0xb6ff1d6c 0/10>
    <threading.Barrier at 0xb6ff1d6c 3/10>
    <threading.Barrier at 0xb6ff1d6c 0/10, broken>

    p.s. Thank you for posting these samples of other approaches! It definitely shed some light into the discussion.

    @terryjreedy
    Copy link
    Member

    The Semaphore and BoundedSemphore examples show why including the module is helpful. I prefer the function-call syntax, as long as it is accurate, even though the string as a whole cannot be eval()-ed. In any case, threading and multiprocessing are somewhat 'parallel' modules, so their reps should follow the same pattern.

    @rbtcollins
    Copy link
    Member

    So, I think the majority opinion seems to be:
    <module.type at address value1, value2, ...>

    AIUI the patch doesn't quite do that, so I'm going to pop this back to patch review.

    That said, Serhiy, if you fix it up, just commit it please :)

    @applio
    Copy link
    Member

    applio commented Sep 11, 2015

    Echoing Larry's insightful summary, it would not be particularly rewarding to pursue a goal of making the whole world of reprs consistent. But as Antoine and Serhiy began to point out, the multiprocessing module not only has its own Event, Semaphore, BoundedSemaphore, and Barrier, it attempts to directly mirror the threading module's equivalents. If there's a change to the reprs for these in the threading module, then given the special relationship between multiprocessing and threading, I suggest it is worth the effort to update multiprocessing accordingly.

    I have created bpo-25066 to separately address this proposed change to multiprocessing.

    I've gone ahead and provided a patch for bpo-25066 to reflect the "majority opinion" on the reprs' desired format. That patch should pass all tests once Serhiy's patch is similarly updated to match what Larry summarized.

    If this issue does reach a happy conclusion and Serhiy finds time to update his patch, would someone please also take a moment to review the patch in bpo-25066?

    @serhiy-storchaka
    Copy link
    Member Author

    I agree, that the address is the least interesting thing, and I think this is an argument to make it the last thing in the repr (at least for those of us who read from left to right). Many objects in different modules outputs the address last. The only exceptions are asyncio (just because it constructs reprs from super().__repr__(), including verbose "object" after type name), concurrent.futures and weakref.finalize (the latter contains multiple addresses in any case). Do we need "object" after type name?

    @brettcannon
    Copy link
    Member

    Assigning to Serhiy in case he wants to finish this.

    @serhiy-storchaka
    Copy link
    Member Author

    In the last version of PR 20534, the reprs will be similar to proposed by Larry in msg244958, except that a colon is used to separate an address from status, and keyword names are used for values.

    <threading.Semaphore at 0xb710ec8c: value=10>
    <threading.BoundedSemaphore at 0xb6ff1d6c: value=7/10>
    <threading.Event at 0xb710ec8c: unset>
    <threading.Event at 0xb710ec8c: set>
    <threading.Barrier at 0xb6ff1d6c: waiters=0/10>
    <threading.Barrier at 0xb6ff1d6c: waiters=3/10>
    <threading.Barrier at 0xb6ff1d6c: broken>

    It is closer to existing reprs, I'm going to rewrite reprs of locks, conditions and synchronization primitives in asyncio and multiprocessing to match this style: move status after type and address, remove parenthesis, brackets and commas, use "=" instead of ":" for named values, use "/" with maximal values, etc.

    @serhiy-storchaka serhiy-storchaka added the 3.11 only security fixes label Sep 27, 2021
    @serhiy-storchaka
    Copy link
    Member Author

    New changeset eed32df by Serhiy Storchaka in branch 'main':
    bpo-24391: Better reprs for threading objects. (GH-20534)
    eed32df

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @hauntsaninja
    Copy link
    Contributor

    Thanks, looks like this has been completed. There's a separate issue for corresponding changes to multiprocessing

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    Development

    No branches or pull requests

    8 participants