Title: Better repr for threading objects
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.6, Python 3.5
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, brett.cannon, davin, larry, pitrou, rbcollins, rhettinger, serhiy.storchaka, terry.reedy, vstinner
Priority: normal Keywords: patch

Created on 2015-06-06 10:21 by serhiy.storchaka, last changed 2016-09-10 15:57 by serhiy.storchaka.

File name Uploaded Description Edit
threading_objects_reprs.patch serhiy.storchaka, 2015-06-06 10:23 review
threading_objects_reprs_2.patch serhiy.storchaka, 2015-06-06 15:42 review
Messages (15)
msg244896 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-06-06 10:21
Proposed patch adds makes reprs of threading objects Semaphore, BoundedSemaphore, Event, and Barrier expose their public states. This will help for debugging.


<Semaphore: 10 at 0xb710ec8c>
<BoundedSemaphore: 7/10 at 0xb6ff1d6c>
<unset Event at 0xb710ec8c>
<set Event at 0xb710ec8c>
<Barrier: 0/10 at 0xb6ff1d6c>
<Barrier: 3/10 at 0xb6ff1d6c>
<broken Barrier: 3/10 at 0xb6ff1d6c>
msg244900 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2015-06-06 12:06
While I like the concept, I don't like the reprs where the type is not the first thing listed, e.g., "set Event" not having "Event" first. The default repr and typical practice of having the type come first makes my brain initially read that repr as a "set of Events" instead of "an Event that is set".
msg244902 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-06-06 13:01
This is for consistency with reprs of lock objects (issue21137):

<unlocked _thread.lock object at 0xb6f2b188>
<locked _thread.lock object at 0xb6f2b188>
<unlocked _thread.RLock object owner=0 count=0 at 0xb6f2b0b0>
<locked _thread.RLock object owner=-1219680512 count=2 at 0xb6f2b0b0>
msg244903 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-06-06 13:06
+1. The patch looks good to me.
msg244905 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-06-06 13:49
I'd like to see this in 3.5. This is a no-brainer improvement and has very little chance of breaking anything compared to a new OrderedDict implementation.
msg244907 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-06-06 14:01
I'll allow it.  But I agree with Brett, I really would prefer that the type be the first thing in the repr.  I'd rather fix the lock objects than compound our error.
msg244910 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-06-06 14:28
Would following reprs good?

<Event: set at 0xb710ec8c>
<Event: unset at 0xb710ec8c>

(what is better: "unset" or "clear"?)

Should the repr contain the module name (<threading.Event: set at 0xb710ec8c>)?
msg244911 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-06-06 14:28
FWIW I find "unset" more obvious.  I don't think it needs the module name.
msg244912 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-06-06 14:59
I think the module name is useful, since there are also multiprocessing events and semaphores. Also, Event is a fairly generic name which may be used in other contexts (e.g. GUI libraries).

"set" and "unset" sound good to me.
msg244917 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-06-06 15:42
Here is updated patch. Now the repr always starts with the qualified class name.

<threading.Semaphore: 10 at 0xb710ec8c>
<threading.BoundedSemaphore: 7/10 at 0xb6ff1d6c>
<threading.Event: unset at 0xb710ec8c>
<threading.Event: set at 0xb710ec8c>
<threading.Barrier: 0/10 at 0xb6ff1d6c>
<threading.Barrier: 3/10 at 0xb6ff1d6c>
<threading.Barrier: 0/10, broken at 0xb6ff1d6c>

But there is other question. Should reprs be consistent with reprs of corresponding classes in multiprocessing and asyncio modules?

In multiprocessing:

<BoundedSemaphore(value=10, maxvalue=10)>

In asyncio:

<asyncio.locks.Event object at 0xb6fa180c [unset]>
<asyncio.locks.Event object at 0xb6fa180c [set]>
<asyncio.locks.Event object at 0xb6fa180c [unset,waiters:3]>
<asyncio.locks.Semaphore object at 0xb6cec26c [unlocked,value:10]>
<asyncio.locks.Semaphore object at 0xb6cec26c [locked]>
<asyncio.locks.Semaphore object at 0xb6cec26c [locked, waiters:3]>
msg244958 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2015-06-07 14:38
The world of reprs already isn't particularly consistent.  If you make your reprs consistent with module X, it'll be *inconsistent* with module Y, and vice versa.  I say let's just worry about making it nice and readable for humans.

That said, I like the asyncio reprs.  Specifically, I like how the interesting things (class, interesting values) are at the front and rear, and the address is in the middle.  While it's occasionally helpful (or even crucial!) to show the address, I *almost* never need to look at it.  Arranging the data in this way makes it easy to skip over the address when reading.

However, I don't think the parentheses or square brackets are necessary.  It's not like this is a computer-readable syntax, and for human beings it's just visual clutter.  I suggest it's sufficient to have comma separated values with a space after each comma.  Also: I definitely don't like the colon after the class name.

So here's how *I* might reformat all your examples:

<threading.Semaphore at 0xb710ec8c 10>
<threading.BoundedSemaphore at 0xb6ff1d6c 7/10>
<threading.Event at 0xb710ec8c unset>
<threading.Event at 0xb710ec8c set>
<threading.Barrier at 0xb6ff1d6c 0/10>
<threading.Barrier at 0xb6ff1d6c 3/10>
<threading.Barrier at 0xb6ff1d6c 0/10, broken>

p.s. Thank you for posting these samples of other approaches!  It definitely shed some light into the discussion.
msg245267 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2015-06-12 18:39
The Semaphore and BoundedSemphore examples show why including the module is helpful.  I prefer the function-call syntax, as long as it is accurate, even though the string as a whole cannot be eval()-ed.  In any case, threading and multiprocessing are somewhat 'parallel' modules, so their reps should follow the same pattern.
msg248084 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2015-08-05 22:02
So, I think the majority opinion seems to be:
<module.type at address value1, value2, ...>

AIUI the patch doesn't quite do that, so I'm going to pop this back to patch review.

That said, Serhiy, if you fix it up, just commit it please :)
msg250446 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2015-09-11 04:44
Echoing Larry's insightful summary, it would not be particularly rewarding to pursue a goal of making the whole world of reprs consistent.  But as Antoine and Serhiy began to point out, the multiprocessing module not only has its own Event, Semaphore, BoundedSemaphore, and Barrier, it attempts to directly mirror the threading module's equivalents.  If there's a change to the reprs for these in the threading module, then given the special relationship between multiprocessing and threading, I suggest it is worth the effort to update multiprocessing accordingly.

I have created issue25066 to separately address this proposed change to multiprocessing.

I've gone ahead and provided a patch for issue25066 to reflect the "majority opinion" on the reprs' desired format.  That patch should pass all tests once Serhiy's patch is similarly updated to match what Larry summarized.

If this issue does reach a happy conclusion and Serhiy finds time to update his patch, would someone please also take a moment to review the patch in issue25066?
msg275649 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-09-10 15:57
I agree, that the address is the least interesting thing, and I think this is an argument to make it the last thing in the repr (at least for those of us who read from left to right). Many objects in different modules outputs the address last. The only exceptions are asyncio (just because it constructs reprs from super().__repr__(), including verbose "object" after type name), concurrent.futures and weakref.finalize (the latter contains multiple addresses in any case). Do we need "object" after type name?
Date User Action Args
2016-09-10 15:57:28serhiy.storchakasetmessages: + msg275649
2015-09-14 22:39:56vstinnersetnosy: + vstinner
2015-09-11 04:44:18davinsetnosy: + davin
messages: + msg250446
2015-09-11 04:26:01davinlinkissue25066 dependencies
2015-08-05 22:02:38rbcollinssetnosy: + rbcollins

messages: + msg248084
stage: commit review -> patch review
2015-06-12 18:39:44terry.reedysetnosy: + terry.reedy
messages: + msg245267
2015-06-07 14:38:08larrysetmessages: + msg244958
2015-06-06 15:42:02serhiy.storchakasetfiles: + threading_objects_reprs_2.patch

messages: + msg244917
2015-06-06 14:59:29pitrousetmessages: + msg244912
2015-06-06 14:28:50larrysetmessages: + msg244911
2015-06-06 14:28:10serhiy.storchakasetmessages: + msg244910
2015-06-06 14:01:03larrysetmessages: + msg244907
2015-06-06 13:49:30pitrousetnosy: + larry

messages: + msg244905
versions: + Python 3.5
2015-06-06 13:47:13berker.peksagsetstage: patch review -> commit review
2015-06-06 13:06:02pitrousetmessages: + msg244903
2015-06-06 13:01:42serhiy.storchakasetmessages: + msg244902
2015-06-06 12:06:55brett.cannonsetmessages: + msg244900
2015-06-06 10:23:38serhiy.storchakasetfiles: + threading_objects_reprs.patch
keywords: + patch
2015-06-06 10:21:42serhiy.storchakacreate