Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threading: add builtin TID attribute to Thread objects #80265

Closed
jaketesler mannequin opened this issue Feb 22, 2019 · 40 comments
Closed

Threading: add builtin TID attribute to Thread objects #80265

jaketesler mannequin opened this issue Feb 22, 2019 · 40 comments
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@jaketesler
Copy link
Mannequin

jaketesler mannequin commented Feb 22, 2019

BPO 36084
Nosy @pitrou, @vstinner, @eryksun, @aixtools, @miss-islington, @jaketesler
PRs
  • bpo-36084: add native thread ID (TID) to threading.Thread objects #11993
  • Revert "bpo-36084: Add native thread ID to threading.Thread objects (GH-11993)" #13458
  • bpo-36084: add native thread ID (TID) to threading.Thread objects (V2) #13463
  • bpo-37087: Adding native ID support for OpenBSD. #13654
  • bpo-36084: Add threading Native ID information to What's New documentation #14845
  • [3.8] bpo-36084: Add threading Native ID information to What's New documentation (GH-14845) #15028
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-11-05.22:36:39.620>
    created_at = <Date 2019-02-22.22:46:11.392>
    labels = ['3.8', 'type-feature', 'library']
    title = 'Threading: add builtin TID attribute to Thread objects'
    updated_at = <Date 2019-11-07.16:38:36.624>
    user = 'https://github.com/jaketesler'

    bugs.python.org fields:

    activity = <Date 2019-11-07.16:38:36.624>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-11-05.22:36:39.620>
    closer = 'jaketesler'
    components = ['Library (Lib)']
    creation = <Date 2019-02-22.22:46:11.392>
    creator = 'jaketesler'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36084
    keywords = ['patch']
    message_count = 40.0
    messages = ['336348', '339358', '342024', '342116', '342263', '342264', '342292', '342293', '342346', '342396', '342920', '343000', '343001', '343002', '343003', '343046', '343059', '343060', '343151', '343159', '343168', '343169', '343175', '343191', '343192', '343203', '343204', '343206', '343207', '343209', '343224', '343225', '343307', '343313', '343513', '348769', '348770', '355764', '355992', '356201']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'vstinner', 'eryksun', 'Michael.Felt', 'miss-islington', 'jaketesler']
    pr_nums = ['11993', '13458', '13463', '13654', '14845', '15028']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue36084'
    versions = ['Python 3.8']

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented Feb 22, 2019

    This functionality adds a native Thread ID to threading.Thread objects. This ID (TID), similar to the PID of a process, is assigned by the OS (kernel) and is generally used for externally monitoring resources consumed by the running thread (or process).
    This does not replace the ident attribute within Thread objects, which is assigned by the Python interpreter and is guaranteed as unique for the lifetime of the Python instance.

    @jaketesler jaketesler mannequin added 3.8 only security fixes type-feature A feature request or enhancement labels Feb 22, 2019
    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented Apr 2, 2019

    *bump*

    Could someone look into reviewing this bug/PR? Thanks!

    @vstinner
    Copy link
    Member

    It seems like the feature is only supported by a few operating systems and only if required functions are available:

    #ifdef __APPLE__
        volatile uint64_t tid;
        pthread_threadid_np(NULL, &tid);
    #elif defined(__linux__)
        volatile pid_t tid;
        tid = syscall(__NR_gettid);

    I understand that it's not available on FreeBSD nor Windows?

    In that case, I would prefer to only add a threading.get_tid() *function*.

    Is it different than threading.get_ident() on Linux?

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 10, 2019

    The feature is supported on Windows: the file supporting Windows threading is thread_nt.h, not thread_pthread.h since Windows doesn't use POSIX-style threads.

    Also it is different from threading.get_ident() - ident is a Python-issued unique identifier, TID is issued by the OS and is tracable system-wide.

    @pitrou
    Copy link
    Member

    pitrou commented May 12, 2019

    New changeset 4959c33 by Antoine Pitrou (Jake Tesler) in branch 'master':
    bpo-36084: Add native thread ID to threading.Thread objects (GH-11993)
    4959c33

    @pitrou
    Copy link
    Member

    pitrou commented May 12, 2019

    Thank you Jake for your contribution!

    @pitrou pitrou added the stdlib Python modules in the Lib dir label May 12, 2019
    @pitrou pitrou closed this as completed May 12, 2019
    @vstinner
    Copy link
    Member

    I dislike adding a function which always return 0 when the feature is not supported:

    unsigned long
    PyThread_get_thread_native_id(void)
    {
        ...
    #if ...
        ...
    #else
        unsigned long native_id;
        native_id = 0;
    #endif
        return (unsigned long) native_id;
    }

    I would prefer to not define the function if it's not available, so the attribute would be missing.

    With the commited change, how can I know if native thread identifier is supported or not?

    @vstinner vstinner reopened this May 13, 2019
    @vstinner
    Copy link
    Member

    """Native integral thread ID of this thread or 0 if it has not been started. (...)

    Why not set the attribute to None before a thread starts? It would be more consistent with the the "ident" attribute behavior, no? Extract __repr__():

        def __repr__(self):
            assert self._initialized, "Thread.__init__() was not called"
            status = "initial"
            if self._started.is_set():
                status = "started"
            ...
            if self._ident is not None:
                status += " %s" % self._ident
            return "<%s(%s, %s)>" % (self.__class__.__name__, self._name, status)

    "is None" is a reliable check to decide if the attribute is set or not.

    With the current implementation, it's hard to guess since PyThread_get_thread_native_id() returns on some platforms... But again, I would prefer to not have the attribute if PyThread_get_thread_native_id() is not available on a platform.

    @vstinner
    Copy link
    Member

    This change broke the AIX buildbot:
    https://buildbot.python.org/all/#/builders/132/builds/486

    ======================================================================
    FAIL: test_various_ops (test.test_threading.ThreadTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_threading.py", line 109, in test_various_ops
        self.assertEqual(len(native_ids), NUMTASKS + 1)
    AssertionError: 1 != 11

    ======================================================================
    FAIL: test_various_ops_large_stack (test.test_threading.ThreadTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_threading.py", line 159, in test_various_ops_large_stack
        self.test_various_ops()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_threading.py", line 109, in test_various_ops
        self.assertEqual(len(native_ids), NUMTASKS + 1)
    AssertionError: 1 != 11

    ======================================================================
    FAIL: test_various_ops_small_stack (test.test_threading.ThreadTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_threading.py", line 147, in test_various_ops_small_stack
        self.test_various_ops()
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_threading.py", line 109, in test_various_ops
        self.assertEqual(len(native_ids), NUMTASKS + 1)
    AssertionError: 1 != 11

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 13, 2019

    So where do we go from here?

    @vstinner
    Copy link
    Member

    So where do we go from here?

    I propose to only add attribute if it's supported.

    If nobody comes with a fix, I would prefer to remove the feature to repair the AIX buildbot.

    @vstinner
    Copy link
    Member

    I wrote PR 13458 to revert the change which broke the CI for longer than one week, see the rationale there:
    #13458 (comment)

    Email thread about the regression:
    https://mail.python.org/pipermail/python-buildbots/2019-May/000280.html

    @vstinner
    Copy link
    Member

    Jake Tesler: In the meanwhile, can you please try to rewrite your change to make the attribute optional?

    C PyThread_get_thread_native_id() function and Python _thread.get_native_id() should not be defined if it's not supported by
    the platform.

    I suggest to add the following #define in Include/pythread.h:

    #if (... list of supported platforms ...
    #  define PY_HAVE_THREAD_NATIVE_ID 1
    #endif

    It would be great if you can come up with a solution before the end of the month, otherwise we will miss Python 3.8 deadline for new feature :-(

    Tell me if you need help to rework your PR. IMHO it's a nice feature. There is just an issue in your first implementation.

    @vstinner
    Copy link
    Member

    New changeset d12e757 by Victor Stinner in branch 'master':
    Revert "bpo-36084: Add native thread ID to threading.Thread objects (GH-11993)" (GH-13458)
    d12e757

    @pitrou
    Copy link
    Member

    pitrou commented May 21, 2019

    Jake, would you like to submit a new PR that implements Victor's suggestion (i.e. only define and test the attribute on supported systems)?

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 21, 2019

    I will implement these changes - let’s try to hit that end-of-month target!

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 21, 2019

    New PR created with requested edits. :)

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 21, 2019

    Victor Stinner: would you mind taking a look at the new PR? Is this more along the lines of what you had in mind?

    @aixtools
    Copy link
    Contributor

    Repeating a bot of what I added to PR13463

    AIX has native support for thread-id since at least AIX 4.1 (1994-1995) where every process has an initial TID (PID are even numbers and "own" the resources, TID are odd and are the "workers" - very simply put). Hence, by default, an AIX process is "mono-threaded".

    The key concern - when calling thread related issues, is to ensure that all compiled code uses the same definitions in include files - namely, with _THREAD_SAFE defined.

    There are couple of ways to accomplish this:
    a) ensure that #include <pthread.h> is first (everywhere!)
    b) use cc_r, xlc_r, etc_r (with IBM compiler)
    c) -D_THREAD_SAFE

    As a) seems unpractical to ensure all the time; b) is also unpractical (no idea how/if gcc, e.g., has a way to 'signal' the need to be thread_safe - so a change in configure.ac, and maybe in setup.py, and thinking further yet - in the CFLAGS that get passed to extrnal modules and extensions - adding -D_THREAD_SAFE seems the most complete (aka safe) approach.

    And, of course, for this specific function a call to

    Syntax
    #include <pthread.h>
    pthread_t pthread_self (void);

    Description
    The pthread_self subroutine returns the calling thread's ID.

    @vstinner
    Copy link
    Member

    The AIX case sounds way more complex. I suggest to start without AIX
    support, since it is already complex enough, and then open a new issue to
    discuss/implement AIX support, once this issue is done.

    @aixtools
    Copy link
    Contributor

    I do not know if it is that much mode complex. Unless I missed something it seems to be that this bit - needs three lines added after the FREEBSD block - per below:
    All the other "assurances" are just things that need to be assured. Adding a -D_XXX to CFLAGS is not all that complex either. Perhaps getting the need for the flag documented is 'complex'.

    #ifdef PY_HAVE_THREAD_NATIVE_ID
    unsigned long
    PyThread_get_thread_native_id(void)
    {
        if (!initialized)
            PyThread_init_thread();
    #ifdef __APPLE__
        uint64_t native_id;
        (void) pthread_threadid_np(NULL, &native_id);
    #elif defined(__linux__)
        pid_t native_id;
        native_id = syscall(SYS_gettid);
    #elif defined(__FreeBSD__)
        int native_id;
        native_id = pthread_getthreadid_np();
    #elif defined(_AIX)
        pthread_t native_id;
        native_id = pthread_self()

    More may be needed, but I expect it the include file mentioned is already included - but perhaps without the assurance that AIX says it wants/needs for real thread safe builds. And fixing that is just a bonus!

    @aixtools
    Copy link
    Contributor

    On 22/05/2019 12:22, Michael Felt wrote:

    All the other "assurances" are just things that need to be assured. Adding a -D_XXX to CFLAGS is not all that complex either. Perhaps getting the need for the flag documented is 'complex'.

    Now that I think about it, perhaps I should make a separate PR just for
    the -D_THREAD_SAFE definition. Python does reference the AIX "threading"
    include files and documentation has always indicated they should be
    first, or have this define. If it is already there, please excuse the
    noise; if not, imho - it should be there.

    @eryksun
    Copy link
    Contributor

    eryksun commented May 22, 2019

    The pthread_self subroutine returns the calling thread's ID.

    I don't know much about AIX, but according to the docs 1 the kernel thread ID should be thread_self(), which is not the same as pthread_self().

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 22, 2019

    In general, I’ve concluded most ‘editions’ of pthread_self() are not the same value as this feature aims to implement. I’m not familiar enough with AIX to be certain about that platform, though.

    If there’s an equivalent function in AIX to capture the actual integral thread ID in the same 3-line way as Linux, FreeBSD, et al., are implemented (the big block in PyThread_get_thread_native_id), then it’s worth including. If the mechanism required is a more complex one than just adding a few lines, then it might be best left for a new PR. :)

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 22, 2019

    I will look into whether adding thread_self() for AIX would be simple enough for this PR.

    @vstinner
    Copy link
    Member

    New changeset b121f63 by Victor Stinner (Jake Tesler) in branch 'master':
    bpo-36084: Add native thread ID (TID) to threading.Thread (GH-13463)
    b121f63

    @vstinner
    Copy link
    Member

    Currently, Threading.start() ignores _start_new_thread() return value which is an identifier. Is it the same value than threading.get_native_id()? It might be good to clarify _thread._start_new_thread() documentation since we now have 2 kinds of "identifier".

    Later it might be interesting to attempt to support NetBSD, OpenBSD, and more operating systems. Well, that can be done... later :-)

    @vstinner
    Copy link
    Member

    glibc 2.30 scheduled in August 2019 will finally provide a gettid() function!

    Once it will be released, it would be interesting to use it rather than using a direct syscall.

    @aixtools
    Copy link
    Contributor

    On 22/05/2019 15:15, Jake Tesler wrote:

    Jake Tesler <jake.tesler@gmail.com> added the comment:

    I will look into whether adding thread_self() for AIX would be simple enough for this PR.

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue36084\>


    Blush. Maybe I should have read further (to chapter T)

    Here is a bare bones example - showing the pthread_self() is not the
    right value, but thread_self is.

    michael@x071:[/data/prj/aixtools/tests/posix]cat thread*.c
    #include <pthread.h>
    #include <sys/thread.h>
    #include <stdio.h>
    main()
    {
            pid_t pid;
            tid_t tid;
            pthread_t ptid;

            pid = getpid();
            tid = thread_self();
            ptid = pthread_self();

            fprintf(stderr,"thread_self: pid:%d tid:%d ptid:%d\n", pid, tid,
    ptid);
            sleep(300);     /* give time to run ps -mo THREAD */
    }

    michael@x071:[/data/prj/aixtools/tests/posix]./thread_self
    thread_self: pid:*4129010 *tid:*23724099 *ptid:1

    michael@x071:[/data/prj/aixtools/tests/posix]ps -mo THREAD
        USER     PID    PPID       TID S  CP PRI SC    WCHAN        F     TT
    BND COMMAND
     michael 3408006 7012502         - A   3  61  1        -   200001 
    pts/0  -1 ps -mo THREAD
           -       -       -  22282455 R   3  61  1        -   400000     
    -  -1 -
     michael *4129010 *7012502         - A   0  60  1 f1000a03e16533b0 
    8200011  pts/0  -1 ./thread_self
           -       -       -  *23724099 *S   0  60  1 f1000a03e16533b0  
    410400      -  -1 -

    @vstinner
    Copy link
    Member

    Michael Felt: it's annoying when you ignore Antoine's comment and my comment.

    The AIX case is very special and required a separated issue. Please open a separated if you want to discuss/implement get_native_id() on AIX.

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 22, 2019

    Victor – the return value of _start_new_thread is the the ident parameter, and its not the same as the native id.

    See here: #11993 (comment)

    @aixtools
    Copy link
    Contributor

    On 22/05/2019 18:08, STINNER Victor wrote:

    STINNER Victor <vstinner@redhat.com> added the comment:

    Michael Felt: it's annoying when you ignore Antoine's comment and my comment.

    The AIX case is very special and required a separated issue. Please open a separated if you want to discuss/implement get_native_id() on AIX.

    My apologies. I was not ignoring anyone. I am sorry you had that impression.

    I had already taken it as a given that it would not be in this PR (re:
    https://bugs.python.org/issue36084#msg343159)

    And, I had expressed my hope - it would not be too complex in
    https://bugs.python.org/issue36084#msg343168. Which also made me realize
    that an 'issue' around the "define" described in the "pthread"
    documentation (so in hindsight, not applicable to an implementation of
    "get_native_id()".

    And, while I am sorry you feel I have ignored you - it is hardly the
    case. Everyone's comments have given me a reason to look further. And it
    grieves me that my intentions are misunderstood.

    Thank you for your honesty!

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue36084\>


    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented May 23, 2019

    Michael Felt -
    If you would like some help with adding/building AIX support for this functionality, tag me, I'd be glad to help out! :)

    @vstinner
    Copy link
    Member

    If someone wants to document that _start_new_thread() return value is similar to threading.get_ident() but not threading.get_native_id(): please go ahead and propose a PR :-)

    Adding support for AIX to get_native_id() should be done in a separated issue since AIX seems to come with specific challenges :-)

    The initial feature request is now implemented. Well done! I know that many people were awaiting to expose "gettid()" in Python. Now it's there!

    Thanks everyone who was involved in this issue.

    Sorry for the CI hiccup which required to revert the change temporarily. At least, it didn't miss Python 3.8 feature freeze!

    @aixtools
    Copy link
    Contributor

    On 23/05/2019 18:16, Jake Tesler wrote:

    Jake Tesler <jake.tesler@gmail.com> added the comment:

    Michael Felt -
    If you would like some help with adding/building AIX support for this functionality, tag me, I'd be glad to help out! :)

    ----------
    Thanks - I'll try to look at this early next week.


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue36084\>


    @vstinner
    Copy link
    Member

    New changeset 84846b0 by Victor Stinner (Jake Tesler) in branch 'master':
    bpo-36084: Add threading Native ID information to What's New documentation (GH-14845)
    84846b0

    @miss-islington
    Copy link
    Contributor

    New changeset 7026737 by Miss Islington (bot) in branch '3.8':
    bpo-36084: Add threading Native ID information to What's New documentation (GH-14845)
    7026737

    @jaketesler
    Copy link
    Mannequin Author

    jaketesler mannequin commented Oct 31, 2019

    I have encountered a minor bug with this new feature. The bug occurs when creating a new multiprocessing.Process object on Unix (or on any platform where the multiprocessing start_method is 'fork' or 'forkserver').

    When creating a new process via fork, the Native ID in the new MainThread is incorrect. The new forked process' threading.MainThread object inherits the Native ID from the parent process' MainThread instead of capturing/updating its own (new) Native ID.

    See the following snippet:

    >>> import threading, multiprocessing
    >>> multiprocessing.set_start_method('fork') # or 'forkserver'
    >>> def proc(): print(threading.get_native_id(), threading.main_thread().native_id) # get_native_id(), mainthread.native_id
    >>> proc()
    22605 22605 # get_native_id(), mainthread.native_id
    >>> p = multiprocessing.Process(target=proc)
    >>> p.start()
    22648 22605 # get_native_id(), mainthread.native_id
    >>>
    >>> def update(): threading.main_thread()._set_native_id()
    >>> def print_and_update(): proc(); update(); proc()
    >>> print_and_update()
    22605 22605 # get_native_id(), mainthread.native_id
    22605 22605 
    >>> p2=multiprocessing.Process(target=print_and_update); p2.start()
    22724 22605 # get_native_id(), mainthread.native_id
    22724 22724
    >>> print_and_update()
    22605 22605 # get_native_id(), mainthread.native_id
    22605 22605

    As you can see, the new Process object's MainThread.native_id attribute matches that of the MainThread of its parent process.

    Unfortunately, I'm not too familiar with the underlying mechanisms that Multiprocessing uses to create forked processes.
    I believe this behavior occurs because (AFAIK) a forked multiprocessing.Process copies the MainThread object from its parent process, rather than reinitializing a new one. Looking further into the multiprocessing code, it appears the right spot to fix this would be in the multiprocessing.Process.bootstrap() function.

    I've created a branch containing a working fix - I'm also open to suggestions of how a fix might otherwise be implemented.
    If it looks correct I'll create a PR against the CPython 3.8 branch.

    See the branch here: https://github.com/jaketesler/cpython/tree/fix-mp-native-id

    Thanks all!
    -Jake

    @jaketesler jaketesler mannequin reopened this Oct 31, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Nov 5, 2019

    I have encountered a minor bug with this new feature.

    Please open a new issue.

    @jaketesler jaketesler mannequin closed this as completed Nov 5, 2019
    @vstinner
    Copy link
    Member

    vstinner commented Nov 7, 2019

    Jake created bpo-38707.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants