This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: del _limbo[self] KeyError
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Dima.Tisnek, rooter, tim.peters
Priority: normal Keywords: patch

Created on 2016-08-31 07:27 by Dima.Tisnek, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
issue27908.patch rooter, 2016-09-13 18:58 review
issue27908_2.patch rooter, 2016-09-15 19:35 review
Messages (6)
msg274005 - (view) Author: Dima Tisnek (Dima.Tisnek) * Date: 2016-08-31 07:27
To reproduce:

```
import threading
import time


class U(threading.Thread):
    def run(self):
        time.sleep(1)
        if not x.ident:
            x.start()


x = U()
for u in [U() for i in range(10000)]: u.start()

time.sleep(10)
```

Chance to reproduce ~20% in my setup.
This script has a data race (check then act on x.ident).
I expected it to occasionally hit `RuntimeError: threads can only be started once`

Instead, I get:
```
Unhandled exception in thread started by <bound method Thread._bootstrap of <U(Thread-1, started 139798116361984)>>
Traceback (most recent call last):
  File "/usr/lib64/python3.5/threading.py", line 882, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib64/python3.5/threading.py", line 906, in _bootstrap_inner
    del _limbo[self]
KeyError: <U(Thread-1, started 139798116361984)>
```
msg276325 - (view) Author: Maciej Urbański (rooter) * Date: 2016-09-13 18:58
Successfully reproduced on 2.7.12 and 3.5.2 .
Currently there seems to be no protection against starting the same thread twice at the same time. What was checked was only if start operation already finished once.

Attached patch makes it so limbo, our starting threads' waiting room, is checked first.
msg276331 - (view) Author: Dima Tisnek (Dima.Tisnek) * Date: 2016-09-13 19:24
@rooter, if you go this way, you should also `self._started.set()` with lock held, together with removing thread from `_limbo`
msg276333 - (view) Author: Maciej Urbański (rooter) * Date: 2016-09-13 19:40
@Dima.Tisnek, only reason for having both of these conditions together is so I won't have to repeat the same error message in the code at little price of the performance in this edge case (trying to start the same thread multiple times).

Unless I'm missing something there should be no way how it would make this lock required for setting `self._started`.
msg276337 - (view) Author: Dima Tisnek (Dima.Tisnek) * Date: 2016-09-13 20:31
Your logic is accurate; _started is in fact separate from _limbo.

As such taking a lock for "test-then-set" only would suffice.

Now when you bring the other primitive under this lock in one place, it would look cleaner if it was also brought in the other.



There's one more issue with proposed change:

Before the change, if "already started" exception is ever raised for a given Thread object, then it's guaranteed that that object was started successfully.

With the change, it is possible that exception is raised, and thread fails to start, leaving the object in initial state.


If it were up to me, I would buy this limitation as price of safety.
Someone may disagree.
msg276605 - (view) Author: Maciej Urbański (rooter) * Date: 2016-09-15 19:35
To address @Dima.Tisnek concern I have changed exception message in case thread start process is merely in progress.

I kept `self._started` check under a lock so we can avoid more extreme race condition of one thread checking `self._started` right before another sets it and exits the limbo.

As for testing `self._started` under a lock, but setting it without one. I'm avoiding it only because of performance reasons. The read is super cheap, while notification of `.set()` is more complex, so if aesthetics are only reasons for doing it there then I would advise against holding that lock while executing it.

Of course I could also do a `self in _active` check under a lock, but that is slightly more costly, than `self._started` check and not any more useful.

I may be prematurely optimizing things, but people tend to like they threads to startup ASAP.
History
Date User Action Args
2022-04-11 14:58:35adminsetgithub: 72095
2016-09-15 19:35:06rootersetfiles: + issue27908_2.patch

messages: + msg276605
2016-09-13 20:31:39Dima.Tisneksetmessages: + msg276337
2016-09-13 19:54:35brett.cannonsetnosy: - brett.cannon

versions: + Python 3.7
2016-09-13 19:54:06brett.cannonsetcomponents: - Extension Modules
2016-09-13 19:40:29rootersetmessages: + msg276333
2016-09-13 19:24:44Dima.Tisneksetmessages: + msg276331
2016-09-13 18:58:24rootersetfiles: + issue27908.patch
versions: + Python 2.7, Python 3.5
nosy: + rooter

messages: + msg276325

keywords: + patch
2016-08-31 08:54:09pitrousetnosy: + tim.peters
2016-08-31 08:05:15SilentGhostsetnosy: + brett.cannon
type: behavior
components: + Extension Modules
2016-08-31 07:27:38Dima.Tisnekcreate