aclover
2020-03-04
Since bpo-15038, waiting to acquire locks/events/etc. from _thread/threading on Windows can block long past the requested timeout. The cause is the use of 32-bit GetTickCount/DWORD, which wraps around at about 49.7 days of uptime (2^32 milliseconds).

If the WaitForSingleObjectEx call in PyCOND_TIMEDWAIT returns later than the 'target' time, and the tick count overflows in that gap, 'milliseconds' will become very large (up to another 49.7 days) and the next PyCOND_TIMEDWAIT will be stuck for a long time.
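
To illustrate the failure, here is a minimal, portable sketch of the remaining-time arithmetic. It assumes the wait loop recomputes 'milliseconds' as target minus now after each wakeup; remaining_ms_32 is a hypothetical helper written for this comment, not the actual CPython code, and uint32_t stands in for DWORD.

```c
#include <stdint.h>

/* Hypothetical model of the retry arithmetic with a 32-bit tick counter.
 * 'target' is the tick value at which the wait should end and 'now' is the
 * current tick value. Returns the number of milliseconds still to wait,
 * or 0 if the deadline is considered reached. */
static uint32_t remaining_ms_32(uint32_t target, uint32_t now)
{
    if (target <= now) {
        /* Deadline reached (assuming the counter has not wrapped). */
        return 0;
    }
    /* If the counter wrapped between 'target' and 'now', 'now' is a small
     * number again, the check above fails, and this difference is huge. */
    return target - now;
}
```

With target = 0xFFFFFF00 and a late wakeup after the wrap at now = 0x00000100, the helper returns 0xFFFFFE00, i.e. just under 49.7 days of further waiting, even though the deadline passed 512 ms earlier.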

We have seen this in the case where it is most likely to happen: when the machine is hibernated during the WaitForSingleObjectEx call. I believe the tick count continues to increase during hibernation, so there is a much bigger gap between 'target' and 'now' in which the overflow can occur.

The simplest fix is probably to switch to GetTickCount64/ULONGLONG. We should be able to get away with this now that we no longer support WinXP.
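
A sketch of why widening the counter helps, under the same assumption about how the remaining time is recomputed. remaining_ms_64 is a hypothetical helper and uint64_t stands in for ULONGLONG; on Windows the tick values would come from GetTickCount64().

```c
#include <stdint.h>

/* Hypothetical model of the retry arithmetic with a 64-bit tick counter.
 * A 64-bit millisecond counter does not wrap in any realistic uptime
 * (2^64 ms is roughly 584 million years), so a late wakeup simply
 * compares as "deadline reached". */
static uint64_t remaining_ms_64(uint64_t target, uint64_t now)
{
    if (target <= now) {
        return 0;
    }
    return target - now;
}
```

With the same scenario as the 32-bit case, target = 0xFFFFFF00 and a wakeup 512 ms late, 'now' is 0x100000100 rather than 0x100, so the helper returns 0 and the wait ends.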
