Message 103310 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	dabeaz
Recipients	beazley, dabeaz, flox, kristjan.jonsson, loewis, pitrou, r.david.murray, techtonik, torsten
Date	2010-04-16.10:46:41
SpamBayes Score	0.0
Marked as misclassified	No
Message-id	<1271414804.99.0.28366188812.issue8299@psf.upfronthosting.co.za>
In-reply-to

Content
I'm sorry, but even in the presence of fair locking, I still don't like this patch. The main problem is that it confuses fair locking with fair CPU use---something that this patch does not and can not achieve on any platform. The main problem is that everything is still based on the execution of interpreter ticks. However, interpreter ticks have wildly varying execution times dependent upon the code that's running. Thus, executing 1000 ticks might take significantly longer in one thread than another. Under a FIFO scheduler based on "fair" locking, the thread with the longer-running ticks is going to unfairly hog the GIL and the CPU. For example, if thread 1 takes 95 usec to execute 1000 ticks and thread 2 takes 5 usec to execute 1000 ticks, then thread 1 is going to end up hogging about 95% of the CPU cycles, starving thread 2. To me, that doesn't sound especially "fair." It would be much better to have fairness where threads are guaranteed to get an equal time slice of CPU cycles regardless of how many ticks they're executing. In other words, it would be much better if the two threads above each got 50% of the CPU cycles. The only way you're ever going to be able to do that is to base thread scheduling on timing. The new GIL in Python 3 makes an effort to do this even though some issues are still being worked out with it. On a slightly unrelated note, I just tried some experiments on Linux with the GIL implemented as condition variables and with semaphores. I honestly didn't see any noticeable performance difference between the two versions. I also didn't see any kind of purported "fair" scheduling of threads using the semaphore version. Both versions exhibit the same performance problems as described in my GIL talk (albeit not to the same extreme as on OS-X). Based on my own reading of the pthreads source code (yes, I have looked), I can't really draw any conclusion about the fairness of semaphores. Under the covers, it's all based on futex locks (the "f" in futex referring to "fast", not "fair" by the way). I know that the original paper on futexes has some experiments with fair lock scheduling, but I honestly don't know if that is being used in the Linux kernel or by pthreads. My understanding is that by default, futexes do not guarantee fairness. To know for certain with semaphores, much more low-level investigation would be required.

I'm sorry, but even in the presence of fair locking, I still don't like this patch. The main problem is that it confuses fair locking with fair CPU use---something that this patch does not and can not achieve on any platform.

The main problem is that everything is still based on the execution of interpreter ticks. However, interpreter ticks have wildly varying execution times dependent upon the code that's running. Thus, executing 1000 ticks might take significantly longer in one thread than another. Under a FIFO scheduler based on "fair" locking, the thread with the longer-running ticks is going to unfairly hog the GIL and the CPU. For example, if thread 1 takes 95 usec to execute 1000 ticks and thread 2 takes 5 usec to execute 1000 ticks, then thread 1 is going to end up hogging about 95% of the CPU cycles, starving thread 2. To me, that doesn't sound especially "fair."

It would be much better to have fairness where threads are guaranteed to get an equal time slice of CPU cycles regardless of how many ticks they're executing. In other words, it would be much better if the two threads above each got 50% of the CPU cycles. The only way you're ever going to be able to do that is to base thread scheduling on timing. The new GIL in Python 3 makes an effort to do this even though some issues are still being worked out with it.

On a slightly unrelated note, I just tried some experiments on Linux with the GIL implemented as condition variables and with semaphores. I honestly didn't see any noticeable performance difference between the two versions. I also didn't see any kind of purported "fair" scheduling of threads using the semaphore version. Both versions exhibit the same performance problems as described in my GIL talk (albeit not to the same extreme as on OS-X). Based on my own reading of the pthreads source code (yes, I have looked), I can't really draw any conclusion about the fairness of semaphores. Under the covers, it's all based on futex locks (the "f" in futex referring to "fast", not "fair" by the way). I know that the original paper on futexes has some experiments with fair lock scheduling, but I honestly don't know if that is being used in the Linux kernel or by pthreads. My understanding is that by default, futexes do not guarantee fairness. To know for certain with semaphores, much more low-level investigation would be required.

History
Date	User	Action	Args
2010-04-16 10:46:45	dabeaz	set	recipients: + dabeaz, loewis, beazley, pitrou, kristjan.jonsson, techtonik, r.david.murray, flox, torsten
2010-04-16 10:46:44	dabeaz	set	messageid: <1271414804.99.0.28366188812.issue8299@psf.upfronthosting.co.za>
2010-04-16 10:46:42	dabeaz	link	issue8299 messages
2010-04-16 10:46:41	dabeaz	create