Message366825
> I would expect that the negative impact on branch predictability would easily outweigh the cost of the memory write (A guaranteed L1 hit)
If that were true then Spectre and Meltdown wouldn't have been so interesting :)
Pipelined processors are going to speculatively execute past the branch, and skipping the write is much quicker than performing it. Meanwhile, nothing should have tried to read the value, so that path hasn't had to block. I'm not aware of any processor that detects no-op writes and skips synchronising across cores; the dirty bit of the cache line is just set unconditionally.
Benchmarking already showed that the branching version is faster. It's possible that "refcount += (refcount & IMMORTAL) ? 0 : 1" could generate different code (should be mov,test,lea,cmovz rather than mov,and,add,mov or mov,and,jz,add,mov), but it's totally reasonable for a branch to be faster than unconditionally modifying memory.
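For reference, here is a rough sketch of the increment forms being compared. The `IMMORTAL` bit position, the `uint64_t` refcount type, and the function names are assumptions made for illustration only; they are not CPython's actual layout or API. Which instruction sequence each form compiles to depends on the compiler and flags, so the generated assembly should be inspected rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical immortal flag; the bit position is an assumption,
   not the value used by any real CPython build. */
#define IMMORTAL ((uint64_t)1 << 62)

/* Branching form: an immortal object's refcount is never written. */
static void incref_branch(uint64_t *refcount)
{
    if (!(*refcount & IMMORTAL)) {
        *refcount += 1;
    }
}

/* Conditional-select form from the message: the store is unconditional
   at the source level and adds 0 for immortal objects. */
static void incref_select(uint64_t *refcount)
{
    *refcount += (*refcount & IMMORTAL) ? 0 : 1;
}

/* Arithmetic form: adds the 0/1 result of the comparison directly. */
static void incref_arith(uint64_t *refcount)
{
    *refcount += ((*refcount & IMMORTAL) == 0);
}

/* Sanity check: all three forms must produce the same refcount. */
static void check_variants_agree(uint64_t start)
{
    uint64_t a = start, b = start, c = start;
    incref_branch(&a);
    incref_select(&b);
    incref_arith(&c);
    assert(a == b && b == c);
}
```

All three forms are equivalent in result; the difference under discussion is only whether memory is written when the object is immortal, and a compiler is free to turn any of them into a branch or a conditional move.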
Date | User | Action | Args
2020-04-20 13:33:01 | steve.dower | set | recipients: + steve.dower, tim.peters, nascheme, gregory.p.smith, pitrou, vstinner, carljm, dino.viehland, Mark.Shannon, corona10, pablogsal, eelizondo, shihai1991
2020-04-20 13:33:01 | steve.dower | set | messageid: <1587389581.51.0.25436145899.issue40255@roundup.psfhosted.org>
2020-04-20 13:33:01 | steve.dower | link | issue40255 messages
2020-04-20 13:33:01 | steve.dower | create |