Author steve.dower
Recipients Mark.Shannon, carljm, corona10, dino.viehland, eelizondo, gregory.p.smith, nascheme, pablogsal, pitrou, shihai1991, steve.dower, tim.peters, vstinner
Date 2020-04-20.13:33:01
Message-id <1587389581.51.0.25436145899.issue40255@roundup.psfhosted.org>
In-reply-to
Content
> I would expect that the negative impact on branch predictability would easily outweigh the cost of the memory write (A guaranteed L1 hit)

If that were true then Spectre and Meltdown wouldn't have been so interesting :)

Pipelined processors are going to speculatively execute both paths, and skipping the write completes much more quickly than performing it. Meanwhile, nothing should have tried to read the value, so that path hasn't had to block. I'm not aware of any processors that detect no-op writes and skip synchronising across cores; the dirty bit of the cache line is just set unconditionally.

Benchmarking already showed that the branching version is faster. It's possible that "refcount += (refcount & IMMORTAL) ? 0 : 1" could generate different code (should be mov,test,lea,cmovz rather than mov,and,add,mov or mov,and,jz,add,mov), but it's totally reasonable for a branch to be faster than unconditionally modifying memory.
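For comparison, here is a minimal sketch of the three increment forms discussed above. The `IMMORTAL` bit position, the `uintptr_t` refcount type, and all function names are assumptions made for illustration and are not taken from the actual patch. Which instruction sequence a compiler emits for each form depends on the compiler and flags, so the generated assembly should be inspected rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit marking an object as immortal; the real patch
   may use a different bit or a different refcount type. */
#define IMMORTAL ((uintptr_t)1 << (sizeof(uintptr_t) * 8 - 2))

/* Unconditional form: always performs a read-modify-write on memory. */
static inline void incref_unconditional(uintptr_t *refcount) {
    *refcount += 1;
}

/* Branching form: no store at all when the immortal bit is set. */
static inline void incref_branch(uintptr_t *refcount) {
    if (!(*refcount & IMMORTAL)) {
        *refcount += 1;
    }
}

/* Select form from the message: always stores, but the added value is
   0 for immortal objects. A compiler may or may not turn this into a
   conditional move. */
static inline void incref_select(uintptr_t *refcount) {
    *refcount += (*refcount & IMMORTAL) ? 0 : 1;
}

/* Returns 1 if the branching and select forms give the same result
   when started from the same refcount value. */
static int variants_agree(uintptr_t start) {
    uintptr_t a = start;
    uintptr_t b = start;
    incref_branch(&a);
    incref_select(&b);
    return a == b;
}

/* Small self-check: mortal counts go up, immortal counts are untouched. */
static void self_check(void) {
    uintptr_t mortal = 1;
    uintptr_t immortal = IMMORTAL | 5;
    incref_branch(&mortal);
    incref_branch(&immortal);
    assert(mortal == 2);
    assert(immortal == (IMMORTAL | 5));
    assert(variants_agree(1));
    assert(variants_agree(IMMORTAL | 5));
}
```

The branching and select forms produce the same final value and differ only in whether a store is issued for immortal objects, which is the cost being argued about here.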