Ah, cunning: I can make sense of it in hex.

>>> hex(to_ulps(expected))
>>> hex(to_ulps(got))
>>> hex(to_ulps(got) - to_ulps(expected))

... and what you've done with ulp then follows.

In my version a format like "{:d} ulps" was a bad idea when the error was a gross one, but your to_ulps is only piecewise linear, so large differences are compressed.
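
To make the "piecewise linear" point concrete, here is a minimal sketch of how I understand it. It assumes to_ulps reinterprets the IEEE 754 double as a signed 64-bit integer and flips negative values so that adjacent floats map to adjacent integers. The name to_ulps_sketch is my own stand-in, not the function under review.

```python
import struct

def to_ulps_sketch(x):
    """Hypothetical stand-in for to_ulps: map a non-NaN float to an
    integer so that adjacent floats map to adjacent integers."""
    # Reinterpret the 8 bytes of the double as a signed 64-bit integer.
    n = struct.unpack('<q', struct.pack('<d', x))[0]
    # Negative floats have the sign bit set; flip them so the ordering
    # of the integers matches the ordering of the floats.
    if n < 0:
        n = ~(n + 2**63)
    return n

# In hex, the top three digits carry sign and exponent, the remaining
# thirteen carry the fraction bits.
print(hex(to_ulps_sketch(1.0)))  # 0x3ff0000000000000
print(hex(to_ulps_sketch(2.0)))  # 0x4000000000000000

# Within one binade the mapping is linear: the next float up is 1 away.
print(to_ulps_sketch(1.0 + 2**-52) - to_ulps_sketch(1.0))  # 1

# Across binades every doubling costs the same 2**52 steps, which is
# why gross errors come out compressed.
print(hex(to_ulps_sketch(2.0) - to_ulps_sketch(1.0)))  # 0x10000000000000
print(hex(to_ulps_sketch(4.0) - to_ulps_sketch(2.0)))  # 0x10000000000000
```

If that reading is right, a gross error still prints as a number of manageable size, since the difference between any two finite doubles fits in 64 bits.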

I'm pleased my work has mostly survived: here's hoping the house build-bots agree. erfc() is perhaps the last worry, but math & cmath pass on my machine.
