On Jul 28, 2011, at 04:51, Robert Haas wrote:
> One fly in the ointment is that 8-byte
> stores are apparently done as two 4-byte stores on some platforms.
> But if the counter runs backward, I think even that is OK. If you
> happen to read an 8 byte value as it's being written, you'll get 4
> bytes of the intended value and 4 bytes of zeros. The value will
> therefore appear to be less than what it should be. However, if the
> value was in the midst of being written, then it's still in the midst
> of committing, which means that that XID wasn't going to be visible
> anyway. Accidentally reading a smaller value doesn't change the
> answer.
That only works if the update of the most-significant word is guaranteed
to be visible before the update of the least-significant one, which
I think you can only enforce if you update the words individually
(and use a fence on e.g. PPC32). Otherwise you're at the mercy of the
compiler.
Without that guarantee, the following might happen. To keep the numbers
small, I'm using a 2-byte value instead of an 8-byte one and assuming that
1-byte stores are atomic while 2-byte ones aren't. The machine is assumed
to be big-endian.
The counter is at 0xff00.
Backend 1 decrements, i.e. does
(1) STORE [counter+1], 0xff
(2) STORE [counter], 0xfe
Backend 2 reads
(1') LOAD [counter+1]
(2') LOAD [counter]
If the sequence of events is (1), (1'), (2'), (2), backend 2 will read
0xffff, which is higher than both the old value 0xff00 and the new value
0xfeff.
But we could simply use a spinlock to protect the read on machines where
we don't know for sure that 64-bit reads and writes are atomic. That'll
only really hurt on machines with 16+ cores or so, and the number of
architectures which support that isn't that high anyway. If we supported
spinlock-less operation on SPARC, x86-64, PPC64 and maybe Itanium, would we
miss any important one?
best regards,
Florian Pflug