Re: Non-reproducible AIO failure
От | Nico Williams |
---|---|
Тема | Re: Non-reproducible AIO failure |
Дата | |
Msg-id | aKujZZxougyvtZ2U@ubby обсуждение исходный текст |
Ответ на | Re: Non-reproducible AIO failure (Thomas Munro <thomas.munro@gmail.com>) |
Ответы |
Re: Non-reproducible AIO failure
|
Список | pgsql-hackers |
On Mon, Aug 25, 2025 at 10:43:21AM +1200, Thomas Munro wrote: > On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik <knizhnik@garret.ru> wrote: > > In theory even replacing bitfield with in should not > > avoid race condition, because they are still shared the same cache line. > > I'm no expert in this stuff, but that's not my understanding of how it > works. Plain stores to normal memory go into the store buffer and are > eventually flushed to the memory hierarchy, but all modifications that > reach the cache hierarchy have a consistent view of memory created by > the cache coherency protocol (in ARM's case MOESI[1]): only one core > can change a cache line at a time while it has exclusive access (with > some optimisations, owner mode, snooping, etc but AFAIK that doesn't > change the basic consistency). While modifying memory contents, the > core has the most up to date contents of the cache line and it doesn't > touch any other bytes. What I mean is that cores don't make changes > to bytes that you didn't explicitly change with a store instruction, > it's just the order and timing that the contents of the store buffer > hits the memory system that is in question. So I guess the key thing > here is that there is more than one store. I think the issue is that if the compiler decides to coalesce what we think of as distinct (but neighboring) bitfields, then when you update one of the bitfields you could be updating the other with stale data from an earlier read where the cached stale data is cached in a _register_. Thus the fact that the cache line should have the most up to date data for that other field is irrelevant because the stale data is in a _register_. The very fact that this can happen, that the C specs allow it, argues that one must never have adjacent distinct (for some value of "distinct") bitfields for anything that requires atomics. IIRC we're talking about this: ``` struct PgAioHandle { /* all state updates should go through pgaio_io_update_state() */ PgAioHandleState state:8; /* what are we operating on */ PgAioTargetID target:8; /* which IO operation */ PgAioOp op:8; /* bitfield of PgAioHandleFlags */ uint8 flags; ``` To us humans each of the first three fields are distinct because sure they are bitfields, and bitfields can be coalesced, but their _types_ are different. But from the compiler's perspective the types are all enums with small positive values, therefore they might as well all be `int`, like this: ``` struct PgAioHandle { /* all state updates should go through pgaio_io_update_state() */ int state:8; /* what are we operating on */ int target:8; /* which IO operation */ int op:8; /* bitfield of PgAioHandleFlags */ uint8 flags; ``` and now you can see that a write to any one of those three bytes can also be a write to either or both of the other two. Although... it seems unlikely that a write to `state` could write to `op` given that a 32-bit write would also write `flags`, and `flags` is not a bitfields (though its contents should be thought of as a bitfield). It's also hard to see how `op` gets written to when writing to `target` either because that crosses a 16-bit boundary, and again a 32-bit write would be wrong. So even if the compiler could merge these three bitfields it should not be able to ever write to `op` when writing to the other two unless there were a compiler bug that also caused the compiler to coalesce `flags` into these three bitfields. The fact that a compiler can do this is... well, I think to some degree desirable, but in this case horrifying. BTW, I _think_ you can keep the adjacent enum bitfields and prevent this with this [slightly ugly?] hack: ``` struct PgAioHandle { /* all state updates should go through pgaio_io_update_state() */ struct { PgAioHandleState v:8; } state; /* what are we operating on */ struct { PgAioTargetID v:8; } target; /* which IO operation */ struct { PgAioOp v:8; } op; /* bitfield of PgAioHandleFlags */ uint8 flags; ``` because wrapping each bitfield in a struct makes it so the compiler doesn't see each of these adjacent fields of PgAioHandle as being bitfields when coding accesses to them. Nico --
В списке pgsql-hackers по дате отправления: