Re: Non-reproducible AIO failure

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: Non-reproducible AIO failure
Дата
Msg-id CA+hUKGKpLbWZT2DDy=hP4fX_c9X88KRV=AwNb5UzLs-xsgNdRg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Non-reproducible AIO failure  (Konstantin Knizhnik <knizhnik@garret.ru>)
Ответы Re: Non-reproducible AIO failure
Список pgsql-hackers
On Mon, Aug 25, 2025 at 6:11 AM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
> In theory even replacing bitfield with in should not
> avoid race condition, because they are still shared the same cache line.

I'm no expert in this stuff, but that's not my understanding of how it
works.  Plain stores to normal memory go into the store buffer and are
eventually flushed to the memory hierarchy, but all modifications that
reach the cache hierarchy have a consistent view of memory created by
the cache coherency protocol (in ARM's case MOESI[1]): only one core
can change a cache line at a time while it has exclusive access (with
some optimisations, owner mode, snooping, etc but AFAIK that doesn't
change the basic consistency).  While modifying memory contents, the
core has the most up to date contents of the cache line and it doesn't
touch any other bytes.  What I mean is that cores don't make changes
to bytes that you didn't explicitly change with a store instruction,
it's just the order and timing that the contents of the store buffer
hits the memory system that is in question.  So I guess the key thing
here is that there is more than one store.

If the Apple's fork of LLVM 17 generated the same code as the MacPorts
LLVM 17 output I disassembled above, then I guess this double store
with the Apple toolchain until recently, but if you upgraded Xcode in
the past few months[2] then it would have stopped happening, which is
probably why Alexander had to use Homebrew's clang 17 to show the
problem.  (For the record, what I disassembled earlier was from
MacPorts's package for clang 17.0.6, so I guess it's remotely possible
that some difference between Homebrew's package and MacPorts' package
is causing my attempts to reproduce it to fail; it claims to be the
same version though and I don't expect them to patch anything.).

I don't understand these things, but maybe this commit or something
near it caused the change:


https://github.com/llvm/llvm-project/blob/b64499c6ef73881378b7e767b31c0fa06e3fe7b4/llvm/lib/Transforms/Scalar/EarlyCSE.cpp

> I tried to create small test reproducing this data flow.
> But still failed to find and problem with this test.
> May be somebody can change this test to be more similar with real AIO
> dataflows?

(lldb) dis --name update_state
a.out`update_state:
a.out[0x100000548] <+0>:  sub    sp, sp, #0x10
a.out[0x10000054c] <+4>:  str    x0, [sp, #0x8]
a.out[0x100000550] <+8>:  str    w1, [sp, #0x4]
a.out[0x100000554] <+12>: dmb    ish
a.out[0x100000558] <+16>: ldrb   w10, [sp, #0x4]
a.out[0x10000055c] <+20>: ldr    x9, [sp, #0x8]
a.out[0x100000560] <+24>: ldr    w8, [x9]
a.out[0x100000564] <+28>: and    w8, w8, #0xffffff00
a.out[0x100000568] <+32>: orr    w8, w8, w10
a.out[0x10000056c] <+36>: str    w8, [x9]           <--- single store
a.out[0x100000570] <+40>: add    sp, sp, #0x10
a.out[0x100000574] <+44>: ret

I think you need to add the following uint8 member to the struct, so
that it has to store 24 bits with two stores (assuming you are testing
with LLVM 17 and not LLVM 19 which chooses to clobber only the first
16 bits anyway for some reason).  In that code it's storing 32 bits at
once so although it's still clobbering op, it's doing it atomically so
nothing goes wrong.

If that doesn't work, maybe try adding a random number of other
stores, to try to find the timing that overflows the store queue
between the two stores, or maybe it has to do with delivering signals
or context switches that cause memory barriers, or who knows...  Hmm,
yeah maybe it's actually context switches that are doing this, as
Alexander is running a huge number of processes.

[1] https://en.wikipedia.org/wiki/MOESI_protocol
[2] https://bfbot.cputube.org/project/apple_xcode/



В списке pgsql-hackers по дате отправления: