Thread: IO in wrong state on riscv64

IO in wrong state on riscv64

From:
Alexander Lakhin
Date:
Hello Andres,

I've spotted one more interesting AIO-related failure on a riscv64 machine
[1]:
=== dumping /home/gburd/build/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs ===
diff -U3 /home/gburd/build/HEAD/pgsql/src/test/regress/expected/tsearch.out 
/home/gburd/build/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/tsearch.out
--- /home/gburd/build/HEAD/pgsql/src/test/regress/expected/tsearch.out 2025-10-09 00:06:10.612959611 +0000
+++ /home/gburd/build/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/tsearch.out 2025-10-09 
00:27:46.697588198 +0000
@@ -163,11 +163,7 @@
  (1 row)

  SELECT count(*) FROM test_tsvector WHERE a @@ '!qe <2> qt';
- count
--------
-     6
-(1 row)
-
+ERROR:  IO in wrong state: 0
  SELECT count(*) FROM test_tsvector WHERE a @@ '!(pl <-> yh)';
   count
  -------

I've managed to reproduce it using qemu-system-riscv64 with Debian trixie
(6.11.10-riscv64, clang 19.1.4):
$ CC=clang-19 ./configure --enable-debug --enable-cassert --enable-tap-tests && time make -j8 -s
...
configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla 
-Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security 
-Wmissing-variable-declarations -fno-strict-aliasing -fwrapv -fexcess-precision=standard 
-Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-format-truncation 
-Wno-cast-function-type-strict -g -O2

when running `make check` in a loop (it fails within 10 iterations for me)
with the reduced parallel_schedule:
test: test_setup

test: boolean char name varchar text int2 int4 int8 oid float8
x10

and extra.config:
log_line_prefix = '%m [%p:%l] %q%a '
log_connections = 'true'
log_disconnections = 'true'
log_statement = 'all'
fsync = off
restart_after_crash = off

debug_parallel_query = regress

shared_buffers =  512kB

backtrace_functions = 'pgaio_io_wait'
# mostly reflects greenfly's config, plus shared_buffers decreased
# below 027_stream_regress's 1MB.

With the following debug logging added:
--- a/src/backend/storage/aio/aio.c
+++ b/src/backend/storage/aio/aio.c
@@ -578,6 +578,20 @@ pgaio_io_wait(PgAioHandle *ioh, uint64 ref_generation)
         PgAioHandleState state;
         bool            am_owner;

+int inistate = ioh->state;
+if (inistate == PGAIO_HS_IDLE)
+{
+  fprintf(stderr, "!!!pgaio_io_wait[%d]| initial ioh->state: %d\n", getpid(), inistate);
+}
+for (int i = 0; i < 10; i++)
+{
+  int curstate = ioh->state;
+  if (curstate != inistate)
+  {
+    elog(PANIC, "!!!pgaio_io_wait| ioh->state changed from %d to %d at iteration %d", inistate, curstate, i);
+    break;
+  }
+}
         am_owner = ioh->owner_procno == MyProcNumber;

         if (pgaio_io_was_recycled(ioh, ref_generation, &state))
@@ -599,6 +613,10 @@ pgaio_io_wait(PgAioHandle *ioh, uint64 ref_generation)
         {
                 if (pgaio_io_was_recycled(ioh, ref_generation, &state))
                         return;
+if (ioh->state == PGAIO_HS_IDLE)
+{
+  fprintf(stderr, "!!!pgaio_io_wait[%d]| ioh->state: %d (initial state: %d)\n", getpid(), ioh->state, inistate);
+}

                 switch ((PgAioHandleState) state)
                 {

I'm observing several cases:
1)
2025-10-11 11:45:17.947 UTC [1191355:258] pg_regress/float8 LOG: statement: SELECT x,
                sind(x),
                sind(x) IN (-1,-0.5,0,0.5,1) AS sind_exact
         FROM (VALUES (0), (30), (90), (150), (180),
               (210), (270), (330), (360)) AS t(x);
2025-10-11 11:45:17.947 UTC [1191359:206] pg_regress/int8 LOG: statement: SELECT * FROM 
generate_series('+4567890123456789'::int8, '+4567890123456799'::int8);
!!!pgaio_io_wait[1191744]| ioh->state: 0 (initial state: 5)
2025-10-11 11:45:17.993 GMT [1191744:1] ERROR:  IO in wrong state: 0
2025-10-11 11:45:17.993 GMT [1191744:2] BACKTRACE:
         postgres: parallel worker for PID 1191355 (+0x4372a0) [0x555561b8a2a0]
         postgres: parallel worker for PID 1191355 (+0x442c66) [0x555561b95c66]
         postgres: parallel worker for PID 1191355 (StartBufferIO+0x66) [0x555561b95ac4]
         postgres: parallel worker for PID 1191355 (+0x43e014) [0x555561b91014]
         postgres: parallel worker for PID 1191355 (StartReadBuffer+0x308) [0x555561b90958]
...
         postgres: parallel worker for PID 1191355 (+0x3194b2) [0x555561a6c4b2]
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffb9ef791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fffb9ef79c4]
         postgres: parallel worker for PID 1191355 (_start+0x20) [0x5555618382c8]
...
2025-10-11 11:45:17.999 UTC [1191355:259] pg_regress/float8 ERROR: IO in wrong state: 0
2025-10-11 11:45:17.999 UTC [1191355:260] pg_regress/float8 BACKTRACE:
         postgres: debian regression [local] SELECT(ProcessParallelMessages+0x114) [0x5555618dca6c]
         postgres: debian regression [local] SELECT(ProcessInterrupts+0x340) [0x555561bce864]
         postgres: debian regression [local] SELECT(+0x2e0e9c) [0x555561a33e9c]
         postgres: debian regression [local] SELECT(standard_ExecutorRun+0x10c) [0x555561a19c20]
...
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffb9ef791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fffb9ef79c4]
         postgres: debian regression [local] SELECT(_start+0x20) [0x5555618382c8]
2025-10-11 11:45:17.999 UTC [1191355:261] pg_regress/float8 STATEMENT:  SELECT x,
                sind(x),
                sind(x) IN (-1,-0.5,0,0.5,1) AS sind_exact
         FROM (VALUES (0), (30), (90), (150), (180),
               (210), (270), (330), (360)) AS t(x);
2025-10-11 11:45:17.999 UTC [1189893:98] LOG:  background worker "parallel worker" (PID 1191744) exited with exit code
1

2)
!!!pgaio_io_wait[1353521]| ioh->state: 0 (initial state: 4)
...
2025-10-11 12:48:00.503 UTC [1353521:1] ERROR:  IO in wrong state: 0
2025-10-11 12:48:00.503 UTC [1353521:2] BACKTRACE:
         postgres: parallel worker for PID 1353449 (+0x4372a0) [0x5555905562a0]
         postgres: parallel worker for PID 1353449 (+0x442c66) [0x555590561c66]
         postgres: parallel worker for PID 1353449 (StartBufferIO+0x66) [0x555590561ac4]
         postgres: parallel worker for PID 1353449 (+0x43e014) [0x55559055d014]
         postgres: parallel worker for PID 1353449 (StartReadBuffer+0x308) [0x55559055c958]
...
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffaacf791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fffaacf79c4]
         postgres: parallel worker for PID 1353449 (_start+0x20) [0x5555902042c8]
2025-10-11 12:48:00.503 UTC [1353521:3] STATEMENT:  SELECT c.f1 FROM NAME_TBL c WHERE c.f1 ~ '[0-9]';
...
2025-10-11 12:48:00.512 UTC [1352470:51] LOG:  background worker "parallel worker" (PID 1353521) exited with exit code
1
2025-10-11 12:48:00.513 UTC [1353449:24] pg_regress/name ERROR:  IO in wrong state: 0
2025-10-11 12:48:00.513 UTC [1353449:25] pg_regress/name BACKTRACE:
         postgres: debian regression [local] SELECT(ProcessParallelMessages+0x114) [0x5555902a8a6c]
         postgres: debian regression [local] SELECT(ProcessInterrupts+0x340) [0x55559059a864]
         postgres: debian regression [local] SELECT(+0x2e0e9c) [0x5555903ffe9c]
         postgres: debian regression [local] SELECT(standard_ExecutorRun+0x10c) [0x5555903e5c20]
         postgres: debian regression [local] SELECT(+0x481a30) [0x5555905a0a30]
         postgres: debian regression [local] SELECT(PortalRun+0x180) [0x5555905a06e6]
         postgres: debian regression [local] SELECT(+0x480880) [0x55559059f880]
         postgres: debian regression [local] SELECT(PostgresMain+0xdfc) [0x55559059d7d8]
         postgres: debian regression [local] SELECT(+0x4793f0) [0x5555905983f0]
         postgres: debian regression [local] SELECT(postmaster_child_launch+0x132) [0x5555904f787c]
         postgres: debian regression [local] SELECT(+0x3dcd52) [0x5555904fbd52]
         postgres: debian regression [local] SELECT(InitProcessGlobals+0) [0x5555904f9c96]
         postgres: debian regression [local] SELECT(+0x3194b2) [0x5555904384b2]
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffaacf791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fffaacf79c4]
         postgres: debian regression [local] SELECT(_start+0x20) [0x5555902042c8]
2025-10-11 12:48:00.513 UTC [1353449:26] pg_regress/name STATEMENT: SELECT c.f1 FROM NAME_TBL c WHERE c.f1 ~ '[0-9]';

I also observed an authentication failure caused by this error.

3)
!!!pgaio_io_wait[1169773]| initial ioh->state: 0

2025-10-11 11:34:46.793 GMT [1169773:1] PANIC:  !!!pgaio_io_wait| ioh->state changed from 0 to 1 at iteration 0
# no other iteration number observed
2025-10-11 11:34:46.793 GMT [1169773:2] BACKTRACE:
         postgres: parallel worker for PID 1169684 (+0x4372a0) [0x55555c3492a0]
         postgres: parallel worker for PID 1169684 (+0x442c66) [0x55555c354c66]
         postgres: parallel worker for PID 1169684 (StartBufferIO+0x66) [0x55555c354ac4]
         postgres: parallel worker for PID 1169684 (+0x43e014) [0x55555c350014]
         postgres: parallel worker for PID 1169684 (StartReadBuffer+0x308) [0x55555c34f958]
...
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fff8faf791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fff8faf79c4]
         postgres: parallel worker for PID 1169684 (_start+0x20) [0x55555bff72c8]

2025-10-11 11:34:46.804 UTC [1169684:93] pg_regress/text ERROR: !!!pgaio_io_wait| ioh->state changed from 0 to 1 at 
iteration 0
2025-10-11 11:34:46.804 UTC [1169684:94] pg_regress/text BACKTRACE:
         postgres: debian regression [local] SELECT(ProcessParallelMessages+0x114) [0x55555c09ba6c]
         postgres: debian regression [local] SELECT(ProcessInterrupts+0x340) [0x55555c38d864]
         postgres: debian regression [local] SELECT(+0x2e0e9c) [0x55555c1f2e9c]
         postgres: debian regression [local] SELECT(standard_ExecutorRun+0x10c) [0x55555c1d8c20]
...
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fff8faf791c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fff8faf79c4]
         postgres: debian regression [local] SELECT(_start+0x20) [0x55555bff72c8]
With CFLAGS="-O0", it fails much sooner...

The latter case is probably OK; I just wanted to check whether the field
is in fact being changed on the fly...


[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=greenfly&dt=2025-10-09%2000%3A06%3A03

Best regards,
Alexander



Re: IO in wrong state on riscv64

From:
Thomas Munro
Date:
On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
> 2025-10-11 11:34:46.793 GMT [1169773:1] PANIC:  !!!pgaio_io_wait| ioh->state changed from 0 to 1 at iteration 0
> # no other iteration number observed

Can you please disassemble pgaio_io_update_state() and
pgaio_io_was_recycled()?  I wonder if the memory barriers are not
being generated correctly, causing the state and generation to be
loaded out of order, or something like that...
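
For reference, one way to capture those non-interactively (assuming a build
with debug symbols; the binary path below is just a placeholder) would be
something like:

gdb -batch \
    -ex 'disassemble pgaio_io_update_state' \
    -ex 'disassemble pgaio_io_was_recycled' \
    /path/to/pgsql.build/src/backend/postgres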

The previous failure on greenfly was a TIMEOUT in the same test, as if
a query was hanging.

I've also been trying to understand two failures on CI on x86 with the
same symptom, but that might be something else, hard to say yet...

https://cirrus-ci.com/task/5551651043409920
https://www.postgresql.org/message-id/CA%2BhUKG%2BHh7u3YpPLKQS%2BB7KJu0Bw99EjhuPuvBRVccDfgod95g%40mail.gmail.com



Re: IO in wrong state on riscv64

From:
Thomas Munro
Date:
On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
> I've managed to reproduce it using qemu-system-riscv64 with Debian trixie

Huh, that's interesting.  What is the host architecture?  When I saw
that error myself and wondered about memory order, I dismissed the
idea of trying with qemu, figuring that my x86 host's TSO would affect
the coherency, but thinking again about that... I guess the compiler
might still reorder during riscv codegen if there is something wrong
with the barrier support, and even if it doesn't, the binary
translation to x86 might also feel free to reorder stuff if there are
no barrier instructions to prevent it?  Or maybe that doesn't happen
but your host is ARM?



Re: IO in wrong state on riscv64

From:
Alexander Lakhin
Date:
Hello Thomas,

12.10.2025 06:35, Thomas Munro wrote:
> On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>> 2025-10-11 11:34:46.793 GMT [1169773:1] PANIC:  !!!pgaio_io_wait| ioh->state changed from 0 to 1 at iteration 0
>> # no other iteration number observed
> Can you please disassemble pgaio_io_update_state() and
> pgaio_io_was_recycled()?  I wonder if the memory barriers are not
> being generated correctly, causing the state and generation to be
> loaded out of order, or something like that...

Please find those attached (gdb "disass/m pgaio_io_update_state" misses
the start of the function (but it's still disassembled below), so I
decided to share the whole output).
This is from clean master, without any modifications, but with the issue
confirmed for this build:
2025-10-11 20:29:11.724 UTC [1679534:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.725 UTC [1679536:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.726 UTC [1679538:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.724 UTC [1679533:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.724 UTC [1679537:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.724 UTC [1679535:1] [unknown] LOG:  connection received: host=[local]
2025-10-11 20:29:11.729 UTC [1679539:1] [unknown] LOG:  connection received: host=[local
...
2025-10-11 20:29:11.778 UTC [1679537:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_schema
2025-10-11 20:29:11.778 UTC [1679533:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_type
2025-10-11 20:29:11.778 UTC [1679539:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_procedure
2025-10-11 20:29:11.778 UTC [1679536:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_misc
2025-10-11 20:29:11.778 UTC [1679538:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_table
2025-10-11 20:29:11.778 UTC [1679535:3] [unknown] LOG:  connection authorized: user=debian database=regression 
application_name=pg_regress/create_function_c
2025-10-11 20:29:11.790 UTC [1679534:2] [unknown] FATAL:  IO in wrong state: 0
2025-10-11 20:29:11.790 UTC [1679534:3] [unknown] BACKTRACE:
         postgres: debian regression [local] authentication(+0x4371ec) [0x555565cd61ec]
         postgres: debian regression [local] authentication(+0x442bb2) [0x555565ce1bb2]
         postgres: debian regression [local] authentication(StartBufferIO+0x66) [0x555565ce1a10]
         postgres: debian regression [local] authentication(+0x43df60) [0x555565cdcf60]
         postgres: debian regression [local] authentication(StartReadBuffer+0x308) [0x555565cdc8a4]
         postgres: debian regression [local] authentication(ReadBufferExtended+0x64) [0x555565cdaace]
...
         postgres: debian regression [local] authentication(postmaster_child_launch+0x132) [0x555565c7787c]
         postgres: debian regression [local] authentication(+0x3dcd52) [0x555565c7bd52]
         postgres: debian regression [local] authentication(InitProcessGlobals+0) [0x555565c79c96]
         postgres: debian regression [local] authentication(+0x3194b2) [0x555565bb84b2]
         /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffa690891c]
         /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74) [0x7fffa69089c4]
         postgres: debian regression [local] authentication(_start+0x20) [0x5555659842c8]

> The previous failure on greenfly was a TIMEOUT in the same test, as if
> a query was hanging.

Yeah, I'll try to reproduce it too...

> On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>> I've managed to reproduce it using qemu-system-riscv64 with Debian trixie
> Huh, that's interesting.  What is the host architecture?  When I saw
> that error myself and wondered about memory order, I dismissed the
> idea of trying with qemu, figuring that my x86 host's TSO would affect
> the coherency, but thinking again about that... I guess the compiler
> might still reorder during riscv codegen if there is something wrong
> with the barrier support, and even if it doesn't, the binary
> translation to x86 might also feel free to reorder stuff if there are
> no barrier instructions to prevent it?  Or maybe that doesn't happen
> but your host is ARM?

I use an AMD Ryzen 9 7900X, Ubuntu 24.04, and run the RISC-V machine with:
qemu-system-riscv64 -machine virt -m 1G -smp 8 -cpu rv64 -device virtio-blk-device,drive=hd \
-drive file=.../dqib_riscv64-virt/image.qcow2,if=none,id=hd -device virtio-net-device,netdev=net \
-netdev user,id=net,hostfwd=tcp::22222-:22 -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
-kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf -object rng-random,filename=/dev/urandom,id=rng \
-device virtio-rng-device,rng=rng -nographic -append "root=LABEL=rootfs console=ttyS0"

I don't know what hardware greenfly uses (CC'ing Greg in case he'd like to
share some info on this), but timings are similar to what I'm seeing:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=greenfly&dt=2025-10-09%2000%3A06%3A03&stg=check
ok 160       - select_parallel                          5019 ms

`make check` on mine, with select_parallel repeated:
ok 160       - select_parallel                          6042 ms
ok 161       - select_parallel                          6089 ms
ok 162       - select_parallel                          6116 ms

copperhead and boomslang, which run on real RISC-V hardware [1], show
better timings:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=copperhead&dt=2025-10-11%2019%3A36%3A33&stg=check
ok 160       - select_parallel                          2677 ms

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=boomslang&dt=2025-10-12%2002%3A17%3A53&stg=check
ok 160       - select_parallel                          3651 ms

Thank you for looking into this!

[1] https://www.postgresql.org/message-id/3db97903-884f-4b0c-b1cd-d7442e71ea75%40app.fastmail.com

Best regards,
Alexander
Attachments

Re: IO in wrong state on riscv64

From:
Greg Burd
Date:
On Sun, Oct 12, 2025, at 1:00 AM, Alexander Lakhin wrote:
> Hello Thomas,
>
> 12.10.2025 06:35, Thomas Munro wrote:
>> On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin
>> <exclusion@gmail.com> wrote:
>>> 2025-10-11 11:34:46.793 GMT [1169773:1] PANIC:  !!!pgaio_io_wait|
>>> ioh->state changed from 0 to 1 at iteration 0
>>> # no other iteration number observed
>> Can you please disassemble pgaio_io_update_state() and
>> pgaio_io_was_recycled()?  I wonder if the memory barriers are not
>> being generated correctly, causing the state and generation to be
>> loaded out of order, or something like that...
>
> Please find those attached (gdb "disass/m pgaio_io_update_state" misses
> the start of the function (but it's still disassembled below), so I
> decided to share the whole output).
> This is from clean master, without any modifications, but with the issue
> confirmed for this build:
> 2025-10-11 20:29:11.724 UTC [1679534:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.725 UTC [1679536:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.726 UTC [1679538:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.724 UTC [1679533:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.724 UTC [1679537:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.724 UTC [1679535:1] [unknown] LOG:  connection
> received: host=[local]
> 2025-10-11 20:29:11.729 UTC [1679539:1] [unknown] LOG:  connection
> received: host=[local
> ...
> 2025-10-11 20:29:11.778 UTC [1679537:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_schema
> 2025-10-11 20:29:11.778 UTC [1679533:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_type
> 2025-10-11 20:29:11.778 UTC [1679539:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_procedure
> 2025-10-11 20:29:11.778 UTC [1679536:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_misc
> 2025-10-11 20:29:11.778 UTC [1679538:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_table
> 2025-10-11 20:29:11.778 UTC [1679535:3] [unknown] LOG:  connection
> authorized: user=debian database=regression
> application_name=pg_regress/create_function_c
> 2025-10-11 20:29:11.790 UTC [1679534:2] [unknown] FATAL:  IO in wrong
> state: 0
> 2025-10-11 20:29:11.790 UTC [1679534:3] [unknown] BACKTRACE:
>          postgres: debian regression [local] authentication(+0x4371ec)
> [0x555565cd61ec]
>          postgres: debian regression [local] authentication(+0x442bb2)
> [0x555565ce1bb2]
>          postgres: debian regression [local]
> authentication(StartBufferIO+0x66) [0x555565ce1a10]
>          postgres: debian regression [local] authentication(+0x43df60)
> [0x555565cdcf60]
>          postgres: debian regression [local]
> authentication(StartReadBuffer+0x308) [0x555565cdc8a4]
>          postgres: debian regression [local]
> authentication(ReadBufferExtended+0x64) [0x555565cdaace]
> ...
>          postgres: debian regression [local]
> authentication(postmaster_child_launch+0x132) [0x555565c7787c]
>          postgres: debian regression [local] authentication(+0x3dcd52)
> [0x555565c7bd52]
>          postgres: debian regression [local]
> authentication(InitProcessGlobals+0) [0x555565c79c96]
>          postgres: debian regression [local] authentication(+0x3194b2)
> [0x555565bb84b2]
>          /lib/riscv64-linux-gnu/libc.so.6(+0x2791c) [0x7fffa690891c]
>          /lib/riscv64-linux-gnu/libc.so.6(__libc_start_main+0x74)
> [0x7fffa69089c4]
>          postgres: debian regression [local]
> authentication(_start+0x20) [0x5555659842c8]
>
>> The previous failure on greenfly was a TIMEOUT in the same test, as if
>> a query was hanging.
>
> Yeah, I'll try to reproduce it too...
>
>> On Sun, Oct 12, 2025 at 2:00 AM Alexander Lakhin
>> <exclusion@gmail.com> wrote:
>>> I've managed to reproduce it using qemu-system-riscv64 with Debian trixie
>> Huh, that's interesting.  What is the host architecture?  When I saw
>> that error myself and wondered about memory order, I dismissed the
>> idea of trying with qemu, figuring that my x86 host's TSO would affect
>> the coherency, but thinking again about that... I guess the compiler
>> might still reorder during riscv codegen if there is something wrong
>> with the barrier support, and even if it doesn't, the binary
>> translation to x86 might also feel free to reorder stuff if there are
>> no barrier instructions to prevent it?  Or maybe that doesn't happen
>> but your host is ARM?
>
> I use an AMD Ryzen 9 7900X, Ubuntu 24.04, and run the RISC-V machine with:
> qemu-system-riscv64 -machine virt -m 1G -smp 8 -cpu rv64 -device
> virtio-blk-device,drive=hd \
> -drive file=.../dqib_riscv64-virt/image.qcow2,if=none,id=hd -device
> virtio-net-device,netdev=net \
> -netdev user,id=net,hostfwd=tcp::22222-:22 -bios
> /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
> -kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf -object
> rng-random,filename=/dev/urandom,id=rng \
> -device virtio-rng-device,rng=rng -nographic -append
> "root=LABEL=rootfs
> console=ttyS0"
>
> I don't know what hardware greenfly uses (CC'ing Greg in case he'd
> like to share some info on this),

Hello Alexander, thanks for digging into this issue.

The animal "greenfly" is an OrangePi RV2 [1] with 8GB of RAM.  It has
the optional 256GB eMMC module and uses that for the root filesystem.
It's running Ubuntu 24.04.3 and I'm compiling with clang 18.1.3.  I
think the rest of the details are in the animal registration or in the
logs, but let me know if you have questions or if there are things that
you'd like me to test.

> but timings are similar to what
> I'm seeing:
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=greenfly&dt=2025-10-09%2000%3A06%3A03&stg=check
> ok 160       - select_parallel                          5019 ms
>
> `make check` on mine, with select_parallel repeated:
> ok 160       - select_parallel                          6042 ms
> ok 161       - select_parallel                          6089 ms
> ok 162       - select_parallel                          6116 ms
>
> copperhead and boomslang, which run on real RISC-V hardware [1], show
> better timings:
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=copperhead&dt=2025-10-11%2019%3A36%3A33&stg=check
> ok 160       - select_parallel                          2677 ms
>
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=boomslang&dt=2025-10-12%2002%3A17%3A53&stg=check
> ok 160       - select_parallel                          3651 ms
>
> Thank you for looking into this!
>
> [1]
> https://www.postgresql.org/message-id/3db97903-884f-4b0c-b1cd-d7442e71ea75%40app.fastmail.com
>
> Best regards,
> Alexander
> Attachments:
> * pgaio_io_update_state.asm
> * pgaio_io_was_recycled.asm

best.

-greg

[1]
http://www.orangepi.org/html/hardWare/computerAndMicrocontrollers/details/Orange-Pi-RV2.html




Re: IO in wrong state on riscv64

From:
Thomas Munro
Date:
On Sun, Oct 12, 2025 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> Please find those attached (gdb "disass/m pgaio_io_update_state" misses
> the start of the function (but it's still disassembled below), so I
> decided to share the whole output).

Could you please also disassemble pgaio_io_reclaim()?



Re: IO in wrong state on riscv64

From:
Alexander Lakhin
Date:
13.10.2025 01:44, Thomas Munro wrote:
> On Sun, Oct 12, 2025 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>> Please find those attached (gdb "disass/m pgaio_io_update_state" misses
>> the start of the function (but it's still disassembled below), so I
>> decided to share the whole output).
> Could you please also disassemble pgaio_io_reclaim()?

Sure, the output of disass/m pgaio_io_reclaim is attached.

A side note: I could not reproduce the hang so far; 8 out of 30
iterations of 027_stream_regress failed, but all completed within
700-800 seconds.

Best regards,
Alexander
Attachments

Re: IO in wrong state on riscv64

From:
Thomas Munro
Date:
On Mon, Oct 13, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> 13.10.2025 01:44, Thomas Munro wrote:
> > On Sun, Oct 12, 2025 at 6:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> >> Please find those attached (gdb "disass/m pgaio_io_update_state" misses
> >> the start of the function (but it's still disassembled below), so I
> >> decided to share the whole output).
> > Could you please also disassemble pgaio_io_reclaim()?
>
> Sure, the output of disass/m pgaio_io_reclaim is attached.

Thanks.  All seems to have something plausible in the right places,
but I know nothing about RISC-V... hmm, what happens if you replace
pg_{read,write}_barrier() with pg_memory_barrier(), in those three
functions?  And if it happens to help, perhaps you could try to figure
out which one(s) help?  Not that it should be necessary but as a clue
or to rule out this line of enquiry... I guess that should generate
FENCE RW,RW, meaning wait for all preceding reads and writes to
complete and don't let any following reads or writes begin, but that's
just from googling...
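
To be concrete, the kind of substitution I mean would look roughly like this
(abbreviated from aio.c; the exact upstream bodies may differ, so treat it as
a sketch):

    /* pgaio_io_update_state(): fully fence the state store */
    -       pg_write_barrier();
    +       pg_memory_barrier();
            ioh->state = new_state;

    /* pgaio_io_was_recycled(): fully fence between reading state and generation */
            *state = ioh->state;
    -       pg_read_barrier();
    +       pg_memory_barrier();
            return ioh->generation != ref_generation;

If the generic gcc barriers are in use on riscv64, I'd expect the read/write
barriers to come out as FENCE R,RW / FENCE RW,W and the full barrier as
FENCE RW,RW, but please double-check that against your disassembly.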



Re: IO in wrong state on riscv64

From:
Alexander Lakhin
Date:
Hello Thomas,

13.10.2025 08:40, Thomas Munro wrote:
> Thanks.  All seems to have something plausible in the right places,
> but I know nothing about RISC-V... hmm, what happens if you replace
> pg_{read,write}_barrier() with pg_memory_barrier(), in those three
> functions?  And if it happens to help, perhaps you could try to figure
> out which one(s) help?  Not that it should be necessary but as a clue
> or to rule out this line of enquiry... I guess that should generate
> FENCE RW,RW, meaning wait for all preceding reads and writes to
> complete and don't let any following reads or writes begin, but that's
> just from googling...

The replacements don't work for me, unfortunately: 3 out of 30
027_stream_regress test runs failed:
2025-10-13 21:27:26.161 UTC [4181290:5] pg_regress/brin ERROR:  IO in wrong state: 0
2025-10-13 21:27:26.161 UTC [4181290:6] pg_regress/brin STATEMENT: CREATE TABLE brintest (byteacol bytea,

Disassembly of those three functions is attached.

I'll try to find some simple C programs to test memory barriers outside of
postgres...
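
For instance, something along these lines (a classic message-passing litmus
test; just a sketch of mine, with relaxed atomics plus explicit fences
standing in for pg_write_barrier()/pg_read_barrier(), and an arbitrary file
name and iteration count):

/* mp_litmus.c: build with "cc -O2 -pthread mp_litmus.c -o mp_litmus".
 * With the fences in place it should print nothing; removing them may
 * let the reordering show up on a weakly ordered machine. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int data;
static atomic_int flag;

static void *
writer(void *arg)
{
    (void) arg;
    atomic_store_explicit(&data, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* ~ pg_write_barrier() */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
    return NULL;
}

static void *
reader(void *arg)
{
    (void) arg;
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;                                       /* wait for the writer */
    atomic_thread_fence(memory_order_acquire);  /* ~ pg_read_barrier() */
    if (atomic_load_explicit(&data, memory_order_relaxed) != 1)
        fprintf(stderr, "reordering observed on the reader side\n");
    return NULL;
}

int
main(void)
{
    for (int i = 0; i < 100000; i++)
    {
        pthread_t   t1, t2;

        atomic_store(&data, 0);
        atomic_store(&flag, 0);
        pthread_create(&t1, NULL, writer, NULL);
        pthread_create(&t2, NULL, reader, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
    }
    return 0;
}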

Best regards,
Alexander
Attachments

Re: IO in wrong state on riscv64

From:
Thomas Munro
Date:
On Tue, Oct 14, 2025 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> The replacements don't work for me, unfortunately: 3 out of 30
> 027_stream_regress test runs failed:
> 2025-10-13 21:27:26.161 UTC [4181290:5] pg_regress/brin ERROR:  IO in wrong state: 0
> 2025-10-13 21:27:26.161 UTC [4181290:6] pg_regress/brin STATEMENT: CREATE TABLE brintest (byteacol bytea,

I wonder if the problem could be with buf->io_wref.  In WaitIO() we do:

        iow = buf->io_wref;
        UnlockBufHdr(buf, buf_state);
        ...
            pgaio_wref_wait(&iow);

... but UnlockBufHdr() is only concerned with write ordering:

static inline void
UnlockBufHdr(BufferDesc *desc, uint32 buf_state)
{
    pg_write_barrier();
    pg_atomic_write_u32(&desc->state, buf_state & (~BM_LOCKED));
}

What happens if you insert pg_memory_barrier() after the line that
reads buf->io_wref?
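
That is, roughly (a sketch only; the real WaitIO() has more code around it):

        iow = buf->io_wref;
        pg_memory_barrier();    /* keep the io_wref read from moving past the unlock */
        UnlockBufHdr(buf, buf_state);
        ...
            pgaio_wref_wait(&iow);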

Does this reproduce on REL_18_STABLE?