Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
От | Marco Boeringa |
---|---|
Тема | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Дата | |
Msg-id | 1c76a5f4-1b9f-46b2-b49b-2ff2b285fb99@boeringa.demon.nl обсуждение исходный текст |
Ответ на | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? (Marco Boeringa <marco@boeringa.demon.nl>) |
Ответы |
Re: Potential "AIO / io workers" inter-worker locking issue in PG18?
|
Список | pgsql-bugs |
Hi Andres, I have now been able to successfully run 'perf' after a VM restart and booting with the linux 6.8 kernel. I have copied in two results for 'perf top' and a recording of one of the about 10 stuck processes by supplying a specific pid of the postgres backend. Let me know if this is of any use. Of course, the recordings themselves are much bigger, I just copied the top part as displayed by the 'perf' report option. If you need the actual underlying files, let me know. Also, if you have specific suggestions for 'perf' command options you would like to see run, please post them in your response and I will try if I can run them. Andres: I posted separately to you I had a suspicion a recent linux kernel update (6.8 --> 6.14) might be involved that coincided with the PG18/PostGIS 3.6.0 upgrade when I initially had multiple successful runs of my tool without the temporary stalling / delay in returning of postgres after booting to the older 6.8 kernel. This has now proven wrong: these results are of the 6.8 kernel runs, so the kernel is not involved, and I also see the issue with the older kernel. *** sudo perf top: *** Samples: 1M of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 346785156250 lost: 0/0 drop: 0/0 Overhead Shared Object Symbol 17,33% postgres [.] 0x0000000000584216 12,21% postgres [.] 0x000000000058737a 11,80% [kernel] [k] pv_native_safe_halt 10,24% postgres [.] ReleaseAndReadBuffer 4,92% postgres [.] heap_hot_search_buffer 3,63% postgres [.] hash_search_with_hash_value 2,36% postgres [.] 0x00000000005841f3 1,85% postgres [.] 0x0000000000554edd 1,61% postgres [.] 0x0000000000554e1f 1,40% postgres [.] 0x00000000005567f9 1,32% postgres [.] StartReadBuffer 1,24% postgres [.] heap_page_prune_opt 1,23% postgres [.] LWLockAcquire 1,14% postgres [.] HeapTupleSatisfiesVisibility 1,10% postgres [.] 0x00000000003977dd 0,92% postgres [.] index_getnext_tid 0,87% postgres [.] index_fetch_heap 0,83% postgres [.] ExecScan 0,82% postgres [.] ReleaseBuffer 0,81% postgres [.] LWLockRelease 0,71% postgres [.] XidInMVCCSnapshot 0,66% postgres [.] _bt_checkkeys 0,66% postgres [.] _bt_next 0,55% libxorgxrdp.so [.] crc_process_data 0,51% postgres [.] btgettuple 0,41% postgres [.] 0x00000000005567c6 0,40% postgres [.] 0x00000000003975d0 0,35% postgres [.] ExecStoreBufferHeapTuple 0,34% postgres [.] hash_bytes 0,33% postgres [.] MemoryContextReset 0,33% postgres [.] 0x00000000003975d3 0,32% postgres [.] ExecEvalSysVar 0,28% postgres [.] 0x00000000003975e3 0,27% postgres [.] ResourceOwnerForget 0,27% postgres [.] GlobalVisTestFor 0,26% postgres [.] 0x00000000003975b6 0,26% postgres [.] HeapTupleIsSurelyDead 0,23% postgres [.] 0x00000000003975c3 0,23% postgres [.] 0x00000000001ff80a 0,22% postgres [.] PredicateLockTID 0,21% postgres [.] slot_getsomeattrs_int 0,20% postgres [.] 0x000000000058421d 0,19% postgres [.] ReadBufferExtended 0,16% libc.so.6 [.] __memcmp_sse2 0,15% postgres [.] 0x0000000000382e9e *** sudo perf -p <PID of one stuck postgres backend> -g -d 10 *** *** sudo perf report -g *** Samples: 40K of event 'task-clock:ppp', Event count (approx.): 10018000000 Children Self Command Shared Object Symbol + 100,00% 0,00% postgres postgres [.] _start + 100,00% 0,00% postgres libc.so.6 [.] __libc_start_main@@GLIBC_2.34 + 100,00% 0,00% postgres libc.so.6 [.] __libc_start_call_main + 100,00% 0,00% postgres postgres [.] main + 100,00% 0,00% postgres postgres [.] PostmasterMain + 100,00% 0,00% postgres postgres [.] 0x00005f35711b913e + 100,00% 0,00% postgres postgres [.] postmaster_child_launch + 100,00% 0,00% postgres postgres [.] 0x00005f3570fb9dbf + 100,00% 0,00% postgres postgres [.] PostgresMain + 100,00% 0,00% postgres postgres [.] 0x00005f3570fb6cb9 + 100,00% 0,00% postgres postgres [.] PortalRun + 100,00% 0,00% postgres postgres [.] 0x00005f3570fbeca3 + 100,00% 0,00% postgres postgres [.] 0x00005f3570fbcf85 + 100,00% 0,00% postgres postgres [.] standard_ExecutorRun + 100,00% 0,00% postgres postgres [.] 0x00005f3570de682b + 100,00% 0,00% postgres postgres [.] 0x00005f3570de966d + 97,51% 0,00% postgres postgres [.] 0x00005f3570de959d + 97,17% 0,95% postgres postgres [.] ExecScan + 82,75% 0,00% postgres postgres [.] 0x00005f3570dda135 + 82,35% 1,04% postgres postgres [.] index_fetch_heap + 28,80% 0,00% postgres postgres [.] 0x00005f3570bf3f87 + 23,67% 0,00% postgres postgres [.] 0x00005f3570bf3ee8 + 22,19% 0,00% postgres postgres [.] 0x00005f3570fa1216 + 22,18% 22,18% postgres postgres [.] 0x0000000000584216 + 17,06% 0,89% postgres postgres [.] LWLockRelease + 16,18% 0,00% postgres postgres [.] 0x00005f3570bf3f4f + 15,59% 0,00% postgres postgres [.] 0x00005f3570fa437a + 15,57% 15,57% postgres postgres [.] 0x000000000058737a + 14,90% 0,22% postgres postgres [.] ReadBufferExtended + 14,18% 13,05% postgres postgres [.] ReleaseAndReadBuffer + 14,00% 1,82% postgres postgres [.] StartReadBuffer + 6,59% 0,00% postgres postgres [.] 0x00005f3570bf3f18 + 5,97% 3,78% postgres postgres [.] heap_hot_search_buffer + 5,23% 0,00% postgres postgres [.] 0x00005f3570da0613 + 5,20% 0,00% postgres postgres [.] 0x00005f3570dda121 + 5,19% 0,22% postgres postgres [.] slot_getsomeattrs_int + 5,18% 0,68% postgres postgres [.] index_getnext_tid + 4,52% 0,65% postgres postgres [.] btgettuple + 3,92% 0,99% postgres postgres [.] _bt_next + 3,28% 3,28% postgres postgres [.] hash_search_with_hash_value + 2,96% 0,38% postgres postgres [.] ExecStoreBufferHeapTuple + 2,93% 0,00% postgres postgres [.] 0x00005f3570bf3fe0
В списке pgsql-bugs по дате отправления: