Re: Make NUM_XLOGINSERT_LOCKS configurable
From | Jakub Wartak |
---|---|
Subject | Re: Make NUM_XLOGINSERT_LOCKS configurable |
Date | |
Msg-id | CAKZiRmzVi9Z+bSpqqj44ySd1U1RXKQvM2N5kJ6_n0+09CdaviQ@mail.gmail.com |
In response to | Re: Make NUM_XLOGINSERT_LOCKS configurable (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
List | pgsql-hackers |
On Fri, Jan 12, 2024 at 7:33 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Jan 10, 2024 at 11:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> > > On Wed, Jan 10, 2024 at 10:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> Maybe. I bet just bumping up the constant by 2X or 4X or so would get
> > >> most of the win for far less work; it's not like adding a few more
> > >> LWLocks is expensive. But we need some evidence about what to set it to.
> >
> > > I previously made an attempt to improve WAL insertion performance with
> > > varying NUM_XLOGINSERT_LOCKS. IIRC, we will lose what we get by
> > > increasing insertion locks (reduction in WAL insertion lock
> > > acquisition time) to the CPU overhead of flushing the WAL in
> > > WaitXLogInsertionsToFinish as referred to by the following comment.
> >
> > Very interesting --- this is at variance with what the OP said, so
> > we definitely need details about the test conditions in both cases.
> >
> > > Unfortunately, I've lost the test results, I'll run them up again and
> > > come back.
> >
> > Please.
>
> Okay, I'm back with some testing
[..]
> Results with varying NUM_XLOGINSERT_LOCKS (note that we can't allow it
> be more than MAX_SIMUL_LWLOCKS):
>
> Locks  TPS    WAL Insert Lock Acquire Time (ms)  WAL Wait for In-progress Inserts to Finish Time (ms)
> 8      18669  12532                              8775
> 16     18076  10641                              13491
> 32     18034  6635                               13997
> 64     17582  3937                               14718
> 128    17782  4563                               20145
>
> Also, check the attached graph. Clearly there's an increase in the
> time spent in waiting for in-progress insertions to finish in
> WaitXLogInsertionsToFinish from 8.7 seconds to 20 seconds. Whereas,
> the time spent to acquire WAL insertion locks decreased from 12.5
> seconds to 4.5 seconds. Overall, this hasn't resulted any improvement
> in TPS, in fact observed slight reduction.

Hi,

I've hastily tested using Bharath's patches too, as I was thinking it would be a fast win due to contention. However, it seems that (at least on fast NVMes?) increasing NUM_XLOGINSERT_LOCKS doesn't help.

With pgbench -P 5 -c 32 -j 32 -T 30 and:
- 64vCPU Lsv2 (AMD EPYC), on a single NVMe device (with ext4) that can do 100k RW IOPS@8kB (with fio/libaio, 4jobs)
- shared_buffers = '8GB', max_wal_size = '32GB', track_wal_io_timing = on
- maxed out wal_buffers = '256MB'

tpcb-like with synchronous_commit=off
locks  TPS    wal_insert_lock_acquire_time  wal_wait_for_insert_to_finish_time
8      30393  24087                         128
32     31205  968                           93

tpcb-like with synchronous_commit=on
locks  TPS    wal_insert_lock_acquire_time  wal_wait_for_insert_to_finish_time
8      12031  8472                          10722
32     11957  1188                          12563

tpcb-like with synchronous_commit=on and pgbench -c 64 -j 64
locks  TPS    wal_insert_lock_acquire_time  wal_wait_for_insert_to_finish_time
8      25010  90620                         68318
32     25976  18569                         85319
// same as Bharath said: it shifted from insert_lock to waiting_for_insert to finish

insertonly (largeinserts) with synchronous_commit=off (still -c 32 -j 32)
locks  TPS    wal_insert_lock_acquire_time  wal_wait_for_insert_to_finish_time
8      367    19142                         83
32     393    875                           68

insertonly (largeinserts) with synchronous_commit=on (still -c 32 -j 32)
locks  TPS    wal_insert_lock_acquire_time  wal_wait_for_insert_to_finish_time
8      329    15950                         125
32     310    2177                          316

insertonly was := {
    create sequence s1;
    create table t (id bigint, t text) partition by hash (id);
    create table t_h0 partition of t FOR VALUES WITH (modulus 8, remainder 0);
    create table t_h1 partition of t FOR VALUES WITH (modulus 8, remainder 1);
    create table t_h2 partition of t FOR VALUES WITH (modulus 8, remainder 2);
    create table t_h3 partition of t FOR VALUES WITH (modulus 8, remainder 3);
    create table t_h4 partition of t FOR VALUES WITH (modulus 8, remainder 4);
    create table t_h5 partition of t FOR VALUES WITH (modulus 8, remainder 5);
    create table t_h6 partition of t FOR VALUES WITH (modulus 8, remainder 6);
    create table t_h7 partition of t FOR VALUES WITH (modulus 8, remainder 7);

    and runtime pgb:
    insert into t select nextval('s1'), repeat('A', 1000) from generate_series(1, 1000);
}

The table was truncated every time, the DB was checkpointed, and of course all of this was on master.

Without more details from Qingsong it is going to be hard to explain the boost he witnessed.

-J.
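
For anyone who wants to eyeball the 8-vs-32 comparison as relative changes, here is a minimal Python sketch. The numbers are copied from the results in this message; the workload labels and the helper names (`pct_change`, `summarize`) are made up for illustration. The units of the two timing columns are not stated in this message, so only ratios are computed.

```python
# Rows copied from the results in this message:
# (workload label, locks, TPS, wal_insert_lock_acquire_time,
#  wal_wait_for_insert_to_finish_time).
# The workload labels are illustrative shorthand, not pgbench options.
ROWS = [
    ("tpcb sync_commit=off c32", 8, 30393, 24087, 128),
    ("tpcb sync_commit=off c32", 32, 31205, 968, 93),
    ("tpcb sync_commit=on c32", 8, 12031, 8472, 10722),
    ("tpcb sync_commit=on c32", 32, 11957, 1188, 12563),
    ("tpcb sync_commit=on c64", 8, 25010, 90620, 68318),
    ("tpcb sync_commit=on c64", 32, 25976, 18569, 85319),
    ("insertonly sync_commit=off c32", 8, 367, 19142, 83),
    ("insertonly sync_commit=off c32", 32, 393, 875, 68),
    ("insertonly sync_commit=on c32", 8, 329, 15950, 125),
    ("insertonly sync_commit=on c32", 32, 310, 2177, 316),
]


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def summarize(rows):
    """Compare the 32-lock run against the 8-lock run for each workload."""
    by_workload = {}
    for name, locks, tps, acquire, wait in rows:
        by_workload.setdefault(name, {})[locks] = (tps, acquire, wait)

    summary = {}
    for name, runs in by_workload.items():
        tps8, acq8, wait8 = runs[8]
        tps32, acq32, wait32 = runs[32]
        summary[name] = {
            "tps_pct": pct_change(tps8, tps32),
            "acquire_pct": pct_change(acq8, acq32),
            "wait_pct": pct_change(wait8, wait32),
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(ROWS).items():
        print(
            f"{name:32s} TPS {s['tps_pct']:+7.1f}%  "
            f"lock acquire {s['acquire_pct']:+7.1f}%  "
            f"wait for inserts {s['wait_pct']:+7.1f}%"
        )
```

In every workload the lock acquire time drops sharply with 32 locks while TPS moves by only a few percent in either direction, and with synchronous_commit=on the wait-for-inserts time goes up.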