Обсуждение: Decouple last important WAL record LSN from WAL insert locks
Hi,
While working on something else, I noticed that each WAL insert lock
tracks its own last important WAL record's LSN (lastImportantAt) and
both the bgwriter and checkpointer later computes the max
value/server-wide last important WAL record's LSN via
GetLastImportantRecPtr(). While doing so, each WAL insertion lock is
acquired in exclusive mode in a for loop. This seems like too much
overhead to me. I quickly coded a patch (attached herewith) that
tracks the server-wide last important WAL record's LSN in
XLogCtlInsert (lastImportantPos) protected with a spinlock and gets
rid of lastImportantAt from each WAL insert lock. I ran pgbench with a
simple insert [1] and the results are below. While the test was run,
the GetLastImportantRecPtr() was called 4-5 times.
# of clients HEAD PATCHED
1 83 82
2 159 157
4 303 302
8 576 570
16 1104 1095
32 2055 2041
64 2286 2295
128 2270 2285
256 2302 2253
512 2205 2290
768 2224 2180
1024 2109 2150
2048 1941 1936
4096 1856 1848
It doesn't seem to hurt (for this use-case) anyone, however there
might be some benefit if bgwriter and checkpointer come in the way of
WAL inserters. With the patch, the extra exclusive lock burden on WAL
insert locks is gone. Since the amount of work the WAL inserters do
under the new spinlock is very minimal (updating
XLogCtlInsert->lastImportantPos), it may not be an issue. Also, it's
worthwhile to look at the existing comment [2], which doesn't talk
about the performance impact of having a lock.
Thoughts?
[1]
./configure --prefix=$PWD/inst/ CFLAGS="-O3" > install.log && make -j
8 install > install.log 2>&1 &
cd inst/bin
./pg_ctl -D data -l logfile stop
rm -rf data logfile insert.sql
free -m
sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
free -m
./initdb -D data
./pg_ctl -D data -l logfile start
./psql -d postgres -c 'ALTER SYSTEM SET shared_buffers = "8GB";'
./psql -d postgres -c 'ALTER SYSTEM SET max_wal_size = "32GB";'
./psql -d postgres -c 'ALTER SYSTEM SET max_connections = "4096";'
./psql -d postgres -c 'ALTER SYSTEM SET bgwriter_delay = "10ms";'
./pg_ctl -D data -l logfile restart
./pgbench -i -s 1 -d postgres
./psql -d postgres -c "ALTER TABLE pgbench_accounts DROP CONSTRAINT
pgbench_accounts_pkey;"
cat << EOF >> insert.sql
\set aid random(1, 10 * :scale)
\set delta random(1, 100000 * :scale)
INSERT INTO pgbench_accounts (aid, bid, abalance) VALUES (:aid, :aid, :delta);
EOF
for c in 1 2 4 8 16 32 64 128 256 512 768 1024 2048 4096; do echo -n
"$c ";./pgbench -n -M prepared -U ubuntu postgres -b simple-update
-c$c -j$c -T5 2>&1|grep '^tps'|awk '{print $3}';done
[2]
* records. Tracking the WAL activity directly in WALInsertLock has the
* advantage of not needing any additional locks to update the value.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения
Hi, On 2022-11-23 19:12:07 +0530, Bharath Rupireddy wrote: > While working on something else, I noticed that each WAL insert lock > tracks its own last important WAL record's LSN (lastImportantAt) and > both the bgwriter and checkpointer later computes the max > value/server-wide last important WAL record's LSN via > GetLastImportantRecPtr(). While doing so, each WAL insertion lock is > acquired in exclusive mode in a for loop. This seems like too much > overhead to me. GetLastImportantRecPtr() should be a very rare operation, so it's fine for it to be expensive. The important thing is for the maintenance of the underlying data to be very cheap. > I quickly coded a patch (attached herewith) that > tracks the server-wide last important WAL record's LSN in > XLogCtlInsert (lastImportantPos) protected with a spinlock and gets > rid of lastImportantAt from each WAL insert lock. That strikes me as a very bad idea. It adds another point of contention to a very very hot code path, to make a very rare code path cheaper. Greetings, Andres Freund
On Sun, Nov 27, 2022 at 2:43 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2022-11-23 19:12:07 +0530, Bharath Rupireddy wrote: > > While working on something else, I noticed that each WAL insert lock > > tracks its own last important WAL record's LSN (lastImportantAt) and > > both the bgwriter and checkpointer later computes the max > > value/server-wide last important WAL record's LSN via > > GetLastImportantRecPtr(). While doing so, each WAL insertion lock is > > acquired in exclusive mode in a for loop. This seems like too much > > overhead to me. > > GetLastImportantRecPtr() should be a very rare operation, so it's fine for it > to be expensive. The important thing is for the maintenance of the underlying > data to be very cheap. > > > I quickly coded a patch (attached herewith) that > > tracks the server-wide last important WAL record's LSN in > > XLogCtlInsert (lastImportantPos) protected with a spinlock and gets > > rid of lastImportantAt from each WAL insert lock. > > That strikes me as a very bad idea. It adds another point of contention to a > very very hot code path, to make a very rare code path cheaper. Thanks for the response. I agree that GetLastImportantRecPtr() gets called rarely, however, what concerns me is that it's taking all the WAL insertion locks when it gets called. Is tracking lastImportantPos as pg_atomic_uint64 in XLogCtlInsert any better than an explicit spinlock? I think it's better on platforms where atomics are supported, however, it boils down to using a spin lock on the platforms where atomics aren't supported. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On 2022-11-28 11:42:19 +0530, Bharath Rupireddy wrote: > On Sun, Nov 27, 2022 at 2:43 AM Andres Freund <andres@anarazel.de> wrote: > > On 2022-11-23 19:12:07 +0530, Bharath Rupireddy wrote: > > > While working on something else, I noticed that each WAL insert lock > > > tracks its own last important WAL record's LSN (lastImportantAt) and > > > both the bgwriter and checkpointer later computes the max > > > value/server-wide last important WAL record's LSN via > > > GetLastImportantRecPtr(). While doing so, each WAL insertion lock is > > > acquired in exclusive mode in a for loop. This seems like too much > > > overhead to me. > > > > GetLastImportantRecPtr() should be a very rare operation, so it's fine for it > > to be expensive. The important thing is for the maintenance of the underlying > > data to be very cheap. > > > > > I quickly coded a patch (attached herewith) that > > > tracks the server-wide last important WAL record's LSN in > > > XLogCtlInsert (lastImportantPos) protected with a spinlock and gets > > > rid of lastImportantAt from each WAL insert lock. > > > > That strikes me as a very bad idea. It adds another point of contention to a > > very very hot code path, to make a very rare code path cheaper. > > Thanks for the response. I agree that GetLastImportantRecPtr() gets > called rarely, however, what concerns me is that it's taking all the > WAL insertion locks when it gets called. So what? It's far from the only operation doing so. And in contrast to most of the other places (c.f. WALInsertLockAcquireExclusive()) it only takes one of them at a time. > Is tracking lastImportantPos as pg_atomic_uint64 in XLogCtlInsert any > better than an explicit spinlock? I think it's better on platforms > where atomics are supported, however, it boils down to using a spin > lock on the platforms where atomics aren't supported. A central atomic in XLogCtlInsert would be better than a spinlock protected variable, but still bad. We do *not* want to have more central state that needs to be manipulated, we want *less*. If we wanted to optimize this - and I haven't seen any evidence it's worth doing so - we should just optimize the lock acquisitions in GetLastImportantRecPtr() away. *Without* centralizing the state. Greetings, Andres Freund
On Mon, Nov 28, 2022 at 11:55 PM Andres Freund <andres@anarazel.de> wrote: > > > Is tracking lastImportantPos as pg_atomic_uint64 in XLogCtlInsert any > > better than an explicit spinlock? I think it's better on platforms > > where atomics are supported, however, it boils down to using a spin > > lock on the platforms where atomics aren't supported. > > A central atomic in XLogCtlInsert would be better than a spinlock protected > variable, but still bad. We do *not* want to have more central state that > needs to be manipulated, we want *less*. Agreed. > If we wanted to optimize this - and I haven't seen any evidence it's worth > doing so - we should just optimize the lock acquisitions in > GetLastImportantRecPtr() away. *Without* centralizing the state. Hm. I can think of converting lastImportantAt from XLogRecPtr to pg_atomic_uint64 and letting it stay within the WALInsertLock structure. This prevents torn-reads and also avoids WAL insertion lock acquire-release cycles in GetLastImportantRecPtr(). Please see the attached patch herewith. If this idea is worth it, I would like to bring this and the other thread [1] that converts insertingAt to atomic and modifies other WAL insert locks related code under one roof and start a new thread. BTW, the patch at [1] seems to be showing a good benefit for high-concurrent inserts with small records. Thoughts? [1] https://www.postgresql.org/message-id/CALj2ACWkWbheFhkPwMw83CUpzHFGXSV_HXTBxG9%2B-PZ3ufHE%3DQ%40mail.gmail.com -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com