Re: CSN snapshots in hot standby
От | Heikki Linnakangas |
---|---|
Тема | Re: CSN snapshots in hot standby |
Дата | |
Msg-id | b439edfc-c5e5-43a9-802d-4cb51ec20646@iki.fi обсуждение исходный текст |
Ответ на | Re: CSN snapshots in hot standby ("Andrey M. Borodin" <x4mmm@yandex-team.ru>) |
Ответы |
Re: CSN snapshots in hot standby
Re: CSN snapshots in hot standby |
Список | pgsql-hackers |
On 05/04/2024 13:49, Andrey M. Borodin wrote: >> On 5 Apr 2024, at 02:08, Kirill Reshke <reshkekirill@gmail.com> wrote: Thanks for taking a look, Kirill! >> maybe we need some hooks here? Or maybe, we can take CSN here from extension somehow. > > I really like the idea of CSN-provider-as-extension. > But it's very important to move on with CSN, at least on standby, to make CSN actually happen some day. > So, from my perspective, having LSN-as-CSN is already huge step forward. Yeah, I really don't want to expand the scope of this. Here's a new version. Rebased, and lots of comments updated. I added a tiny cache of the CSN lookups into SnapshotData, which can hold the values of 4 XIDs that are known to be visible to the snapshot, and 4 invisible XIDs. This is pretty arbitrary, but the idea is to have something very small to speed up the common cases that 1-2 XIDs are repeatedly looked up, without adding too much overhead. I did some performance testing of the visibility checks using these CSN snapshots. The tests run SELECTs with a SeqScan in a standby, over a table where all the rows have xmin/xmax values that are still in-progress in the primary. Three test scenarios: 1. large-xact: one large transaction inserted all the rows. All rows have the same XMIN, which is still in progress 2. many-subxacts: one large transaction inserted each row in a separate subtransaction. All rows have a different XMIN, but they're all subtransactions of the same top-level transaction. (This causes the subxids cache in the proc array to overflow) 3. few-subxacts: All rows are inserted, committed, and vacuum frozen. Then, using 10 in separate subtransactions, DELETE the rows, in an interleaved fashion. The XMAX values cycle like this "1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, ...". The point of this is that these sub-XIDs fit in the subxids cache in the procarray, but the pattern defeats the simple 4-element cache that I added. The test script I used is attached. I repeated it a few times with master and the patches here, and picked the fastest runs for each. Just eyeballing the results, there's about ~10% variance in these numbers. Smaller is better. Master: large-xact: 4.57732510566711 many-subxacts: 18.6958119869232 few-subxacts: 16.467698097229 Patched: large-xact: 10.2999930381775 many-subxacts: 11.6501438617706 few-subxacts: 19.8457028865814 With cache: large-xact: 3.68792295455933 many-subxacts: 13.3662350177765 few-subxacts: 21.4426419734955 The 'large-xacts' results show that the CSN lookups are slower than the binary search on the 'xids' array. Not a surprise. The 4-element cache fixes the regression, which is also not a surprise. The 'many-subxacts' results show that the CSN lookups are faster than the current method in master, when the subxids cache has overflowed. That makes sense: on master, we always perform a lookup in pg_subtrans, if the suxids cache has overflowed, which is more or less the same overhead as the CSN lookup. But we avoid the binary search on the xids array after that. The 'few-subxacts' shows a regression, when the 4-element cache is not effective. I think that's acceptable, the CSN approach has many benefits, and I don't think this is a very common scenario. But if necessary, it could perhaps be alleviated with more caching, or by trying to compensate by optimizing elsewhere. -- Heikki Linnakangas Neon (https://neon.tech)
Вложения
- v2-0001-Update-outdated-comment-on-WAL-logged-locks-with-.patch
- v2-0002-XXX-add-perf-test.patch
- v2-0003-Use-CSN-snapshots-during-Hot-Standby.patch
- v2-0004-Make-SnapBuildWaitSnapshot-work-without-xl_runnin.patch
- v2-0005-Remove-the-now-unused-xids-array-from-xl_running_.patch
- v2-0006-Add-a-small-cache-to-Snapshot-to-avoid-CSN-lookup.patch
В списке pgsql-hackers по дате отправления: