Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Msg-id: CAH2-WzkXHhjhmUYfVvu6afbojU97MST8RUT1U=hLd2W-GC5FNA@mail.gmail.com
In reply to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Tue, Dec 3, 2019 at 12:13 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The new criteria/heuristic for unique indexes is very simple: If a
> unique index has an existing item that is a duplicate of the incoming
> item at the point that we might have to split the page, then apply
> deduplication. Otherwise (when the incoming item has no duplicates),
> don't apply deduplication at all -- just accept that we'll have to
> split the page.
>
> The working/draft version of the patch will often avoid a huge amount of
> bloat in a pgbench-style workload that has an extra index on the
> pgbench_accounts table, to prevent HOT updates. The accounts primary
> key (pgbench_accounts_pkey) hardly grows at all with the patch, but
> grows 2x on master.

I have numbers from my benchmark against my working copy of the patch, with this enhanced design for unique index deduplication. The benchmark is a variant of the standard pgbench TPC-B-like workload, with an extra index on pgbench_accounts's abalance column (configured not to use deduplication for the test), and with the aid variable (i.e. UPDATEs on pgbench_accounts) configured to use skew.
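The quoted decision rule can be sketched as follows. This is hypothetical Python pseudocode written for illustration, not the actual nbtree C code; all names here are invented:

```python
# Hypothetical sketch of the unique-index deduplication heuristic
# quoted above -- NOT the actual PostgreSQL nbtree implementation.

def should_dedup_unique_index(page_keys, incoming_key):
    """At the point a leaf page of a unique index would have to split,
    attempt deduplication only if the incoming key already has a
    duplicate on the page (version churn); otherwise just accept the
    page split."""
    return incoming_key in page_keys

# A duplicate key (old tuple versions of the same logical row) triggers
# deduplication; a genuinely new key does not.
print(should_dedup_unique_index({1, 2, 3}, 2))  # True  -> try dedup
print(should_dedup_unique_index({1, 2, 3}, 4))  # False -> just split
```

The point of the heuristic is that in a unique index any duplicate must be a dead or soon-dead version, so deduplication only fires where it can buy time for cleanup.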
The pgbench script I used was as follows:

\set r random_gaussian(1, 100000 * :scale, 4.0)
\set aid abs(hash(:r)) % (100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;

Results from interlaced 2 hour runs at pgbench scale 5,000 are as follows (shown in reverse chronological order):

master_2_run_16.out: "tps = 7263.948703 (including connections establishing)"
patch_2_run_16.out: "tps = 7505.358148 (including connections establishing)"
master_1_run_32.out: "tps = 9998.868764 (including connections establishing)"
patch_1_run_32.out: "tps = 9781.798606 (including connections establishing)"
master_1_run_16.out: "tps = 8812.269270 (including connections establishing)"
patch_1_run_16.out: "tps = 9455.476883 (including connections establishing)"

The patch comes out ahead in the first 2 hour run, with later runs looking like a more even match. I think that each run didn't last long enough to even out the effects of autovacuum, but this is really about index size rather than overall throughput, so it's not that important. (I need to get a large server to do further performance validation work, rather than just running overnight benchmarks on my main work machine like this.)

The primary key index (pgbench_accounts_pkey) starts out at 10.45 GiB in size, and ends at 12.695 GiB in size with the patch. Whereas with master, it also starts out at 10.45 GiB, but finishes off at 19.392 GiB. Clearly this is a significant difference -- the index is only ~65% of its master-branch size with the patch.
See the attached tar archive with logs, and pg_buffercache output after each run. (The extra index on pgbench_accounts.abalance is pretty much the same size for patch/master, since deduplication was disabled for the patch runs.)

And, as I said, I believe that we can make this unique index deduplication stuff an internal thing that isn't even documented (maybe a passing reference is appropriate when talking about general deduplication).

--
Peter Geoghegan
Attachments