Re: Hash Indexes

Поиск

Список

Период

Сортировка

От	Jeff Janes
Тема	Re: Hash Indexes
Дата	15 сентября 2016 г. 02:14:14
Msg-id	CAMkU=1zd1KXNHQxRD6fBx4+Vu5Hnt2H7vzH1KfGAU9VoQcQy8A@mail.gmail.com обсуждение исходный текст
Ответ на	Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>)
Ответы	Re: Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, May 10, 2016 at 5:09 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Although, I don't think it is a very good idea to take any performance data with WIP patch, still I couldn't resist myself from doing so and below are the performance numbers. To get the performance data, I have dropped the primary key constraint on pgbench_accounts and created a hash index on aid column as below.

alter table pgbench_accounts drop constraint pgbench_accounts_pkey;
create index pgbench_accounts_pkey on pgbench_accounts using hash(aid);

To be rigorously fair, you should probably replace the btree primary key with a non-unique btree index and use that in the btree comparison case. I don't know how much difference that would make, probably none at all for a read-only case.

Below data is for read-only pgbench test and is a median of 3 5-min runs. The performance tests are executed on a power-8 m/c.

With pgbench -S where everything fits in shared_buffers and the number of cores I have at my disposal, I am mostly benchmarking interprocess communication between pgbench and the backend. I am impressed that you can detect any difference at all.

For this type of thing, I like to create a server side function for use in benchmarking:

create or replace function pgbench_query(scale integer,size integer)

RETURNS integer AS $$

DECLARE sum integer default 0;

amount integer;

account_id integer;

BEGIN FOR i IN 1..size LOOP

account_id=1+floor(random()*scale);

SELECT abalance into strict amount FROM pgbench_accounts

WHERE aid = account_id;

sum := sum + amount;

END LOOP;

return sum;

END $$ LANGUAGE plpgsql;

And then run using a command like this:

pgbench -f <(echo 'select pgbench_query(40,1000)') -c$j -j$j -T 300

Where the first argument ('40', here) must be manually set to the same value as the scale-factor.

With 8 cores and 8 clients, the values I get are, for btree, hash-head, hash-concurrent, hash-concurrent-cache, respectively:

598.2

577.4

668.7

664.6

(each transaction involves 1000 select statements)

So I do see that the concurrency patch is quite an improvement. The cache patch does not produce a further improvement, which was somewhat surprising to me (I thought that that patch would really shine in a read-write workload, but I expected at least improvement in read only)

I've run this was 128MB shared_buffers and scale factor 40. Not everything fits in shared_buffers, but quite easily fits in RAM, and there is no meaningful IO caused by the benchmark.

Cheers,

Jeff

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Thomas Munro
Дата: 15 сентября 2016 г., 02:04:22
Сообщение: Re: kqueue

Следующее

От: Petr Jelinek
Дата: 15 сентября 2016 г., 02:15:36
Сообщение: Re: Logical Replication WIP

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Hash Indexes

Предыдущее

Следующее