Re: Hash Indexes
От | Amit Kapila |
---|---|
Тема | Re: Hash Indexes |
Дата | |
Msg-id | CAA4eK1KMS3WP0EG13xhyrB_30qKj53vtA2VHty7qOyaAaV6bxQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Hash Indexes (Jeff Janes <jeff.janes@gmail.com>) |
Список | pgsql-hackers |
On Thu, Sep 15, 2016 at 4:44 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > On Tue, May 10, 2016 at 5:09 AM, Amit Kapila <amit.kapila16@gmail.com> > wrote: >> >> >> >> Although, I don't think it is a very good idea to take any performance >> data with WIP patch, still I couldn't resist myself from doing so and below >> are the performance numbers. To get the performance data, I have dropped >> the primary key constraint on pgbench_accounts and created a hash index on >> aid column as below. >> >> alter table pgbench_accounts drop constraint pgbench_accounts_pkey; >> create index pgbench_accounts_pkey on pgbench_accounts using hash(aid); > > > > To be rigorously fair, you should probably replace the btree primary key > with a non-unique btree index and use that in the btree comparison case. I > don't know how much difference that would make, probably none at all for a > read-only case. > >> >> >> >> Below data is for read-only pgbench test and is a median of 3 5-min runs. >> The performance tests are executed on a power-8 m/c. > > > With pgbench -S where everything fits in shared_buffers and the number of > cores I have at my disposal, I am mostly benchmarking interprocess > communication between pgbench and the backend. I am impressed that you can > detect any difference at all. > > For this type of thing, I like to create a server side function for use in > benchmarking: > > create or replace function pgbench_query(scale integer,size integer) > RETURNS integer AS $$ > DECLARE sum integer default 0; > amount integer; > account_id integer; > BEGIN FOR i IN 1..size LOOP > account_id=1+floor(random()*scale); > SELECT abalance into strict amount FROM pgbench_accounts > WHERE aid = account_id; > sum := sum + amount; > END LOOP; > return sum; > END $$ LANGUAGE plpgsql; > > And then run using a command like this: > > pgbench -f <(echo 'select pgbench_query(40,1000)') -c$j -j$j -T 300 > > Where the first argument ('40', here) must be manually set to the same value > as the scale-factor. > > With 8 cores and 8 clients, the values I get are, for btree, hash-head, > hash-concurrent, hash-concurrent-cache, respectively: > > 598.2 > 577.4 > 668.7 > 664.6 > > (each transaction involves 1000 select statements) > > So I do see that the concurrency patch is quite an improvement. The cache > patch does not produce a further improvement, which was somewhat surprising > to me (I thought that that patch would really shine in a read-write > workload, but I expected at least improvement in read only) > To see the benefit from cache meta page patch, you might want to test with clients more than the number of cores, atleast that is what data by Mithun [1] indicates or probably in somewhat larger m/c. [1] - https://www.postgresql.org/message-id/CAD__OugX0aOa7qopz3d-nbBAoVmvSmdFJOX4mv5tFRpijqH47A%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: