Re: Performance improvement for joins where outer side is unique

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Performance improvement for joins where outer side is unique
Дата	30 января 2015 г. 14:37:33
Msg-id	CAApHDvod_uCMoUPovdpXbNkw50O14x3wwKoJmZLxkbBn71zdEg@mail.gmail.com обсуждение исходный текст
Ответ на	Performance improvement for joins where outer side is unique (David Rowley <dgrowleyml@gmail.com>)
Ответы	Re: Performance improvement for joins where outer side is unique (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) Re: Performance improvement for joins where outer side is unique (Tomas Vondra <tomas.vondra@2ndquadrant.com>) Re: Performance improvement for joins where outer side is unique (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список	pgsql-hackers

Дерево обсуждения

On 1 January 2015 at 02:47, David Rowley <dgrowleyml@gmail.com> wrote:

Hi,

I've been hacking a bit at the join code again... This time I've been putting some effort into optimising the case where the inner side of the join is known to be unique.
For example, given the tables:

create table t1 (id int primary key);
create table t2 (id int primary key);

And query such as:

select * from t1 left outer join t2 on t1.id=t2.id;

It is possible to deduce at planning time that "for each row in the outer relation, only 0 or 1 rows can exist in the inner relation", (inner being t2)

I've been hacking at this unique join idea again and I've now got it working for all join types -- Patch attached.

Here's how the performance is looking:

postgres=# create table t1 (id int primary key);

CREATE TABLE

postgres=# create table t2 (id int primary key);

CREATE TABLE

postgres=# insert into t1 select x.x from generate_series(1,1000000) x(x);

INSERT 0 1000000

postgres=# insert into t2 select x.x from generate_series(1,1000000) x(x);

INSERT 0 1000000

postgres=# vacuum analyze;

VACUUM

postgres=# \q

Query: select count(t1.id) from t1 inner join t2 on t1.id=t2.id;

With Patch on master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres

transaction type: Custom query

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

duration: 60 s

number of transactions actually processed: 78

latency average: 769.231 ms

tps = 1.288260 (including connections establishing)

tps = 1.288635 (excluding connections establishing)

Master as of 32bf6ee

D:\Postgres\install\bin>pgbench -f d:\unijoin3.sql -T 60 -n postgres

transaction type: Custom query

scaling factor: 1

query mode: simple

number of clients: 1

number of threads: 1

duration: 60 s

number of transactions actually processed: 70

latency average: 857.143 ms

tps = 1.158905 (including connections establishing)

tps = 1.159264 (excluding connections establishing)

That's a 10% performance increase.

I still need to perform more thorough benchmarking with different data types.

One weird thing that I noticed before is that in an earlier revision of the patch in the executor's join Initialise node code, I had set the unique_inner to true for semi joins and replaced the SEMI_JOIN check for a unique_join check in the execute node for each join method. With this the performance results barely changed from standard... I've yet to find out why.

The patch also has added a property to the EXPLAIN (VERBOSE) output which states if the join was found to be unique or not.

The patch also still requires a final pass of comment fix-ups. I've just plain run out of time for now.

I'll pick this up in 2 weeks time.

Regards

David Rowley

Вложения

unijoin_2015-01-31_9af9cfa.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Etsuro Fujita
Дата: 30 января 2015 г., 13:21:37
Сообщение: Odd behavior of updatable security barrier views on foreign tables

Следующее

От: Michael Paquier
Дата: 30 января 2015 г., 15:06:56
Сообщение: NULL checks of deferenced pointers in picksplit method of intarray

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Performance improvement for joins where outer side is unique

Вложения

Предыдущее

Следующее