Re: PATCH: use foreign keys to improve join estimates v1

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: PATCH: use foreign keys to improve join estimates v1
Дата	27 сентября 2015 г. 12:00:15
Msg-id	CAKJS1f-Ad5Fs9Fsay_=LZT_Ep81r+qU3pVLcecz+a1+BJMVp0g@mail.gmail.com обсуждение исходный текст
Ответ на	Re: PATCH: use foreign keys to improve join estimates v1 (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы	Re: PATCH: use foreign keys to improve join estimates v1
Список	pgsql-hackers

Дерево обсуждения

On 26 September 2015 at 01:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

On 09/25/2015 03:39 AM, David Rowley wrote:
I looked at this again, and I'm not all that sure it's possible to
assume that 1.0 / <tuples> is valid when there's more than one
relation at either side of the join.
>
My reasoning for this is that the whole basis for the patch is that a
if we find a foreign key match, then we can be sure enough, as far as
row estimations go, that exactly 1 tuple will exist matching that
condition. This assumption of the 1 tuple no longer holds when 2
relations have already been joined, as this can either cause tuple
duplication, or elimination.

I don't see why that would be the case. Of course, you need to take the right <tuples>, i.e. the "target" of the foreign key (the table with UNIQUE constraint) so that the selectivity matches the fact that exactly 1 tuple (on the PK side) matches.

hmm, ok. You're right. It appears I was a bit confused, but thanks for explaining it again. I get it now.

I've been working on this again. I've put back the code that you wrote for the looping over each combination of relations from either side of the join.

I've also added some code to get around the problem with eclass joins and the RestrictInfo having some alternative Vars that don't belong to the foreign key. Basically I'm just checking if the RestrictInfo has a parent_ec, and if it does just loop over the members to try and find the Vars that belong to the foreign key. I've tested it with the following, and it seems to work:

create table a as select i as a_id1, i as a_id2, i as dummy1 from generate_series(0,999) s(i);

alter table a add unique (a_id1, a_id2);

create table b as select i as b_id1, i as b_id2 from generate_series(0,332) s(i);

analyze a;

analyze b;

alter table b add foreign key (b_id1, b_id2) references a (a_id1, a_id2);

explain analyze select * from a inner join b on a.dummy1 = b.b_id1 and a.a_id2 = b.b_id2 where a.a_id1 = a.dummy1;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------

Hash Join (cost=18.57..26.41 rows=2 width=20) (actual time=0.775..1.046 rows=333 loops=1)

Hash Cond: ((b.b_id1 = a.dummy1) AND (b.b_id2 = a.a_id2))

-> Seq Scan on b (cost=0.00..5.33 rows=333 width=8) (actual time=0.013..0.046 rows=333 loops=1)

-> Hash (cost=18.50..18.50 rows=5 width=12) (actual time=0.737..0.737 rows=1000 loops=1)

Buckets: 1024 Batches: 1 Memory Usage: 51kB

-> Seq Scan on a (cost=0.00..18.50 rows=5 width=12) (actual time=0.014..0.389 rows=1000 loops=1)

Filter: (dummy1 = a_id1)

The non-patched version estimates 1 row. The patched estimates 2 rows, but that's due to the bad estimate on dummy1 = a_id1.

The 2 comes from ceil(5 * 0.333).

Perhaps you have a better test case to for this?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

Вложения

estimation-with-fkeys-v2_davidv3.patch

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: PATCH: use foreign keys to improve join estimates v1

Вложения