Re: Removing unneeded self joins

Поиск
Список
Период
Сортировка
От Andrey Lepikhov
Тема Re: Removing unneeded self joins
Дата
Msg-id a1d6290c-44e0-0dfc-3fca-66a68b3109ef@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Removing unneeded self joins  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Ответы Re: Removing unneeded self joins  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Список pgsql-hackers
On 5/17/22 19:14, Ronan Dunklau wrote:
> Le vendredi 13 mai 2022, 07:07:47 CEST Andrey Lepikhov a écrit :
>> New version of the feature.
>> Here a minor bug with RowMarks is fixed. A degenerated case is fixed,
>> when uniqueness of an inner deduced not from join quals, but from a
>> baserestrictinfo clauses 'x=const', where x - unique field.
>> Code, dedicated to solve second issue is controversial, so i attached
>> delta.txt for quick observation.
>> Maybe we should return to previous version of code, when we didn't split
>> restriction list into join quals and base quals?
> 
> Hello,
> 
> I tried to find problematic cases, which would make the planning time grow
> unacceptably, and couldn't devise it.
> 
> The worst case scenario I could think of was as follows:
> 
>   - a query with many different self joins
>   - an abundance of unique indexes on combinations of this table columns to
> consider
>   - additional predicates on the where clause on columns.
Looking into the patch I can imagine, that the most difficult case is 
when a set of relations with the same OID is huge, but only small part 
of them (or nothing) can be removed.
Also, removing a clause from restrictinfo list or from equivalence class 
adds non-linear complexity. So, you can dig this way ).
> 
> The base table I used for this was a table with 40 integers. 39 unique indexes
> were defined on every combination of (c1, cX) with cX being columns c2 to c40.
> 
> I turned geqo off, set from_collapse_limit and join_collapse_limit to
> unreasonably high values (30), and tried to run queries of the form:
> 
> SELECT * FROM test_table t1
> JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX
> ...
> JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX.
> 
> So no self join can be eliminated in that case.
I think, you should compare t1.cX with tX.cX to eliminate self-join. 
Cross-unique-index proof isn't supported now.
> The performance was very similar with or without the GUC enabled. I tested the
> same thing without the patch, since the test for uniqueness has been slightly
> altered and incurs a new allocation, but it doesn't seem to change.
> 
> One interesting side effect of this patch, is that removing any unneeded self
> join cuts down the planification time very significantly, as we lower the number
> of combinations to consider.
Even more - removing a join we improve cardinality estimation.
> 
> As for the code:
> 
>   - Comments on relation_has_unique_index_ext and relation_has_unique_index_for
> should be rewritten, as relation_has_unique_index_for is now just a special
> case of relation_has_unique_index_ext. By the way, the comment would probably
> be better read as: "but if extra_clauses isn't NULL".
>   - The whole thing about "extra_clauses", ie, baserestrictinfos which were
> used to determine uniqueness, is not very clear. Most functions where the new
> argument has been added have not seen an update in their comments, and the
> name itself doesn't really convey the intented meaning: perhaps
> required_non_join_clauses ?
> 
> The way this works should be explained a bit more thoroughly, for example in
> remove_self_joins_one_group the purpose of uclauses should be explained. The
> fact that degenerate_case returns true when we don't have any additional base
> restrict info is also confusing, as well as the degenerate_case name.
Agree,
but after this case thoughts wander in my head: should we make one step 
back to pre-[1] approach? It looks like we have quite similar changes, 
but without special function for a 'degenerate case' detection and 
restrictlist splitting.
> 
> I'll update if I think of more interesting things to add.
Thank you for your efforts!

See in attachment next version which fixes mistakes detected by 
zyu@yugabyte.com.

[1] 
https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com

-- 
Regards
Andrey Lepikhov
Postgres Professional
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Zstandard support for toast compression
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Build-farm - intermittent error in 031_column_list.pl