Re: Removing unneeded self joins
От | Andrey Lepikhov |
---|---|
Тема | Re: Removing unneeded self joins |
Дата | |
Msg-id | a1d6290c-44e0-0dfc-3fca-66a68b3109ef@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: Removing unneeded self joins (Ronan Dunklau <ronan.dunklau@aiven.io>) |
Ответы |
Re: Removing unneeded self joins
(Ronan Dunklau <ronan.dunklau@aiven.io>)
|
Список | pgsql-hackers |
On 5/17/22 19:14, Ronan Dunklau wrote: > Le vendredi 13 mai 2022, 07:07:47 CEST Andrey Lepikhov a écrit : >> New version of the feature. >> Here a minor bug with RowMarks is fixed. A degenerated case is fixed, >> when uniqueness of an inner deduced not from join quals, but from a >> baserestrictinfo clauses 'x=const', where x - unique field. >> Code, dedicated to solve second issue is controversial, so i attached >> delta.txt for quick observation. >> Maybe we should return to previous version of code, when we didn't split >> restriction list into join quals and base quals? > > Hello, > > I tried to find problematic cases, which would make the planning time grow > unacceptably, and couldn't devise it. > > The worst case scenario I could think of was as follows: > > - a query with many different self joins > - an abundance of unique indexes on combinations of this table columns to > consider > - additional predicates on the where clause on columns. Looking into the patch I can imagine, that the most difficult case is when a set of relations with the same OID is huge, but only small part of them (or nothing) can be removed. Also, removing a clause from restrictinfo list or from equivalence class adds non-linear complexity. So, you can dig this way ). > > The base table I used for this was a table with 40 integers. 39 unique indexes > were defined on every combination of (c1, cX) with cX being columns c2 to c40. > > I turned geqo off, set from_collapse_limit and join_collapse_limit to > unreasonably high values (30), and tried to run queries of the form: > > SELECT * FROM test_table t1 > JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX > ... > JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX. > > So no self join can be eliminated in that case. I think, you should compare t1.cX with tX.cX to eliminate self-join. Cross-unique-index proof isn't supported now. > The performance was very similar with or without the GUC enabled. I tested the > same thing without the patch, since the test for uniqueness has been slightly > altered and incurs a new allocation, but it doesn't seem to change. > > One interesting side effect of this patch, is that removing any unneeded self > join cuts down the planification time very significantly, as we lower the number > of combinations to consider. Even more - removing a join we improve cardinality estimation. > > As for the code: > > - Comments on relation_has_unique_index_ext and relation_has_unique_index_for > should be rewritten, as relation_has_unique_index_for is now just a special > case of relation_has_unique_index_ext. By the way, the comment would probably > be better read as: "but if extra_clauses isn't NULL". > - The whole thing about "extra_clauses", ie, baserestrictinfos which were > used to determine uniqueness, is not very clear. Most functions where the new > argument has been added have not seen an update in their comments, and the > name itself doesn't really convey the intented meaning: perhaps > required_non_join_clauses ? > > The way this works should be explained a bit more thoroughly, for example in > remove_self_joins_one_group the purpose of uclauses should be explained. The > fact that degenerate_case returns true when we don't have any additional base > restrict info is also confusing, as well as the degenerate_case name. Agree, but after this case thoughts wander in my head: should we make one step back to pre-[1] approach? It looks like we have quite similar changes, but without special function for a 'degenerate case' detection and restrictlist splitting. > > I'll update if I think of more interesting things to add. Thank you for your efforts! See in attachment next version which fixes mistakes detected by zyu@yugabyte.com. [1] https://www.postgresql.org/message-id/raw/CAApHDvpggnFMC4yP-jUO7PKN%3DfXeErW5bOxisvJ0HvkHQEY%3DWw%40mail.gmail.com -- Regards Andrey Lepikhov Postgres Professional
Вложения
В списке pgsql-hackers по дате отправления:
Следующее
От: Amit KapilaДата:
Сообщение: Re: Build-farm - intermittent error in 031_column_list.pl