Re: Removing unneeded self joins

Поиск
Список
Период
Сортировка
От a.rybakina
Тема Re: Removing unneeded self joins
Дата
Msg-id 29ddec15-a83f-4f24-b069-6e93c189b169@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Removing unneeded self joins  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: Removing unneeded self joins
Список pgsql-hackers

Also I've incorporated improvements from Alena Rybakina except one for
skipping SJ removal when no SJ quals is found.  It's not yet clear for
me if this check fix some cases. But at least optimization got skipped
in some useful cases (as you can see in regression tests).

Agree. I wouldn't say I like it too. But also, I suggest skipping some unnecessary assertions proposed in that patch:
Assert(toKeep->relid != -1); - quite strange. Why -1? Why not all the negative numbers, at least?
Assert(is_opclause(orinfo->clause)); - above we skip clauses with rinfo->mergeopfamilies == NIL. Each mergejoinable clause is already checked as is_opclause.
All these changes (see in the attachment) are optional.

I don't mind about asserts, maybe I misunderstood something in the patch.

About skipping SJ removal when no SJ quals is found, I assume it is about it:split_selfjoin_quals(root, restrictlist, &selfjoinquals,
                                  &otherjoinquals, inner->relid, outer->relid);
 
+            if (list_length(selfjoinquals) == 0)
+             {
+                 /*
+                  * XXX:
+                  * we would detect self-join without quals like 'x==x' if we had
+                  * an foreign key constraint on some of other quals and this join
+                  * haven't any columns from the outer in the target list.
+                  * But it is still complex task.
+                  */
+                 continue;
+             }

as far as I remember, this is the place where it is checked that the SJ list is empty and it is logical, in my opinion, that no transformations should be performed if no elements are found for them.

As for the cases where SJ did not work, maybe this is just right if there are no elements for processing these cases. I'll try to check or come up with tests for them. If I'm wrong, write.

On 11.10.2023 06:51, Andrei Lepikhov wrote:
On 11/10/2023 02:29, Alena Rybakina wrote:
I have reviewed your patch and I noticed a few things.

Thanks for your efforts,

I have looked at the latest version of the code, I assume that the error lies in the replace_varno_walker function, especially in the place where we check the node by type Var, and does not form any NullTest node.

It's not a bug, it's an optimization we discussed with Alexander above.

Secondly, I added some code in some places to catch erroneous cases and added a condition when we should not try to apply the self-join-removal transformation due to the absence of an empty self-join list after searching for it and in general if there are no joins in the query. Besides, I added a query for testing and wrote about it above. I have attached my diff file.
Ok, I will look at this
In addition, I found a comment for myself that was not clear to me. I would be glad if you could explain it to me.

You mentioned superior outer join in the comment, unfortunately, I didn't find anything about it in the PostgreSQL code, and this explanation remained unclear to me. Could you explain in more detail what you meant?
I meant here that only clauses pushed by reconsider_outer_join_clauses() can be found in the joininfo list, and they are not relevant, as you can understand.
Having written that, I realized that it was a false statement. ;) - joininfo can also contain results of previous SJE iterations, look:

CREATE TABLE test (oid int PRIMARY KEY);
CREATE UNIQUE INDEX ON test((oid*oid));
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c2.oid AND c1.oid*c2.oid=c3.oid*c3.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c3.oid AND c1.oid*c3.oid=c2.oid*c2.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c3.oid=c2.oid AND c3.oid*c2.oid=c1.oid*c1.oid;

Ok, I understood. Thank you for explanation.


-- 
Regards,
Alena Rybakina

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alena Rybakina
Дата:
Сообщение: Re: A new strategy for pull-up correlated ANY_SUBLINK
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Improve the log message output of basic_archive when basic_archive.archive_directory parameter is not set