Re: [PATCH] Allow multiple recursive self-references

Поиск
Список
Период
Сортировка
От Denis Hirn
Тема Re: [PATCH] Allow multiple recursive self-references
Дата
Msg-id AC56483A-D07D-4724-975B-FC590CBF3F3B@uni-tuebingen.de
обсуждение исходный текст
Ответ на Re: [PATCH] Allow multiple recursive self-references  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Ответы Re: [PATCH] Allow multiple recursive self-references  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Re: [PATCH] Allow multiple recursive self-references  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Список pgsql-hackers
> I studied up on the theory and terminology being discussed here.  I conclude that what the latest version of this
patchis doing (allowing multiple recursive references, but only in a linear-recursion way) is sound and useful. 

That's great to hear!

> I haven't looked closely at the implementation changes yet.  I'm not very familiar with that part of the code, so it
willtake a bit longer. Perhaps you could explain what you are doing there, either in email or (even better) in the
commitmessage. 

I have extended the commit message. Some more discussion (regarding tree rotation etc.) can be found
in this mail, but also in the commit message.

> What I think this patch needs is a lot more test cases.  I would like to see more variants of different nestings of
theUNION branches, some mixtures of UNION ALL and UNION DISTINCT, joins of the recursive CTE with regular tables (like
inthe flights-and-trains examples) 

You are right, the testing is a bit sparse at the moment. I've added some more
test cases with the new version of this patch. Also I've improved the comments.

> as well as various cases of what is not allowed, namely showing that it can carefully prohibit non-linear recursion.

The regression tests already feature tests against non-linear recursion.
Therefore I did not add more myself.

> Also, in one instance you modify an existing test case.  I'm not sure why.  Perhaps it would be better to leave the
existingtest case alone and write a new one. 

I agree that it would be better not to modify the existing test case, but the
modification is unavoidable. Here are the original queries:

> -- non-linear recursion is not allowed
> WITH RECURSIVE foo(i) AS
>     (values (1)
>     UNION ALL
>        (SELECT i+1 FROM foo WHERE i < 10
>           UNION ALL
>        SELECT i+1 FROM foo WHERE i < 5)
> ) SELECT * FROM foo;

> WITH RECURSIVE foo(i) AS
>     (values (1)
>     UNION ALL
>      SELECT * FROM
>        (SELECT i+1 FROM foo WHERE i < 10
>           UNION ALL
>        SELECT i+1 FROM foo WHERE i < 5) AS t
> ) SELECT * FROM foo;

These two test cases are supposed to trigger the non-linear recursion check,
and they do without this patch. However, with the modifications this patch
introduces, both queries are now valid input. That's because each recursive
reference is placed inside a separate recursive UNION branch. This means that
both queries are linear recursive, and not non-linear recursive as they should be.

To make these tests check for non-linear recursion again, at least one
UNION branch must contain more than one recursive reference. That's the
modification I did.



> You also introduce this concept of reshuffling the UNION branches to collect all the recursive terms under one
subtree.

Yes, but the reshuffling is only applied in a very specific situation. Example:

>      UNION   --->   UNION
>     /     \        /     \
>   UNION    C      A    UNION
>  /     \              /     \
> A       B            B       C

A, B, and C are arbitrary SelectStmt nodes and can themselfes be deeper nested
UNION nodes.

A is a non-recursive term in the WITH RECURSIVE query. B, and C both contain a
recursive self-reference. The planner expects the top UNION node to contain
the non-recursive term in the larg, and the recursive term in the rarg.
Therefore, the tree shape on the left is invalid and would result in an error
message at the parsing stage. However, by rotating the tree to the right, this
problem can be solved so that the valid tree shape on the right side is created.

All this does, really, is to re-parenthesize the expression:
(A UNION B) UNION C ---> A UNION (B UNION C)



> Also, currently a query like this works [...] but this doesn't:
>
> WITH RECURSIVE t(n) AS (
>    SELECT n+1 FROM t WHERE n < 100
> UNION ALL
>    VALUES (1)
> )
> SELECT sum(n) FROM t;
>
> With your patch, the second should also work, so let's show some tests for that as well.

With just the tree rotation, the second query can not be fixed. The order of two
nodes is never changed. And I think that this is a good thing. Consider the
following query:

> WITH RECURSIVE t(n) AS (
>     VALUES (1)
>       UNION ALL
>     SELECT n+1 FROM t WHERE n < 100
>       UNION ALL
>     VALUES (2)
> ) SELECT * FROM t LIMIT 100;

If we'd just collect all non-recursive UNION branches, the semantics of the
second query would change. But changing the semantics of a query (or preventing
certain queries to be formulated at all) is not something I think this patch
should do. Therfore – I think – it's appropriate that the second query fails.

Best,
  -- Denis


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: mem context is not reset between extended stats
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: row filtering for logical replication