Extra target list entries for child partitions in DELETE...USING...RETURNING

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Extra target list entries for child partitions in DELETE...USING...RETURNING
Дата
Msg-id CAAKRu_Y+7qX4JzvfovsBE9T9_2c4kK1Bdda139oQ6cA5B-LTZA@mail.gmail.com
обсуждение исходный текст
Список pgsql-hackers
A few months ago while writing the initial version of a patch to extract
the columns that need to be scanned at plan time for use by table AMs,
[1] Ashwin and I noticed some peculiar aspects of the reltarget exprs
for partition tables for DELETE FROM ... USING... WHERE... RETURNING ...
queries (mentioned in this [2] specific mail in the thread)

Recently, I was looking into this again and decided to ask in a separate
thread if the current implementation is correct.

Given a partitioned table t(a,b,c) (with partitions tp1(a,b,c) and
tp2(a,b,c) and a non-partitioned table foo(a,b)

and the following query

DELETE FROM t USING foo where foo.a = t.a RETURNING *;

The processed_tlist of the child partitions (which are the targets of
the DELETE) include *all* of the columns from foo and t.

The columns from foo are added to the child partition's querytree's
processed_tlist in preprocess_targetlist() from the child partitions's
querytree's returningList so that those vars are "available for the
RETURNING calculation", however, neither the qual evaluation nor the
DELETE execution require the other columns in t to be added to the
targetlist.

In fact, the root/parent partition, t , only adds those columns from foo
(a and b) from its returningList to its processed_targetlist. It does
not end up adding any other columns from t to its processed_tlist than
those that are already there.

This alone didn't bother me that much. I assume that this could be this
way for some valid reason. However, the part that seems odd to me is the
exact way in which these other columns are added to the child
partition's processed_tlist.

In preprocess_targetlist(), this code determines if the var pulled out
of the querytree's returningList is added to its processed_tlist

if (IsA(var, Var) &&
var->varno == result_relation)
continue;

For the root partition, result_relation will be the index into the range
table of the relation that is the target of the DELETE -- so, in this
case, the index into the range table for the leaf partitions.
For the child partitions, however, result_relation is 0 because in
inheritance_planner(), we copy the parent querytree (including the
returningList) and then set the result_relation to 0. The comment
reads:

  /*
    * Make a deep copy of the parsetree for this planning cycle to mess
    * around with, and change it to look like a SELECT.
    ...

That means that in preprocess_targetlist() for child partitions,
result_relation in the querytree is 0 and all of the returningList will
always be added to the processed_tlist.

In practice, I don't actually know if this breaks anything. I poked
around a bit, but I don't see how having too many target list entries
for a leaf partition could ever, for example, produce wrong results.

Anyway, I think it makes the code harder to understand.

A simple fix would be to also guard that if statement with a check that
result_relation is not 0.

-       if (parse->returningList && list_length(parse->rtable) > 1)
+       if (result_relation && parse->returningList && list_length(parse->rtable) > 1)

(this does pass regress)
That doesn't make that much sense, though, I think, since SELECT
statements shouldn't have a returningList, so the existing check should
be sufficient.

Maybe a more complete solution would do something to make sure that
child partitions are treated the same as root partitions for DELETE
queries, similar to what seems to be described to be happening for
UPDATE queries in the same comment about making the deep copy of the
querytree for the child partition:

  /*
    * Make a deep copy of the parsetree for this planning cycle to mess
    * around with, and change it to look like a SELECT.  (Hack alert: the
    * target RTE still has updatedCols set if this is an UPDATE, so that
    * expand_partitioned_rtentry will correctly update
    * subroot->partColsUpdated.)
    */

[1] https://www.postgresql.org/message-id/flat/CAAKRu_ZQ0Jy7LfZDCY0JdxChdpja9rf-S8Y5%2BU4vX7cYJd62dA%40mail.gmail.com#f16fb3bdf33519c0d547a4b7ae2fc3c3
[2] https://www.postgresql.org/message-id/CAAKRu_ZQ0Jy7LfZDCY0JdxChdpja9rf-S8Y5%2BU4vX7cYJd62dA%40mail.gmail.com

--
Melanie Plageman

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical ()at walsender.c:2762
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: min_safe_lsn column in pg_replication_slots view