Re: a misbehavior of partition row movement (?)

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: a misbehavior of partition row movement (?)
Дата
Msg-id CA+HiwqFpeYBXT=Ss8Ya-FMxeHPYEz0p-+BdBa5neq7pD3hB3ow@mail.gmail.com
обсуждение исходный текст
Ответ на Re: a misbehavior of partition row movement (?)  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: a misbehavior of partition row movement (?)  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
On Mon, Sep 20, 2021 at 3:32 PM Amit Langote <amitlangote09@gmail.com> wrote:
> The problem was that the tuplestore
> (afterTriggers.query_stack[query_level].tuplestore) that I decided to
> use to store the AFTER trigger tuples of a partitioned table that is
> the target of an cross-partition update lives only for the duration of
> a given query.  So that approach wouldn't work if the foreign key
> pointing into that partitioned table is marked INITIALLY DEFERRED.  To
> fix, I added a List field to AfterTriggersData that stores the
> tuplestores to store the tuples of partitioned tables that undergo
> cross-partition updates in a transaction and are pointed to by
> INITIALLY DEFERRED foreign key constraints.  I couldn't understand one
> comment about why using a tuplestore for such cases *might not* work,
> which as follows:
>
>  * Note that we need triggers on foreign tables to be fired in exactly the
>  * order they were queued, so that the tuples come out of the tuplestore in
>  * the right order.  To ensure that, we forbid deferrable (constraint)
>  * triggers on foreign tables.  This also ensures that such triggers do not
>  * get deferred into outer trigger query levels, meaning that it's okay to
>  * destroy the tuplestore at the end of the query level.
>
> I tried to break the approach using various test cases (some can be
> seen added by the patch to foreign_key.sql), but couldn't see the
> issue alluded to in the above comment.  So I've marked the comment
> with an XXX note as follows:
>
> - * Note that we need triggers on foreign tables to be fired in exactly the
> - * order they were queued, so that the tuples come out of the tuplestore in
> - * the right order.  To ensure that, we forbid deferrable (constraint)
> - * triggers on foreign tables.  This also ensures that such triggers do not
> - * get deferred into outer trigger query levels, meaning that it's okay to
> - * destroy the tuplestore at the end of the query level.
> + * Note that we need triggers on foreign and partitioned tables to be fired in
> + * exactly the order they were queued, so that the tuples come out of the
> + * tuplestore in the right order.  To ensure that, we forbid deferrable
> + * (constraint) triggers on foreign tables.  This also ensures that such
> + * triggers do not get deferred into outer trigger query levels, meaning that
> + * it's okay to destroy the tuplestore at the end of the query level.
> + * XXX - update this paragraph if the new approach, whereby tuplestores in
> + * afterTriggers.deferred_tuplestores outlive any given query, can be proven
> + * to not really break any assumptions mentioned here.
>
> If anyone reading this can think of the issue the original comment
> seems to be talking about, please let me know.

I brought this up in an off-list discussion with Robert and he asked
why I hadn't considered not using tuplestores to remember the tuples
in the first place, specifically pointing out that it may lead to
unnecessarily consuming a lot of memory when such updates move a bunch
of rows around.  Like, we could simply remember the tuples to be
passed to the trigger function using their CTIDs as is done for normal
(single-heap-relation) updates, though in this case also remember the
OIDs of the source and the destination partitions of a particular
cross-partition update.

I had considered that idea before but I think I had overestimated the
complexity of that approach so didn't go with it.  I tried and the
resulting patch doesn't look all that complicated, with the added
bonus that the bulk update case shouldn't consume so much memory with
this approach like the previous tuplestore version would have.

Updated patches attached.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Re: Skipping logical replication transactions on subscriber side
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: Reset snapshot export state on the transaction abort