RE: row filtering for logical replication

From houzj.fnst@fujitsu.com
Subject RE: row filtering for logical replication
Date
Msg-id OS3PR01MB57184FB9EFA06AF190F2870F94639@OS3PR01MB5718.jpnprd01.prod.outlook.com
In response to Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: row filtering for logical replication  (Greg Nancarrow <gregn4422@gmail.com>)
Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Wednesday, November 24, 2021 1:46 PM Amit Kapila <amit.kapila16@gmail.com>
> On Wed, Nov 24, 2021 at 6:51 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > > On Thu, Nov 18, 2021 at 7:04 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > > > > >
> > > > > > PSA new set of v40* patches.
> > > > >
> > > > > Few comments:
> > > > > 1) When a table is added to the publication, replica identity is
> > > > > checked. But while modifying the publish action to include
> > > > > delete/update, replica identity is not checked for the existing
> > > > > tables. I felt it should be checked for the existing tables too.
> > > >
> > > > In addition to this, I think we might also need some check to
> > > > prevent user from changing the REPLICA IDENTITY index which is used in
> > > > the filter expression.
> > > >
> > > > I was thinking is it possible do the check related to REPLICA
> > > > IDENTITY in function CheckCmdReplicaIdentity() or In
> > > > GetRelationPublicationActions(). If we move the REPLICA IDENTITY
> > > > check to this function, it would be consistent with the existing
> > > > behavior about the check related to REPLICA IDENTITY(see the
> > > > comments in CheckCmdReplicaIdentity) and seems can cover all the
> > > > cases mentioned above.
> > > >
> > >
> > > Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
> > > would cover all the above cases but I think that would put a premium
> > > on each update/delete operation. I think traversing the expression
> > > tree (it could be multiple traversals if the relation is part of
> > > multiple publications) during each update/delete would be costly.
> > > Don't you think so?
> >
> > Yes, I agreed that traversing the expression every time would be costly.
> >
> > I thought maybe we can cache the columns used in the row filter, or cache
> > only a flag (can_update|delete), in the relcache. I think every operation
> > that affects the row filter or replica identity will invalidate the
> > relcache, and the cost of the check seems acceptable with the cache.
> >
> 
> I think if we can cache this information especially as a bool flag then that should
> probably be better.

Following this direction, I wrote a top-up POC patch (0005), which I'd like to share.

The top-up patch mainly does the following:

* Move the row filter column validation to CheckCmdReplicaIdentity(), so that
the validation runs only when an actual UPDATE or DELETE is executed on the
published relation. This is consistent with the existing replica identity
check.

* Cache the result of the row filter column validation in the relcache to
reduce its cost. This is safe because every operation that changes the row
filter or the replica identity invalidates the relcache.
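
To illustrate the caching idea in the two points above, here is a minimal standalone sketch. It is not code from the patch and does not use the real relcache. RelEntry, RowFilterState, validate_row_filter_cols(), invalidate_rel() and check_cmd_row_filter() are all hypothetical names, and a plain column-number comparison stands in for walking the row filter expression tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_COLS 8

/* Hypothetical tri-state cache value; not an actual relcache field. */
typedef enum { RF_UNKNOWN, RF_VALID, RF_INVALID } RowFilterState;

/* Hypothetical stand-in for a relcache entry of a published table. */
typedef struct
{
    int            filter_cols[MAX_COLS]; /* columns used by the row filter */
    size_t         n_filter_cols;
    int            ident_cols[MAX_COLS];  /* columns in the replica identity */
    size_t         n_ident_cols;
    RowFilterState rf_state;              /* cached validation result */
    int            n_validations;         /* counts expensive checks (demo only) */
} RelEntry;

/* Expensive step: stands in for traversing the filter expression tree. */
static bool
validate_row_filter_cols(RelEntry *rel)
{
    rel->n_validations++;
    for (size_t i = 0; i < rel->n_filter_cols; i++)
    {
        bool found = false;

        for (size_t j = 0; j < rel->n_ident_cols; j++)
        {
            if (rel->filter_cols[i] == rel->ident_cols[j])
            {
                found = true;
                break;
            }
        }
        if (!found)
            return false;   /* filter uses a column outside the identity */
    }
    return true;
}

/* Stands in for a relcache invalidation after the row filter or the
 * replica identity has been changed. */
static void
invalidate_rel(RelEntry *rel)
{
    assert(rel != NULL);
    rel->rf_state = RF_UNKNOWN;
}

/* Called for each UPDATE/DELETE: validates only if nothing is cached. */
static bool
check_cmd_row_filter(RelEntry *rel)
{
    assert(rel != NULL);
    if (rel->rf_state == RF_UNKNOWN)
        rel->rf_state = validate_row_filter_cols(rel) ? RF_VALID : RF_INVALID;
    return rel->rf_state == RF_VALID;
}
```

In this sketch, repeated UPDATE/DELETE checks on the same entry pay for the validation once. The expensive step runs again only after invalidate_rel() resets the cached state.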

I have also attached the v42 patch set to keep cfbot happy.

Best regards,
Hou zj

