RE: row filtering for logical replication
From | houzj.fnst@fujitsu.com |
---|---|
Subject | RE: row filtering for logical replication |
Date | |
Msg-id | OS0PR01MB57169DC3E8D7B7E588CD64DC94229@OS0PR01MB5716.jpnprd01.prod.outlook.com |
In reply to | Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: row filtering for logical replication (Greg Nancarrow <gregn4422@gmail.com>) |
 | Re: row filtering for logical replication (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
 | Re: row filtering for logical replication (Andres Freund <andres@anarazel.de>) |
 | Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Wednesday, January 26, 2022 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 8:37 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Monday, January 24, 2022 4:38 PM Peter Smith
> > <smithpb2250@gmail.com> wrote:
> > >
> > > 3. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> > >
> > > +RelationBuildPublicationDesc(Relation relation)
> > >  {
> > >   List *puboids;
> > >   ListCell *lc;
> > >   MemoryContext oldcxt;
> > >   Oid schemaid;
> > > - PublicationActions *pubactions = palloc0(sizeof(PublicationActions));
> > > + List *ancestors = NIL;
> > > + Oid relid = RelationGetRelid(relation);
> > > + AttrNumber invalid_rfcolnum = InvalidAttrNumber;
> > > + PublicationDesc *pubdesc = palloc0(sizeof(PublicationDesc));
> > > + PublicationActions *pubactions = &pubdesc->pubactions;
> > > +
> > > + pubdesc->rf_valid_for_update = true;
> > > + pubdesc->rf_valid_for_delete = true;
> > >
> > > IMO it would be better to change the "sense" of those variables.
> > > e.g.
> > >
> > > "rf_valid_for_update" --> "rf_invalid_for_update"
> > > "rf_valid_for_delete" --> "rf_invalid_for_delete"
> > >
> > > That way they have the same 'sense' as the AttrNumbers so it all reads
> > > better to me.
> > >
> > > Also, it means no special assignment is needed because the palloc0 will
> > > set them correctly
> >
> > Think again, I am not sure it's better to have an invalid_... flag.
> > It seems more natural to have a valid_... flag.
> >

Thanks for the comments!

> Can't we do without these valid_ flags? AFAICS, if we check for
> "invalid_" attributes, it should serve our purpose because those can
> have some attribute number only when the row filter contains some
> column that is not part of RI. A few possible optimizations in
> RelationBuildPublicationDesc:

I slightly refactored the logic here.

> a. It calls contain_invalid_rfcolumn with pubid and then does cache
> lookup to again find a publication which its only caller has access
> to, so can't we pass the same?

Adjusted the code here.

> b. In RelationBuildPublicationDesc(), we call
> GetRelationPublications() to get the list of publications and then
> process those publications. I think if none of the publications has
> row filter and the relation has replica identity then we don't need to
> build the descriptor at all. If we do this optimization inside
> RelationBuildPublicationDesc, we may want to rename function as
> CheckAndBuildRelationPublicationDesc or something like that?

After thinking more about this and considering Alvaro's comments, I made
some changes to the RelationBuildPublicationDesc function to try to make it
more natural.

- Make the function always collect the complete information instead of
  returning immediately when it finds an invalid row filter. The reason for
  this change is that some third-party extensions might only care about the
  cached publication actions; this approach makes sure they still get the
  complete publication actions as usual. Besides, this is also consistent
  with the other existing cache management functions (like
  RelationGetIndexAttrBitmap ...), which always build the complete
  information even if the caller only wants part of it.

- Only cache the rf_valid_for_[update|delete] flags in PublicationDesc
  instead of the invalid row filter column, because it seems a bit unnatural
  to me to store an invalid thing in the relcache.

Note that the patch no longer reports the column number in the error
message. If we later decide that the exact column number or publication is
useful, I think it would be better to add a separate simple function
(get_invalid_...) to report the exact column or publication instead of
reusing the cache management function.

I also addressed Peter's comments [1] and Greg's comments [2] [3].

[1] https://www.postgresql.org/message-id/CAHut%2BPsG1G80AoSYka7m1x05vHjKZAzKeVyK4b6CAm2-sTkadg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAJcOf-c7XrtsWSGppb96-eQxPbtg%2BAfssAtTXNYbT8QuhdyOYA%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAJcOf-f0kc%2B4xGEgkvqNLkbJxMf8Ff0E9gTO2biHDoSJnxyziA%40mail.gmail.com

Attached is the V72 patch set, which includes the above changes.

Best regards,
Hou zj
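
For readers following the thread, the two design points in this message (always collect the complete publication actions, and cache only validity flags rather than the offending column) can be sketched roughly as below. This is a standalone illustration and not code from the V72 patch: the Demo* types, the filter_uses_non_ri_col field, and demo_build_pubdesc are hypothetical stand-ins for the real relcache structures, and the rule for when a flag is cleared is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL structs. */
typedef struct DemoPubActions
{
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
} DemoPubActions;

typedef struct DemoPublication
{
    DemoPubActions actions;     /* operations this publication publishes */
    /* Assumed input: the row filter references a column outside the RI. */
    bool filter_uses_non_ri_col;
} DemoPublication;

typedef struct DemoPubDesc
{
    DemoPubActions pubactions;  /* union of actions over all publications */
    bool rf_valid_for_update;   /* only flags are cached, no column number */
    bool rf_valid_for_delete;
} DemoPubDesc;

/*
 * Build the descriptor from every publication. There is deliberately no
 * early return when an invalid row filter is found, so callers that only
 * care about pubactions still see the complete set.
 */
static void
demo_build_pubdesc(const DemoPublication *pubs, size_t npubs, DemoPubDesc *desc)
{
    assert(desc != NULL);

    desc->pubactions = (DemoPubActions) {false, false, false};
    desc->rf_valid_for_update = true;
    desc->rf_valid_for_delete = true;

    for (size_t i = 0; i < npubs; i++)
    {
        const DemoPublication *pub = &pubs[i];

        /* Always accumulate the actions, whatever the filter looks like. */
        desc->pubactions.pubinsert |= pub->actions.pubinsert;
        desc->pubactions.pubupdate |= pub->actions.pubupdate;
        desc->pubactions.pubdelete |= pub->actions.pubdelete;

        /* Record invalidity per operation and keep scanning. */
        if (pub->filter_uses_non_ri_col)
        {
            if (pub->actions.pubupdate)
                desc->rf_valid_for_update = false;
            if (pub->actions.pubdelete)
                desc->rf_valid_for_delete = false;
        }
    }
}
```

In this sketch a caller that wants a precise error message (which column, which publication) would have to obtain it elsewhere, which matches the suggestion above of a separate get_invalid_... style helper.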