Re: Column Filtering in Logical Replication
From | Tomas Vondra
---|---
Subject | Re: Column Filtering in Logical Replication
Date |
Msg-id | 05353cce-2da8-74cc-5419-d0a398212f12@enterprisedb.com
In response to | Re: Column Filtering in Logical Replication (Amit Kapila <amit.kapila16@gmail.com>)
List | pgsql-hackers
On 3/21/22 15:12, Amit Kapila wrote:
> On Sat, Mar 19, 2022 at 11:11 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 3/19/22 18:11, Tomas Vondra wrote:
>>> Fix a compiler warning reported by cfbot.
>>
>> Apologies, I failed to actually commit the fix. So here we go again.
>>
>
> Few comments:
> ===============
> 1.
> +/*
> + * Gets a list of OIDs of all partial-column publications of the given
> + * relation, that is, those that specify a column list.
> + */
> +List *
> +GetRelationColumnPartialPublications(Oid relid)
> {
> ...
> }
>
> ...
> +/*
> + * For a relation in a publication that is known to have a non-null column
> + * list, return the list of attribute numbers that are in it.
> + */
> +List *
> +GetRelationColumnListInPublication(Oid relid, Oid pubid)
> {
> ...
> }
>
> Both these functions are not required now. So, we can remove them.
>

Good catch, removed.

> 2.
> @@ -464,11 +478,11 @@ logicalrep_write_update(StringInfo out,
> TransactionId xid, Relation rel,
>   pq_sendbyte(out, 'O'); /* old tuple follows */
>   else
>   pq_sendbyte(out, 'K'); /* old key follows */
> - logicalrep_write_tuple(out, rel, oldslot, binary);
> + logicalrep_write_tuple(out, rel, oldslot, binary, columns);
> }
>
> As mentioned previously, here, we should pass NULL similar to
> logicalrep_write_delete as we don't need to use column list for old
> tuples.
>

Fixed.

> 3.
> + * XXX The name is a bit misleading, because we don't really transform
> + * anything here - we merely check the column list is compatible with the
> + * definition of the publication (with publish_via_partition_root=false)
> + * we only allow column lists on the leaf relations. So maybe rename it?
> + */
> +static void
> +TransformPubColumnList(List *tables, const char *queryString,
> +                       bool pubviaroot)
>
> The second parameter is not used in this function. As noted in the
> comments, I also think it is better to rename this. How about
> ValidatePubColumnList?
>
> 4.
> @@ -821,6 +942,9 @@ fetch_remote_table_info(char *nspname, char *relname,
>   *
>   * 3) one of the subscribed publications is declared as ALL TABLES IN
>   * SCHEMA that includes this relation
> + *
> + * XXX Does this actually handle puballtables and schema publications
> + * correctly?
>   */
>   if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
>
> Why is this comment added in the row filter code? Now, both row filter
> and column list are fetched in the same way, so not sure what exactly
> this comment is referring to.
>

I added that comment as a note to myself while learning how the code
works, and forgot to remove it.

> 5.
> +/* qsort comparator for attnums */
> +static int
> +compare_int16(const void *a, const void *b)
> +{
> +   int av = *(const int16 *) a;
> +   int bv = *(const int16 *) b;
> +
> +   /* this can't overflow if int is wider than int16 */
> +   return (av - bv);
> +}
>
> The exact same code exists in statscmds.c. Do we need a second copy of
> the same?
>

Yeah, I thought about moving it to some common header, but I don't think
it's really worth it at this point.

> 6.
> static void pgoutput_row_filter_init(PGOutputData *data,
>   List *publications,
>   RelationSyncEntry *entry);
> +
> static bool pgoutput_row_filter_exec_expr(ExprState *state,
>
> Spurious line addition.
>

Fixed.

> 7. The tests in 030_column_list.pl take a long time as compared to all
> other similar individual tests in the subscription folder. I haven't
> checked whether there is any need to reduce some tests but it seems
> worth checking.
>

On my machine, 'make check' in src/test/subscription takes ~150 seconds
(with asserts and -O0), and the new script takes ~14 seconds, while most
other tests take 3-6 seconds. AFAICS that's simply due to the number of
tests in the script, and I don't think any of them are unnecessary. I was
adding them in response to issues reported during development, or to
cover various important cases.

So I don't think we can easily remove any of them :-( And it's not as if
the tests use massive amounts of data, either. We could split the script,
but that obviously won't reduce the overall duration. So I decided to
keep the test as is for now, and maybe we can try trimming it after a
couple of buildfarm runs.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company