Обсуждение: Remove duplicate table scan in logical apply worker and code refactoring
Remove duplicate table scan in logical apply worker and code refactoring
От
"Zhijie Hou (Fujitsu)"
Дата:
Hi, When reviewing the code in logical/worker.c, I noticed that when applying a cross-partition update action, it scans the old partition twice. I am attaching the patch 0001 to remove this duplicate table scan. The test shows that it brings noticeable improvement: Steps ----- Pub: create table tab (a int not null, b int); alter table tab replica identity full; insert into tab select 1,generate_series(1, 1000000, 1); Sub: create table tab (a int not null, b int) partition by range (b); create table tab_1 partition of tab for values from (minvalue) to (5000000); create table tab_2 partition of tab for values from (5000000) to (maxvalue); alter table tab replica identity full; Test query: update tab set b = 6000000 where b > 999900; -- UPDATE 100 Results (The time spent by apply worker to apply the all the UPDATEs): Before 14s After 7s ----- Apart from above, I found there are quite a few duplicate codes related to partition handling(e.g. apply_handle_tuple_routing), so I tried to extract some common logic to simplify the codes. Please see 0002 for this refactoring. Best Regards, Hou Zhijie
Вложения
Hi! > When reviewing the code in logical/worker.c, I noticed that when applying a > cross-partition update action, it scans the old partition twice. Nice catch! > -/* > - * Workhorse for apply_handle_update() > - * relinfo is for the relation we're actually updating in > - * (could be a child partition of edata->targetRelInfo) > - */ > -static void > -apply_handle_update_internal(ApplyExecutionData *edata, > - ResultRelInfo *relinfo, > - TupleTableSlot *remoteslot, > - LogicalRepTupleData *newtup, > - Oid localindexoid) > -{ What's the necessity of this change? Can we modify a function in-place instead of removing and rewriting it in the same file? This will reduce diff, making patch a bit more clear. > +/* > + * If the tuple to be modified could not be found, a log message is emitted. > + */ > +static void > +report_tuple_not_found(CmdType cmd, Relation targetrel, bool is_partition) > +{ > + Assert(cmd == CMD_UPDATE || cmd == CMD_DELETE); > + > + /* XXX should this be promoted to ereport(LOG) perhaps? */ > + elog(DEBUG1, > + "logical replication did not find row to be %s in replication target relation%s \"%s\"", > + cmd == CMD_UPDATE ? "updated" : "deleted", > + is_partition ? "'s partition" : "", > + RelationGetRelationName(targetrel)); > +} Encapsulating report logic inside function is ok, but double ternary expression is a bit out of line. I do not see similar places within PostgreSQL, so it probably violates code style.
Re: Remove duplicate table scan in logical apply worker and code refactoring
От
Ashutosh Bapat
Дата:
On Thu, Jul 25, 2024 at 5:56 PM Kirill Reshke <reshkekirill@gmail.com> wrote: > > +/* > > + * If the tuple to be modified could not be found, a log message is emitted. > > + */ > > +static void > > +report_tuple_not_found(CmdType cmd, Relation targetrel, bool is_partition) > > +{ > > + Assert(cmd == CMD_UPDATE || cmd == CMD_DELETE); > > + > > + /* XXX should this be promoted to ereport(LOG) perhaps? */ > > + elog(DEBUG1, > > + "logical replication did not find row to be %s in replication target relation%s \"%s\"", > > + cmd == CMD_UPDATE ? "updated" : "deleted", > > + is_partition ? "'s partition" : "", > > + RelationGetRelationName(targetrel)); > > +} > > Encapsulating report logic inside function is ok, but double ternary > expression is a bit out of line. I do not see similar places within > PostgreSQL, > so it probably violates code style. > They it's written, it would make it hard for the translations. See [1] which redirects to [2]. [1] https://www.postgresql.org/docs/current/error-style-guide.html [2] https://www.postgresql.org/docs/current/nls-programmer.html#NLS-GUIDELINES -- Best Wishes, Ashutosh Bapat