Обсуждение: Remove duplicate table scan in logical apply worker and code refactoring

Поиск

Список

Период

Сортировка

Remove duplicate table scan in logical apply worker and code refactoring

От

"Zhijie Hou (Fujitsu)"

Дата:

25 июля, 13:30:21

Hi,

When reviewing the code in logical/worker.c, I noticed that when applying a
cross-partition update action, it scans the old partition twice.
I am attaching the patch 0001 to remove this duplicate table scan.

The test shows that it brings noticeable improvement:

Steps
-----
Pub:
create table tab (a int not null, b int);
alter table tab replica identity full;
insert into tab select 1,generate_series(1, 1000000, 1);

Sub:
create table tab (a int not null, b int) partition by range (b);
create table tab_1 partition of tab for values from (minvalue) to (5000000);
create table tab_2 partition of tab for values from (5000000) to (maxvalue);
alter table tab replica identity full;


Test query:
update tab set b = 6000000 where b > 999900; -- UPDATE 100

Results (The time spent by apply worker to apply the all the UPDATEs):
Before    14s
After    7s
-----

Apart from above, I found there are quite a few duplicate codes related to partition
handling(e.g. apply_handle_tuple_routing), so I tried to extract some
common logic to simplify the codes. Please see 0002 for this refactoring.

Best Regards,
Hou Zhijie

Вложения

Re: Remove duplicate table scan in logical apply worker and code refactoring

От

Kirill Reshke

Дата:

25 июля, 15:25:53

Hi!

> When reviewing the code in logical/worker.c, I noticed that when applying a
> cross-partition update action, it scans the old partition twice.

Nice catch!


> -/*
> - * Workhorse for apply_handle_update()
> - * relinfo is for the relation we're actually updating in
> - * (could be a child partition of edata->targetRelInfo)
> - */
> -static void
> -apply_handle_update_internal(ApplyExecutionData *edata,
> - ResultRelInfo *relinfo,
> - TupleTableSlot *remoteslot,
> - LogicalRepTupleData *newtup,
> - Oid localindexoid)
> -{

What's the necessity of this change? Can we modify a function in-place
instead of removing and rewriting it in the same file?
This will reduce diff, making patch a bit more clear.

> +/*
> + * If the tuple to be modified could not be found, a log message is emitted.
> + */
> +static void
> +report_tuple_not_found(CmdType cmd, Relation targetrel, bool is_partition)
> +{
> + Assert(cmd == CMD_UPDATE || cmd == CMD_DELETE);
> +
> + /* XXX should this be promoted to ereport(LOG) perhaps? */
> + elog(DEBUG1,
> + "logical replication did not find row to be %s in replication target relation%s \"%s\"",
> + cmd == CMD_UPDATE ? "updated" : "deleted",
> + is_partition ? "'s partition" : "",
> + RelationGetRelationName(targetrel));
> +}

Encapsulating report logic inside function is ok, but double ternary
expression is a bit out of line. I do not see similar places within
PostgreSQL,
so it probably violates code style.

Re: Remove duplicate table scan in logical apply worker and code refactoring

От

Ashutosh Bapat

Дата:

26 июля, 11:54:16

On Thu, Jul 25, 2024 at 5:56 PM Kirill Reshke <reshkekirill@gmail.com> wrote:

> > +/*
> > + * If the tuple to be modified could not be found, a log message is emitted.
> > + */
> > +static void
> > +report_tuple_not_found(CmdType cmd, Relation targetrel, bool is_partition)
> > +{
> > + Assert(cmd == CMD_UPDATE || cmd == CMD_DELETE);
> > +
> > + /* XXX should this be promoted to ereport(LOG) perhaps? */
> > + elog(DEBUG1,
> > + "logical replication did not find row to be %s in replication target relation%s \"%s\"",
> > + cmd == CMD_UPDATE ? "updated" : "deleted",
> > + is_partition ? "'s partition" : "",
> > + RelationGetRelationName(targetrel));
> > +}
>
> Encapsulating report logic inside function is ok, but double ternary
> expression is a bit out of line. I do not see similar places within
> PostgreSQL,
> so it probably violates code style.
>

They it's written, it would make it hard for the translations. See [1]
which redirects to [2].

[1] https://www.postgresql.org/docs/current/error-style-guide.html
[2] https://www.postgresql.org/docs/current/nls-programmer.html#NLS-GUIDELINES

--
Best Wishes,
Ashutosh Bapat

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Remove duplicate table scan in logical apply worker and code refactoring

Remove duplicate table scan in logical apply worker and code refactoring

Вложения

Re: Remove duplicate table scan in logical apply worker and code refactoring

Re: Remove duplicate table scan in logical apply worker and code refactoring