Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From	Marco Slot
Subject	Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date	July 20, 2022, 10:15:12
Msg-id	CAFMSG9G0Pr=feCDwxJGF2=xBSG69iXJZLDiFKyr3hD7bYhb33Q@mail.gmail.com
In reply to	Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On Mon, Jul 18, 2022 at 8:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> IIUC, this proposal is to optimize cases where users can't have a
> unique/primary key for a relation on the subscriber and those
> relations receive lots of updates or deletes?

I think this patch optimizes for all non-trivial cases of update/delete replication (e.g. >1000 rows in the table, >1000 rows per hour updated) without a primary key. For instance, it's quite common to have a large append-mostly events table without a primary key (e.g. because of partitioning, or insertion speed), which will still have occasional batch updates/deletes.

Imagine an update of a table or partition with 1 million rows and a typical scan speed of 1M rows/sec. An update on the whole table takes maybe 1-2 seconds. Replicating that update with one sequential scan per row can take on the order of 1M seconds, or roughly 12 days.
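
To make the arithmetic behind that estimate explicit, here is a rough back-of-the-envelope sketch. It assumes every replicated row change costs one full sequential scan of the table on the subscriber (a scan that stops at the first match would roughly halve the figure on average). The function and variable names are made up for illustration and are not taken from the patch.

```python
SECONDS_PER_DAY = 86_400


def seq_scan_apply_seconds(table_rows, updated_rows, scan_rows_per_sec):
    """Estimate apply time, assuming one full sequential scan per replicated row."""
    seconds_per_scan = table_rows / scan_rows_per_sec
    return updated_rows * seconds_per_scan


# Numbers from the example above: 1M rows, all of them updated, 1M rows/sec scan speed.
total_seconds = seq_scan_apply_seconds(
    table_rows=1_000_000,
    updated_rows=1_000_000,
    scan_rows_per_sec=1_000_000,
)
total_days = total_seconds / SECONDS_PER_DAY

print(f"{total_seconds:,.0f} s = {total_days:.1f} days")  # 1,000,000 s = 11.6 days
```

The cost grows with the product of table size and number of changed rows, which is why a statement that finishes in seconds on the publisher can take days to apply.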

The current implementation makes REPLICA IDENTITY FULL a huge liability, and impractical, for scenarios where you want to replicate an arbitrary set of user-defined tables, such as upgrades, migrations, and shard moves. We generally recommend that users tolerate update/delete errors in such scenarios.

If the apply worker can use an index, the data migration tool can tactically create one on a high-cardinality column, which would practically always be better than doing a sequential scan for non-trivial workloads.

cheers,
Marco
