Re: Handle infinite recursion in logical replication setup

From Amit Kapila
Subject Re: Handle infinite recursion in logical replication setup
Msg-id CAA4eK1+UTzBq02SmBzCFFeyubavBdgMN4p5FMdsfQKQJBoTo=Q@mail.gmail.com
In response to Re: Handle infinite recursion in logical replication setup  ("Jonathan S. Katz" <jkatz@postgresql.org>)
Responses Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Re: Handle infinite recursion in logical replication setup  ("Jonathan S. Katz" <jkatz@postgresql.org>)
List pgsql-hackers
On Fri, Jul 22, 2022 at 1:39 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:
>
> Thanks for the work on this feature -- this is definitely very helpful
> towards supporting more types of use cases with logical replication!
>
> I've read through the proposed documentation and did some light testing
> of the patch. I have two general comments about the docs as they
> currently read:
>
> 1. I'm concerned by calling this "Bidirectional replication" in the docs
> that we are overstating the current capabilities. I think this is
> accentuated in the opening paragraph:
>
> ==snip==
>   Bidirectional replication is useful for creating a multi-master database
>   environment for replicating read/write operations performed by any of the
>   member nodes.
> ==snip==
>
> For one, we're not replicating reads, we're replicating writes. Amongst
> the writes, at this point we're only replicating DML. A reader could
> think that deploying this provides a full bidirectional solution.
>
> (Even if we're aspirationally calling this section "Bidirectional
> replication", that does make it sound like we're limited to two nodes,
> when we can support more than two).
>

Right, I think the system can support N-Way replication.

> Perhaps "Logical replication between writers" or "Logical replication
> between primaries" or "Replicating changes between primaries", or
> something better.
>

Among the above "Replicating changes between primaries" sounds good to
me or simply "Replication between primaries". As this is a sub-section
on the Logical Replication page, I feel it is okay to not use Logical
in the title.

> 2. There is no mention of conflicts in the documentation, e.g.
> referencing the "Conflicts" section of the documentation. It's very easy
> to create a conflicting transaction that causes a subscriber to be
> unable to continue to apply transactions:
>
>    -- DB 1
>    CREATE TABLE abc (id int);
>    CREATE PUBLICATION node1 FOR ALL TABLES ;
>
>    -- DB2
>    CREATE TABLE abc (id int);
>    CREATE PUBLICATION node2 FOR ALL TABLES ;
>    CREATE SUBSCRIPTION node2_node1
>      CONNECTION 'dbname=logi port=5433'
>      PUBLICATION node1
>      WITH (copy_data = off, origin = none);
>
>    -- DB1
>    CREATE SUBSCRIPTION node1_node2
>      CONNECTION 'dbname=logi port=5434'
>      PUBLICATION node2
>      WITH (copy_data = off, origin = none);
>    INSERT INTO abc VALUES (1);
>
>    -- DB2
>    INSERT INTO abc VALUES (2);
>
>    -- DB1
>    ALTER TABLE abc ADD PRIMARY KEY (id);
>    INSERT INTO abc VALUES (3);
>
>    -- DB2
>    INSERT INTO abc VALUES (3);
>
>    -- DB1 cannot apply the transactions
>
> At a minimum, I think we should reference the documentation we have in
> the logical replication section on conflicts. We may also want to advise
> that a user is responsible for designing their schemas in a way to
> minimize the risk of conflicts.
>

This sounds reasonable to me.
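
As one illustration of the kind of schema design such advice could point
to (this is an assumption for the sake of example, not something proposed
in the patch or in this thread), each node can be made to generate keys
from a disjoint range, so that the two INSERTs of id 3 in the example
above cannot collide. A minimal sketch for two nodes, reusing the table
name abc from the example:

```sql
-- DB1: locally generated ids are odd (1, 3, 5, ...)
CREATE TABLE abc (
    id int GENERATED BY DEFAULT AS IDENTITY
        (START WITH 1 INCREMENT BY 2) PRIMARY KEY
);

-- DB2: locally generated ids are even (2, 4, 6, ...)
CREATE TABLE abc (
    id int GENERATED BY DEFAULT AS IDENTITY
        (START WITH 2 INCREMENT BY 2) PRIMARY KEY
);

-- On either node, let the identity column pick the key
-- instead of supplying it by hand.
INSERT INTO abc DEFAULT VALUES;
```

This only reduces the risk for generated keys. Explicitly supplied ids,
and UPDATEs or DELETEs of the same row on both nodes, can still conflict.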

One more point about the docs: the new section appears to be added as
the last sub-section on the Logical Replication page. Is there a reason
for doing so? I feel this should be the third sub-section, after the
ones describing Publication and Subscription.

BTW, do you have any opinion on the idea of the first remaining patch,
which accomplishes two things: a) It checks and throws an error if
'copy_data = on' and 'origin = none' but the publication tables were
also replicated from other publishers. b) It adds a 'force' value for
the copy_data parameter to allow copying in such a case. The primary
reason for this patch is to avoid loops or duplicate data in the
initial phase. We can't skip copying based on origin as we can
while replicating changes from WAL. So, we detect that the publisher
already has data from some other node and don't allow replication
unless the user uses the 'force' option for copy_data.
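
To make the check in (a) concrete, here is a rough sketch of a query a
user could run by hand on the publisher to see whether any table in a
publication is itself the target of a subscription there. It is only an
illustration using existing catalogs and is not necessarily the query
the patch uses. The publication name node1 is taken from the example
above.

```sql
-- Run on the publisher: list tables in publication node1 that are
-- also subscribed to from some other node.
SELECT pt.pubname, pt.schemaname, pt.tablename
FROM pg_publication_tables AS pt
JOIN pg_namespace AS n ON n.nspname = pt.schemaname
JOIN pg_class AS c ON c.relnamespace = n.oid
                  AND c.relname = pt.tablename
JOIN pg_subscription_rel AS sr ON sr.srrelid = c.oid
WHERE pt.pubname = 'node1';

-- With the proposed patch, if the query above returns rows, a
-- subscription created WITH (copy_data = on, origin = none) would be
-- rejected, and the user would have to opt in explicitly. The
-- statement is commented out because 'force' exists only in the patch:
--
-- CREATE SUBSCRIPTION node2_node1
--   CONNECTION 'dbname=logi port=5433'
--   PUBLICATION node1
--   WITH (copy_data = force, origin = none);
```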

-- 
With Regards,
Amit Kapila.


