Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От Jonathan S. Katz
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id c78e8807-ae71-5aeb-a259-f8cf27d3f36f@postgresql.org
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Handle infinite recursion in logical replication setup  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Re: Handle infinite recursion in logical replication setup  (vignesh C <vignesh21@gmail.com>)
Список pgsql-hackers
On 7/22/22 12:47 AM, Amit Kapila wrote:
> On Fri, Jul 22, 2022 at 1:39 AM Jonathan S. Katz <jkatz@postgresql.org> wrote:

>> 1. I'm concerned by calling this "Bidirectional replication" in the docs
>> that we are overstating the current capabilities. I think this is
>> accentuated int he opening paragraph:
>>
>> ==snip==
>>    Bidirectional replication is useful for creating a multi-master database
>>    environment for replicating read/write operations performed by any of the
>>    member nodes.
>> ==snip==
>>
>> For one, we're not replicating reads, we're replicating writes. Amongst
>> the writes, at this point we're only replicating DML. A reader could
>> think that deploying can work for a full bidirectional solution.
>>
>> (Even if we're aspirationally calling this section "Bidirectional
>> replication", that does make it sound like we're limited to two nodes,
>> when we can support more than two).
>>
> 
> Right, I think the system can support N-Way replication.

I did some more testing of the feature, i.e. doing 3-node and 4-node 
tests. While logical replication today can handle replicating between 
multiple nodes (N-way), the "origin = none" does require setting up 
subscribers between each of the nodes.

For example, if I have 4 nodes A, B, C, D and I want to replicate the 
same table between all of them, I need to set up subscribers between all 
of them (A<=>B, A<=>C, A<=>D, B<=>C, B<=>D, C<=>D). However, each node 
can replicate between each other in a way that's convenient (vs. having 
to do something funky with partitions) so this is still a big step forward.

This is a long way of saying that I do think it's fair to say we support 
"N-way" replication so long as you are set up in a mesh (vs. a ring, 
e.g. A=>B=>C=>D=>A).

> Among the above "Replicating changes between primaries" sounds good to
> me or simply "Replication between primaries". As this is a sub-section
> on the Logical Replication page, I feel it is okay to not use Logical
> in the title.

Agreed, I think that's fine.

>> At a minimum, I think we should reference the documentation we have in
>> the logical replication section on conflicts. We may also want to advise
>> that a user is responsible for designing their schemas in a way to
>> minimize the risk of conflicts.
>>
> 
> This sounds reasonable to me.
> 
> One more point about docs, it appears to be added as the last
> sub-section on the Logical Replication page. Is there a reason for
> doing so? I feel this should be third sub-section after describing
> Publication and Subscription.

When I first reviewed, I had not built the docs. Did so on this pass.

I agree with the positioning argument, i.e. it should go after 
"Subscription" in the table of contents -- but it makes me question a 
couple of things:

1. The general ordering of the docs
2. How we describe that section (more on that in a sec)
3. If "row filters" should be part of "subscription" instead of its own 
section.

If you look at the current order, "Quick setup" is the last section; one 
would think the "quick" portion goes first :) Given a lot of this is for 
the current docs, I may start a separate discussion on -docs for this part.

For the time being, I agree it should be moved to the section after 
"Subscription".

I think what this section describes is "Configuring Replication Between 
Nodes" as it covers a few different scenarios.

I do think we need to iterate on these docs -- the examples with the 
commands are generally OK and easy to follow, but a few things I noticed:

1. The general description of the section needs work. We may want to 
refine the description of the use cases, and in the warning, link to 
instructions on how to take backups.
2. We put the "case not supported" in the middle, not at the end.
3. The "generic steps for adding a new node..." section uses a 
convention for steps that is not found in the docs. We also don't 
provide an example for this section, and this is the most complicated 
scenario to set up.

I may be able to propose some suggestions in a few days.

> BTW, do you have any opinion on the idea of the first remaining patch
> where we accomplish two things: a) Checks and throws an error if
> 'copy_data = on' and 'origin = none' but the publication tables were
> also replicated from other publishers. b) Adds 'force' value for
> copy_data parameter to allow copying in such a case. The primary
> reason for this patch is to avoid loops or duplicate data in the
> initial phase. We can't skip copying based on origin as we can do
> while replicating changes from WAL. So, we detect that the publisher
> already has data from some other node and doesn't allow replication
> unless the user uses the 'force' option for copy_data.

In general, I agree with the patch; but I'm not sure why we are calling 
"copy_data = force" in this case and how it varies from "on". I 
understand the goal is to prevent the infinite loop, but is there some 
technical restriction why we can't set "origin = none, copy_data = on" 
and have this work (and apologies if I missed that upthread)?

The other concern I'll note is that we're changing a boolean parameter 
to an enum and I want to be sensitive to folks who are already using 
"copy_data" to be sure we don't break them.

Jonathan


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Dunstan
Дата:
Сообщение: fairywren hung in pg_basebackup tests
Следующее
От: Tom Lane
Дата:
Сообщение: Re: fairywren hung in pg_basebackup tests