Обсуждение: Infinite replication loop with pglogical

Поиск
Список
Период
Сортировка

Infinite replication loop with pglogical

От
"Peter J. Holzer"
Дата:
[I'm not sure whether this is the right list to discuss third party
extensions like 2ndQuadrant's pglogical. If there is a better place,
feel free to direct me to it]

We are testing bidirectional replication with pglogical:

Node 1:
    postgresql-9.6           9.6.6-1.pgdg90+1
    postgresql-9.6-pglogical 2.1.0-1.jessie+1
Node 2:
    postgresql-10            10.1-1.pgdg90+1
    postgresql-10-pglogical  2.1.0-1.stretch+1

(They are on different versions because we wanted to test a staggered
upgrade)

We have configured bidirectional replication on several tables. In
general that seems to work fine. When a row is inserted/updated/
deleted on one node, it is replicated to the other.

But now I seem to have triggered a replication loop:

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          | start_ts | update_ts | finished_ts | module_id | status_id |
status_msg| pid | username  

----------+-----------+----------------------------+----------+-----------+-------------+-----------+-----------+------------+-----+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 |          |           |             |        80 |         0 |
    |   0 | hjp 
(1 row)

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          |          start_ts          |         update_ts          |
finished_ts| module_id | status_id | status_msg | pid  | username  

----------+-----------+----------------------------+----------------------------+----------------------------+-------------+-----------+-----------+------------+------+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 | 2017-12-07 11:39:09.914616 | 2017-12-07 11:39:09.914616 |
  |        80 |         0 |            | 6863 | hjp 
(1 row)

wds=> select id, source_id, insert_ts, start_ts, update_ts, finished_ts, module_id, status_id, status_msg, pid,
usernamefrom qualitygate.jobs where id = 19142008; 
    id    | source_id |         insert_ts          |          start_ts          |         update_ts         |
finished_ts       | module_id | status_id | status_msg | pid | username  

----------+-----------+----------------------------+----------------------------+---------------------------+---------------------------+-----------+-----------+------------+-----+----------
 19142008 |         9 | 2017-12-07 11:39:09.904071 | 2017-12-07 11:39:09.914616 | 2017-12-07 11:39:09.95925 |
2017-12-0711:39:09.95925 |        80 |         0 |            |  -1 | hjp 
(1 row)

Subsequent selects randomly return one of three states for this row
(other rows are stable). The order above is the "logical order", i.e.
the row was first inserted with the first state, then updated to the
second, and finally to the third.

I suspect that the updates are bounced between the two nodes eternally
overwriting each other and never reaching a stable state.

Has anybody seen this? If so, is there a way to reliably avoid this?
(Maybe with a different pglogical.conflict_resolution setting, but if it
was a conflict, I should see something in the logs, right?)

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Вложения

Re: Infinite replication loop with pglogical

От
"Peter J. Holzer"
Дата:
On 2017-12-07 14:41:31 +0100, Peter J. Holzer wrote:
> We are testing bidirectional replication with pglogical:
[...]
> We have configured bidirectional replication on several tables. In
> general that seems to work fine. When a row is inserted/updated/
> deleted on one node, it is replicated to the other.
>
> But now I seem to have triggered a replication loop:
[...]
> Subsequent selects randomly return one of three states for this row
> (other rows are stable). The order above is the "logical order", i.e.
> the row was first inserted with the first state, then updated to the
> second, and finally to the third.
>
> I suspect that the updates are bounced between the two nodes eternally
> overwriting each other and never reaching a stable state.
>
> Has anybody seen this? If so, is there a way to reliably avoid this?
> (Maybe with a different pglogical.conflict_resolution setting, but if it
> was a conflict, I should see something in the logs, right?)

This paragraph from
https://github.com/2ndQuadrant/pglogical/blob/master/README.md#subscription-management
looks relevant:

| * forward_origins - array of origin names to forward, currently only
| supported values are empty array meaning don't forward any changes
| that didn't originate on provider node (this is useful for two-way
| replication between the nodes), or "{all}" which means replicate all
| changes no matter what is their origin, default is "{all}"

I didn't change the default, so that means that each change is
replicated back to the originator. I guess that doesn't matter if it
happens before the next change (then the change is essentially a noop
and won't be replicated again), but it does trigger a loop if the row
has changed (because then it is changed again, which means that the
previous change will trigger another replication, and so on.)

        hp


--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Вложения