Re: [BDR] Node Join Question

From	Wayne E. Seguin
Subject	Re: [BDR] Node Join Question
Date	12 May 2015, 16:50:27
Msg-id	CANf8RLt3L-WuW7CrRw2WjOcBwbmmc-YDqd4dRudbWLyh9y8=Jg@mail.gmail.com
In reply to	Re: [BDR] Node Join Question (Craig Ringer <craig@2ndquadrant.com>)
List	pgsql-general

Craig, thank you so much for the quick response!

Adding these cleanup functions sounds wonderful, thank you for looking into that.

One question: why template0 rather than template1? (My guess is that you want the new database to be devoid of pretty much everything?)

On Tue, May 12, 2015 at 1:31 AM, Craig Ringer <craig@2ndquadrant.com> wrote:

On 12 May 2015 at 14:36, Wayne E. Seguin <wayneeseguin@gmail.com> wrote:
Also, is there an easier way to remove these things from the init target node?

d= p=504 a=ERROR: 55000: previous init failed, manual cleanup is required
d= p=504 a=DETAIL: Found bdr.bdr_nodes entry for bdr (6147869128174526660,1,16908,) with state=i in remote bdr.bdr_nodes
d= p=504 a=HINT: Remove all replication identifiers and slots corresponding to this node from the init target node then drop and recreate this database and try again

Now that we have SQL-level join it'd probably make sense to provide a cleanup function for failed node joins. At this point there's no such function.

Take note of the node identity given in the error as it corresponds to the replication identifier name and slot name.

You need to, on the join target node:

SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name = bdr.bdr_format_slot_name('6147869128174526660',1,16908);

where the sysid, timeline ID and database OID are those given in the error. You must run this from the target node's database, as it'll only consider slots for the current database.

Then

SELECT pg_replication_identifier_drop(...)

to drop the replication identifier that was used, after looking up its name in pg_catalog.pg_replication_identifier. There isn't an equivalent of bdr.bdr_format_slot_name for replication identifiers; I'll look at adding one. In the meantime, look it up visually or write a simple function to format the string.
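
A rough sketch of that lookup, assuming the catalog exposes the identifier name in a text column called riname (check with \d pg_catalog.pg_replication_identifier first) and that the name embeds the sysid from the error message:

```sql
-- Assumption: the catalog has a text column named riname, and the
-- identifier name contains the failed node's sysid from the error.
SELECT riname
FROM pg_catalog.pg_replication_identifier
WHERE riname LIKE '%6147869128174526660%';

-- Then pass the matching name, copied from the output above, to the
-- drop function. The placeholder below is not a real identifier name.
-- SELECT pg_replication_identifier_drop('<riname from the query above>');
```

If more than one row comes back, compare the timeline ID and database OID in each name against the error before dropping anything.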

Then delete the bdr.bdr_nodes entry for the failed-to-join node and any bdr.bdr_connections entries for it.
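
Something along these lines might do it, assuming the key columns are named node_sysid, node_timeline and node_dboid in bdr.bdr_nodes, and conn_sysid, conn_timeline and conn_dboid in bdr.bdr_connections (verify the column names against your BDR version before running). The values are the ones from the error above:

```sql
BEGIN;

-- Inspect first: only the failed-to-join node (state = i) should match.
SELECT *
FROM bdr.bdr_nodes
WHERE node_sysid = '6147869128174526660'
  AND node_timeline = 1
  AND node_dboid = 16908;

-- Remove the connection entries before the node entry, in case the
-- connections table references the nodes table.
DELETE FROM bdr.bdr_connections
WHERE conn_sysid = '6147869128174526660'
  AND conn_timeline = 1
  AND conn_dboid = 16908;

DELETE FROM bdr.bdr_nodes
WHERE node_sysid = '6147869128174526660'
  AND node_timeline = 1
  AND node_dboid = 16908;

COMMIT;
```

Running it inside a transaction lets you ROLLBACK if the SELECT shows anything other than the failed node.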

You *must* drop and re-create the database on the failed-to-join node, making a new blank db (preferably from template0).
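
For instance, with failed_join_db standing in as a hypothetical name for the database on the failed-to-join node:

```sql
-- Run while connected to a different database (e.g. postgres), with no
-- other sessions connected to the database being dropped.
DROP DATABASE failed_join_db;                       -- hypothetical name
CREATE DATABASE failed_join_db TEMPLATE template0;  -- new blank db from template0
```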