Обсуждение: [GENERAL] pglogical bidirectional replication of sequences
TLDR: Don't. I'm currently conducting tests which should eventually lead to a 2 node cluster with working bidirectional logical replication. (Postgresql 9.6.4-1.pgdg90+1, pglogical 2.0.1-1.jessie+1 on Debian 9 (Stretch)) pglogical supports replication of sequences, and although the way it does this suggests that it can't really work in both directions (actually I'm sceptical that it works reliably in one direction), of course I had to try it. So I created a sequence on both nodes and called select pglogical.replication_set_add_sequence('default', 'test_sequence'); on both nodes. The result was ... interesting. First I got the same sequence (1, 2, 3, 4, 5) on both nodes. After a few seconds the replication kicked in, and then I got the same value (1005) on both nodes most of the time, with a few variants (2005, 3005) thrown in. In a word, the sequence was completely unusable. Experiment completed, so I removed the sequence from the replication set: select pglogical.replication_set_remove_sequence('default', 'test_sequence'); on both nodes. But the behaviour of the sequence doesn't change. It still returns 1005 most of the time, and sometimes 2005 or 3005. This is true even after restarting both nodes. Plus, I can't drop the sequence any more (as the user who created the sequence): wds=> drop sequence public.test_sequence ; ERROR: permission denied for schema pglogical So, clearly, pglogical is still managing that sequence. If I drop the sequence as postgres and then recreate it, it works normally for some time (also the sequence on the other node now works normally), but after some time, the replication kicks in again and the sequence is stuck again at 1005. So, is there a way to recover from this situation without drastic measures like nuking the whole database. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Вложения
On 09/01/2017 02:29 AM, Peter J. Holzer wrote: > TLDR: Don't. > > I'm currently conducting tests which should eventually lead to a 2 node > cluster with working bidirectional logical replication. > > (Postgresql 9.6.4-1.pgdg90+1, pglogical 2.0.1-1.jessie+1 on Debian 9 > (Stretch)) > > pglogical supports replication of sequences, and although the way it > does this suggests that it can't really work in both directions > (actually I'm sceptical that it works reliably in one direction), of > course I had to try it. > > So I created a sequence on both nodes and called > select pglogical.replication_set_add_sequence('default', 'test_sequence'); > on both nodes. > > The result was ... interesting. > > First I got the same sequence (1, 2, 3, 4, 5) on both nodes. > > After a few seconds the replication kicked in, and then I got the same > value (1005) on both nodes most of the time, with a few variants (2005, > 3005) thrown in. > > In a word, the sequence was completely unusable. > > Experiment completed, so I removed the sequence from the replication > set: > > select pglogical.replication_set_remove_sequence('default', 'test_sequence'); > on both nodes. > > But the behaviour of the sequence doesn't change. It still returns 1005 > most of the time, and sometimes 2005 or 3005. This is true even after > restarting both nodes. > > Plus, I can't drop the sequence any more (as the user who created the > sequence): > > wds=> drop sequence public.test_sequence ; > ERROR: permission denied for schema pglogical > > So, clearly, pglogical is still managing that sequence. > > If I drop the sequence as postgres and then recreate it, it works > normally for some time (also the sequence on the other node now works > normally), but after some time, the replication kicks in again and the > sequence is stuck again at 1005. > > So, is there a way to recover from this situation without drastic > measures like nuking the whole database. > > hp I trust you mean don't use sequences
On 2017-09-01 09:57:52 -0600, Rob Sargent wrote: > On 09/01/2017 02:29 AM, Peter J. Holzer wrote: > >TLDR: Don't. > > > >I'm currently conducting tests which should eventually lead to a 2 node > >cluster with working bidirectional logical replication. > > > >(Postgresql 9.6.4-1.pgdg90+1, pglogical 2.0.1-1.jessie+1 on Debian 9 > >(Stretch)) > > > >pglogical supports replication of sequences, and although the way it > >does this suggests that it can't really work in both directions > >(actually I'm sceptical that it works reliably in one direction), of > >course I had to try it. > > [and it blew up] > I trust you mean don't use sequences I trust you don't mean what I understood ;-). Seriously: Sequences in general are fine and very useful. I think they should be used where appropriate. Sequences and logical replication don't mix well. That still doesn't mean that you can't use sequences, you just have to be careful how you use them. Since replicating sequence state doesn't really work, I think it is best to use independent sequences on each node and just configure them in a way that they can not produce the same values. A naive approach would be to use MINVALUE/MAXVALUE/START WITH to ensure non-overlapping ranges. A somewhat more elegant approach is to increment by $n$ (the number of nodes in the cluster) and use different start values (I got that idea from http://thedumbtechguy.blogspot.co.at/2017/04/demystifying-pglogical-tutorial.html). There are other ways to get unique ids: A uuid should work pretty well in most cases, and in some even a random 64 bit int might be enough. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Вложения
On 2017-09-01 10:29:51 +0200, Peter J. Holzer wrote: > pglogical supports replication of sequences, and although the way it > does this suggests that it can't really work in both directions > (actually I'm sceptical that it works reliably in one direction), of > course I had to try it. > > So I created a sequence on both nodes and called > select pglogical.replication_set_add_sequence('default', 'test_sequence'); > on both nodes. > > The result was ... interesting. > > First I got the same sequence (1, 2, 3, 4, 5) on both nodes. > > After a few seconds the replication kicked in, and then I got the same > value (1005) on both nodes most of the time, with a few variants (2005, > 3005) thrown in. > > In a word, the sequence was completely unusable. [...some failed attempts to recover...] > So, is there a way to recover from this situation without drastic > measures like nuking the whole database. To answer my own question: delete from pglogical.queue where message_type='S'; on both nodes seems to have the desired effect. A vacuum full pglogical.queue afterwards is a good idea to get the bloated table back to a reasonable size. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>