Re: pglogical - logical replication contrib module
От | Petr Jelinek |
---|---|
Тема | Re: pglogical - logical replication contrib module |
Дата | |
Msg-id | 569925DA.7090102@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: pglogical - logical replication contrib module (Steve Singer <steve@ssinger.info>) |
Список | pgsql-hackers |
On 2016-01-09 19:30, Steve Singer wrote:\ > I am going to send my comments/issues out in batches as I find them > instead of waiting till I look over everything. > Thanks for looking at this! Yes going in batches/steps makes sense, this is huge patch. > > I find this part of the documentation a bit unclear > > > +Once the provider node is setup, subscribers can be subscribed to it. > First the > +subscriber node must be created: > + > + SELECT pglogical.create_node( > + node_name := 'subscriber1', > + dsn := 'host=thishost port=5432 dbname=db' > + ); > + > > My initial reading was that I should execute this on the provider node. > Perhaps instead > ----------------- > Once the provider node is setup you can then create subscriber nodes. > Create the subscriber nodes and > then execute the following commands on each subscriber node > > create extension pglogical > > select pglogical.create_node(node_name:='subsriberX',dsn:='host=thishost > dbname=db port=5432'); > > ------------------- Makes sense I guess, this is probably relic of how this internally evolved (we used to have providers and subscribers before we merged them into nodes). > > Also the documentation for create_subscription talks about > > + - `synchronize_structure` - specifies if to synchronize structure from > + provider to the subscriber, default true > Not sure what's your comment on this. > > > I did the following > > test2=# select pglogical.create_subscription(subscription_name:='default > sub',provider_dsn:='host=localhost dbname=test1 port=5436'); > create_subscription > --------------------- > 247109879 > > > Which then resulted in the following showing up in my PG log > > LOG: worker process: pglogical apply 16542:247109879 (PID 4079) exited > with exit code 1 > ERROR: replication slot name "pgl_test2_test1_default sub" contains > invalid character > HINT: Replication slot names may only contain lower case letters, > numbers, and the underscore character. > FATAL: could not send replication command "CREATE_REPLICATION_SLOT > "pgl_test2_test1_default sub" LOGICAL pglogical_output": status > PGRES_FATAL_ERROR: ERROR: replication slot name > "pgl_test2_test1_default sub" contains invalid character > HINT: Replication slot names may only contain lower case letters, > numbers, and the underscore character. > > > The create_subscription command should check if the subscription name is > valid (meets the rules that will be applied against the slot command). > Yes, fixed. Also added some other sensitization code since we also use dbname in slot name and that can contain whatever. > I wondered how I could fix my mistake. > > The docs say > > +- `pglogical.pglogical_drop_subscription(subscription_name name, > ifexists bool)` > + Disconnects the subscription and removes it from the catalog. > + > > test2=# select pglogical.pglogical_drop_subscription('default sub', true); > ERROR: function pglogical.pglogical_drop_subscription(unknown, boolean) > does not exist > > > The command is actually called pglogical.drop_subscription the docs > should be fixed to show the actual command name > Yep, got this from other people as well, fixed. > > I then wanted to add a second table to my database. ('b'). > > select pglogical.replication_set_add_table('default','public.b',true); > replication_set_add_table > --------------------------- > t > (1 row) > > In my pglog I then got > > LOG: starting sync of table public.b for subscriber defaultsub > ERROR: replication slot name "pgl_test2_test1_defaultsub_public.b" > contains invalid character > HINT: Replication slot names may only contain lower case letters, > numbers, and the underscore character. > FATAL: could not send replication command "CREATE_REPLICATION_SLOT > "pgl_test2_test1_defaultsub_public.b" LOGICAL pglogical_output": status > PGRES_FATAL_ERROR: ERROR: replication slot name > "pgl_test2_test1_defaultsub_public.b" contains invalid character > HINT: Replication slot names may only contain lower case letters, > numbers, and the underscore character. > Right, needed the sensitization as well (I am actually using the hash now as there is only 8 chars left anyway). > > I then did > > test1=# select > pglogical.replication_set_remove_table('default','public.b'); > replication_set_remove_table > ------------------------------ > t > (1 row) > > > but my log still keep repeating the error, so I tried connecting to the > replica and did the same > > test2=# select > pglogical.replication_set_remove_table('default','public.b'); > ERROR: replication set mapping -303842815:16726 not found > > Is there any way to recover from this situation? > Not really, there is no api yet to remove table from synchronization process so you'd have to manually delete row from pglogical.local_sync_status on subscriber, kill the sync process and remove the slot. I will think about what would be good api to solve this. > The documenation says I can drop a replication set, maybe that will let > replication continue. > > +- `pglogical.delete_replication_set(set_name text)` > + Removes the replication set. > + > > select pglogical.delete_replication_set('default'); > ERROR: function pglogical.delete_replication_set(unknown) does not exist > LINE 1: select pglogical.delete_replication_set('default'); > ^ > HINT: No function matches the given name and argument types. You might > need to add explicit type casts. > > The function is actually pglogical.drop_replication_set , the docs > should be updated. > (note that didn't fix my problem either but then dropping the > subscription did seem to work). Yeah it doesn't as the problem is on subscriber, replication sets only affect provider. And fixed the docs. > > > > I then re-added the default set to the origin and resubscribed my replica > > > > test2=# select > pglogical.create_subscription(subscription_name:='defaultsub',provider_dsn:='host=localhost > dbname=test1 port=5436'); > create_subscription > --------------------- > 2974019075 > > > I then saw a bunch of > LOG: worker process: pglogical apply 16542:2974019075 (PID 26778) > exited with exit code 1 > ERROR: subscriber defaultsub initialization failed during > nonrecoverable step (s), please try the setup again > LOG: worker process: pglogical apply 16542:2974019075 (PID 26779) > exited with exit code 1 > > in the log but then those stopped and I see > > test2=# select pglogical.show_subscription_status(); > show_subscription_status > > -------------------------------------------------------------------------------- > > -------------------------------------------------- > (defaultsub,down,test1,"host=localhost dbname=test1 > port=5436",pgl_test2_test1_ > defaultsub,"{default,default_insert_only}",{all}) > (1 row) > > > I'm not really sure what to do to 'recover' my cluster at this point so > I'll send this off and rebuild my cluster and start over. > I think the problem here is that you resubscribed with syncrhonize_structure := true while the conflicting structure already existed, that option only works correctly when there is no conflicting structure (we don't try to make diffs or anything, just dump/restore). Recovering should be drop the uninitialized subscription and create new one where you don't synchronize structure. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления:
Следующее
От: Tom LaneДата:
Сообщение: Re: [BUGS] about test_parser installation failure problem(PostgreSQL in 9.5.0)?