Re: Logical Replication WIP

From Petr Jelinek
Subject Re: Logical Replication WIP
Date
Msg-id e48834c5-1db9-b381-3b7e-8f1ecb04dddd@2ndquadrant.com
In response to Re: Logical Replication WIP  (Steve Singer <steve@ssinger.info>)
List pgsql-hackers
On 05/09/16 23:35, Steve Singer wrote:
> On 09/05/2016 03:58 PM, Steve Singer wrote:
>> On 08/31/2016 04:51 PM, Petr Jelinek wrote:
>>> Hi,
>>>
>>> and one more version with bug fixes, improved code docs and couple
>>> more tests, some general cleanup and also rebased on current master
>>> for the start of CF.
>>>
>>>
>>>
>>
>
> A few more things I noticed when playing with the patches
>
> 1, Creating a subscription to yourself ends pretty badly,
> the 'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill
> it.  The background process seems to be waiting for a transaction to
> commit (I assume the create subscription command).  I had to kill -9 the
> various processes to get things to stop.  Getting confused about
> hostnames and ports is a common operator error.
>

Hmm, I guess there is a missing interrupts check; I will look. It would be
great to detect this properly, but I am not really sure how to do that, as
AFAIK there is no accurate way to detect that the connection is to yourself.

> 2. Failures during the initial subscription  aren't recoverable
>
> For example
>
> on db1
>   create table a(id serial4 primary key,b text);
>   insert into a(b) values ('1');
>   create publication testpub for table a;
>
> on db2
>   create table a(id serial4 primary key,b text);
>   insert into a(b) values ('1');
>   create subscription testsub connection 'host=localhost port=5440
> dbname=test' publication testpub;
>
> I then get in my db2 log
>
> ERROR:  duplicate key value violates unique constraint "a_pkey"
> DETAIL:  Key (id)=(1) already exists.
> LOG:  worker process: logical replication worker 16396 sync 16387 (PID
> 10583) exited with exit code 1
> LOG:  logical replication sync for subscription testsub, table a started
> ERROR:  could not crate replication slot "testsub_sync_a": ERROR:
> replication slot "testsub_sync_a" already exists
>
>
> LOG:  worker process: logical replication worker 16396 sync 16387 (PID
> 10585) exited with exit code 1
> LOG:  logical replication sync for subscription testsub, table a started
> ERROR:  could not crate replication slot "testsub_sync_a": ERROR:
> replication slot "testsub_sync_a" already exists
>
>
> and it keeps looping.
> If I then truncate "a" on db2 it doesn't help. (I'd expect at that point
> the initial subscription to work)

Hmm, looks like the error case does not clean up after itself correctly.

>
> If I then do on db2
>  drop subscription testsub cascade;
>
> I still see a slot in use on db1
>
> select * FROM pg_replication_slots ;
>    slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
> ----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
>  testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40
>

Same as above.
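
Until the cleanup is fixed, the leftover slot can probably be removed by hand on db1. This is only a sketch of a manual workaround, not something the patch does: it assumes the slot is inactive (as the "active = f" column in the output above suggests) and uses the existing pg_drop_replication_slot() function, which raises an error for a slot that is still in use.

```sql
-- Run on db1 (the publisher), in the "test" database where the slot was created.
-- The NOT active filter skips the slot if a worker is still attached to it,
-- so this does not try to drop a slot that is in use.
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name = 'testsub_sync_a'
  AND NOT active;

-- Check that the slot is gone.
SELECT slot_name, active FROM pg_replication_slots;
```

If the first query returns no rows, the slot was either already dropped or is still marked active.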

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


