Обсуждение: [HACKERS] Alter subscription..SET - NOTICE message is coming for table which isalready removed

Поиск
Список
Период
Сортировка
Hi,

While performing - Alter subscription..SET  , I found that NOTICE 
message is coming duplicate next time , which  is not needed anymore.

X cluster=
create 100 tables
create publication ( create publication pub for all tables;)

Y cluster=
create 100 tables
create subscription ( create subscription sub connection 
'dbname=postgres user=centos host=localhost ) publication pub;

X cluster =
create 1 more table (create table r(n int));
create publication for this above table only (create publication pub1 
for table r;)

Y cluster=
create table r - create table r(n int);
alter publication -alter subscription sub set publication pub1 refresh;

in the notice message -> table 'r' added / 100 tables removed from the 
subscription

postgres=# alter subscription sub set publication pub1 refresh;
NOTICE:  added subscription for table public.r
NOTICE:  removed subscription for table public.t1
NOTICE:  removed subscription for table public.t2
NOTICE:  removed subscription for table public.t3
NOTICE:  removed subscription for table public.t4
NOTICE:  removed subscription for table public.t5
NOTICE:  removed subscription for table public.t6
--
--
--
ALTER SUBSCRIPTION

now again fire the same sql query

postgres=# alter subscription sub set publication pub1 refresh;
NOTICE:  removed subscription for table public.t78
ALTER SUBSCRIPTION

This notice message should not  come  as t.78 is already removed from 
earlier same command.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




On Thu, May 25, 2017 at 6:08 AM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> Hi,
>
> While performing - Alter subscription..SET  , I found that NOTICE message is
> coming duplicate next time , which  is not needed anymore.
>
> X cluster=
> create 100 tables
> create publication ( create publication pub for all tables;)
>
> Y cluster=
> create 100 tables
> create subscription ( create subscription sub connection 'dbname=postgres
> user=centos host=localhost ) publication pub;
>
> X cluster =
> create 1 more table (create table r(n int));
> create publication for this above table only (create publication pub1 for
> table r;)
>
> Y cluster=
> create table r - create table r(n int);
> alter publication -alter subscription sub set publication pub1 refresh;
>
> in the notice message -> table 'r' added / 100 tables removed from the
> subscription

I think you did ALTER SUBSCRIPTION while table sync for 100 tables is
running, right?

>
> postgres=# alter subscription sub set publication pub1 refresh;
> NOTICE:  added subscription for table public.r
> NOTICE:  removed subscription for table public.t1
> NOTICE:  removed subscription for table public.t2
> NOTICE:  removed subscription for table public.t3
> NOTICE:  removed subscription for table public.t4
> NOTICE:  removed subscription for table public.t5
> NOTICE:  removed subscription for table public.t6
> --
> --
> --
> ALTER SUBSCRIPTION
>
> now again fire the same sql query
>
> postgres=# alter subscription sub set publication pub1 refresh;
> NOTICE:  removed subscription for table public.t78
> ALTER SUBSCRIPTION
>
> This notice message should not  come  as t.78 is already removed from
> earlier same command.
>

I'm investigating the cause of this issue but it seems to me that
first ALTER SUBSCRIPTION surely removed t78 record from
pg_subscription_rel but table sync worker inserts new record of t78
when updating its state. SetSubscriptionRelState inserts a new record
if the table doesn't exist otherwise updates the record but we can
divide it into two different functions and call appropriate function
in each place.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



On 05/25/2017 04:40 PM, Masahiko Sawada wrote:
> I think you did ALTER SUBSCRIPTION while table sync for 100 tables is
> running, right?
Yes, i didn't wait too much while executing the commands.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




On 05/25/2017 03:38 PM, tushar wrote:
> While performing - Alter subscription..SET  , I found that NOTICE 
> message is coming duplicate next time , which  is not needed anymore. 
There is an another example - where i am getting "ERROR: subscription 
table 16435 in subscription 16684 does not exist" in standby log file

2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for 
table public.t96
2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for 
table public.t97
2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for 
table public.t98
2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for 
table public.t99
2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for 
table public.t100
2017-05-25 13:51:48.827 BST [32138] LOG:  duration: 35.404 ms statement: 
alter subscription c1 set publication p1 refresh;
2017-05-25 13:51:49.192 BST [32347] LOG:  starting logical replication 
worker for subscription "c1"
2017-05-25 13:51:49.198 BST [32368] LOG:  logical replication sync for 
subscription c1, table t16 started
2017-05-25 13:51:49.198 BST [32368] ERROR:  subscription table 16429 in 
subscription 16684 does not exist
2017-05-25 13:51:49.199 BST [32347] LOG:  starting logical replication 
worker for subscription "c1"
2017-05-25 13:51:49.200 BST [32065] LOG:  worker process: logical 
replication worker for subscription 16684 sync 16429 (PID 32368) exited 
with exit code 1
2017-05-25 13:51:49.204 BST [32369] LOG:  logical replication sync for 
subscription c1, table t17 started
2017-05-25 13:51:49.204 BST [32369] ERROR:  subscription table 16432 in 
subscription 16684 does not exist
2017-05-25 13:51:49.205 BST [32347] LOG:  starting logical replication 
worker for subscription "c1"
2017-05-25 13:51:49.205 BST [32065] LOG:  worker process: logical 
replication worker for subscription 16684 sync 16432 (PID 32369) exited 
with exit code 1
2017-05-25 13:51:49.209 BST [32370] LOG:  logical replication sync for 
subscription c1, table t18 started
2017-05-25 13:51:49.209 BST [32370] ERROR:  subscription table 16435 in 
subscription 16684 does not exist
2017-05-25 13:51:49.210 BST [32347] LOG:  starting logical replication 
worker for subscription "c1"
2017-05-25 13:51:49.210 BST [32065] LOG:  worker process: logical 
replication worker for subscription 16684 sync 16435 (PID 32370) exited 
with exit code 1
2017-05-25 13:51:49.213 BST [32371] LOG:  logical replication sync for 
subscription c1, table t19 started
2017-05-25 13:51:49.213 BST [32371] ERROR:  subscription table 16438 in 
subscription 16684 does not exist
2017-05-25 13:51:49.214 BST [32347] LOG:  starting logical replication 
worker for subscription "c1"


Steps to reproduce -
X cluster ->
create 100 tables , publish all tables (create publication pub for table 
t1,t2,t2..........t100;)
create one more table (create table t101(n int), create publication , 
publish only that table (create publication p1 for table t101;)

Y Cluster ->
create subscription (create subscription c1 connection 'host=localhost  
port=5432 user=centos ' publication pub;
alter subscription c1 set publication p1 refresh;
alter subscription c1 set publication pub refresh;
alter subscription c1 set publication p1 refresh;

check the log file.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




On Thu, May 25, 2017 at 8:15 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> On 05/25/2017 04:40 PM, Masahiko Sawada wrote:
>>
>> I think you did ALTER SUBSCRIPTION while table sync for 100 tables is
>> running, right?
>
> Yes, i didn't wait too much while executing the commands.
>

I think it's better to add this to open items so that it doesn't get
missed. Barring any objections I'll add it.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



On Tue, May 30, 2017 at 10:12 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, May 25, 2017 at 8:15 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
>> On 05/25/2017 04:40 PM, Masahiko Sawada wrote:
>>>
>>> I think you did ALTER SUBSCRIPTION while table sync for 100 tables is
>>> running, right?
>>
>> Yes, i didn't wait too much while executing the commands.
>>
>
> I think it's better to add this to open items so that it doesn't get
> missed. Barring any objections I'll add it.
>

Added.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



On Thu, May 25, 2017 at 9:54 PM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> On 05/25/2017 03:38 PM, tushar wrote:
>>
>> While performing - Alter subscription..SET  , I found that NOTICE message
>> is coming duplicate next time , which  is not needed anymore.
>
> There is an another example - where i am getting "ERROR: subscription table
> 16435 in subscription 16684 does not exist" in standby log file
>
> 2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for table
> public.t96
> 2017-05-25 13:51:48.825 BST [32138] NOTICE:  removed subscription for table
> public.t97
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t98
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t99
> 2017-05-25 13:51:48.826 BST [32138] NOTICE:  removed subscription for table
> public.t100
> 2017-05-25 13:51:48.827 BST [32138] LOG:  duration: 35.404 ms statement:
> alter subscription c1 set publication p1 refresh;
> 2017-05-25 13:51:49.192 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.198 BST [32368] LOG:  logical replication sync for
> subscription c1, table t16 started
> 2017-05-25 13:51:49.198 BST [32368] ERROR:  subscription table 16429 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.199 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.200 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16429 (PID 32368) exited with
> exit code 1
> 2017-05-25 13:51:49.204 BST [32369] LOG:  logical replication sync for
> subscription c1, table t17 started
> 2017-05-25 13:51:49.204 BST [32369] ERROR:  subscription table 16432 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.205 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.205 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16432 (PID 32369) exited with
> exit code 1
> 2017-05-25 13:51:49.209 BST [32370] LOG:  logical replication sync for
> subscription c1, table t18 started
> 2017-05-25 13:51:49.209 BST [32370] ERROR:  subscription table 16435 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.210 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
> 2017-05-25 13:51:49.210 BST [32065] LOG:  worker process: logical
> replication worker for subscription 16684 sync 16435 (PID 32370) exited with
> exit code 1
> 2017-05-25 13:51:49.213 BST [32371] LOG:  logical replication sync for
> subscription c1, table t19 started
> 2017-05-25 13:51:49.213 BST [32371] ERROR:  subscription table 16438 in
> subscription 16684 does not exist
> 2017-05-25 13:51:49.214 BST [32347] LOG:  starting logical replication
> worker for subscription "c1"
>
>
> Steps to reproduce -
> X cluster ->
> create 100 tables , publish all tables (create publication pub for table
> t1,t2,t2..........t100;)
> create one more table (create table t101(n int), create publication ,
> publish only that table (create publication p1 for table t101;)
>
> Y Cluster ->
> create subscription (create subscription c1 connection 'host=localhost
> port=5432 user=centos ' publication pub;
> alter subscription c1 set publication p1 refresh;
> alter subscription c1 set publication pub refresh;
> alter subscription c1 set publication p1 refresh;
>
> check the log file.
>

Thanks.

I think this cause is that the relation status entry could be deleted
by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
starting. Attached patch fixes issues reported on this thread so far.

However there is one more problem here; if the relation status entry
is deleted while corresponding table sync worker is waiting to be
changed its status, the table sync worker can be orphaned in waiting
status. In this case, should table sync worker check the relation
status and exits if the relation status record gets removed? Or should
ALTER SUBSCRIPTION update status of table sync worker to UNKNOWN?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения
On 5/30/17 13:25, Masahiko Sawada wrote:
> I think this cause is that the relation status entry could be deleted
> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
> starting. Attached patch fixes issues reported on this thread so far.

This looks like a reasonable change, but it conflicts with the change
discussed on "logical replication - still unstable after all these
months".  I think I'll deal with that one first.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Wed, May 31, 2017 at 10:15:50PM -0400, Peter Eisentraut wrote:
> On 5/30/17 13:25, Masahiko Sawada wrote:
> > I think this cause is that the relation status entry could be deleted
> > by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
> > starting. Attached patch fixes issues reported on this thread so far.
> 
> This looks like a reasonable change, but it conflicts with the change
> discussed on "logical replication - still unstable after all these
> months".  I think I'll deal with that one first.

The above-described topic is currently a PostgreSQL 10 open item.  You own
this open item.  Please observe the policy on open item ownership and send a
status update within three calendar days of this message.  Include a date for
your subsequent status update.



On 6/3/17 01:04, Noah Misch wrote:
> On Wed, May 31, 2017 at 10:15:50PM -0400, Peter Eisentraut wrote:
>> On 5/30/17 13:25, Masahiko Sawada wrote:
>>> I think this cause is that the relation status entry could be deleted
>>> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
>>> starting. Attached patch fixes issues reported on this thread so far.
>>
>> This looks like a reasonable change, but it conflicts with the change
>> discussed on "logical replication - still unstable after all these
>> months".  I think I'll deal with that one first.
> 
> The above-described topic is currently a PostgreSQL 10 open item.  You own
> this open item.  Please observe the policy on open item ownership and send a
> status update within three calendar days of this message.  Include a date for
> your subsequent status update.

I'm working on this now and will report back tomorrow.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On 5/30/17 13:25, Masahiko Sawada wrote:
> I think this cause is that the relation status entry could be deleted
> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
> starting. Attached patch fixes issues reported on this thread so far.

I have committed the part of the patch that changes the
SetSubscriptionRelState() calls.

I think there was a mistake in your patch, in that the calls in
LogicalRepSyncTableStart() used true once and false once.  I think all
the calls in tablesync.c should be the same.

(If you look at the patch again, notice that I have changed the
insert_ok argument to update_only, so true and false are flipped.)

I'm not convinced about the change to the GetSubscriptionRelState()
argument.  In the examples given, no tables are removed from any
publications, so I don't see how the claimed situation can happen.  I
would like to see more reproducible examples.

Right now, if the subscription rel state disappears before the sync
worker starts, the error kills the sync worker, so things should
continue working correctly.  Perhaps the error message isn't the best.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Thu, Jun 8, 2017 at 5:36 AM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 5/30/17 13:25, Masahiko Sawada wrote:
>> I think this cause is that the relation status entry could be deleted
>> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
>> starting. Attached patch fixes issues reported on this thread so far.
>
> I have committed the part of the patch that changes the
> SetSubscriptionRelState() calls.
>

Thank you!

> I think there was a mistake in your patch, in that the calls in
> LogicalRepSyncTableStart() used true once and false once.  I think all
> the calls in tablesync.c should be the same.

Yes, you're right.

> (If you look at the patch again, notice that I have changed the
> insert_ok argument to update_only, so true and false are flipped.)

Okay.

> I'm not convinced about the change to the GetSubscriptionRelState()
> argument.  In the examples given, no tables are removed from any
> publications, so I don't see how the claimed situation can happen.  I
> would like to see more reproducible examples.

In process_syncing_tables_for_apply(), apply worker gets the list of
all non-ready tables and tries to launch table sync workers
accordingly. But after got the list but before launch workers these
tables can be removed from publication, so launched table sync worker
cannot found its rel state from pg_subscription_rel. It completely
depends on timing and it happens rarely.

The reproduction step is provided by tushar but I could reproduced it
with following step.

X cluster ->
=# select 'create table t' || generate_series(1,100) || '(c
int);';\gexec -- create 100 tables
=# create table t (c int);  -- create one more table
=# create publication all_pub for all tables;
=# create publication one_pub for table t;

Y Cluster ->
(create the same 101 tables as well)
=# create subscription hoge_sub connection 'host=localhost  port=5432'
publication one_pub;
=# alter subscription hoge_sub set publication all_pub; select
pg_sleep(1); alter subscription hoge_sub set publication one_pub;
*Error occurs here*

> Right now, if the subscription rel state disappears before the sync
> worker starts, the error kills the sync worker, so things should
> continue working correctly.  Perhaps the error message isn't the best.
>

The change to GetSubscriptionRelState in that patch solves the error
message problem you mentioned. Returning SUBREL_STATE_UNKNOWN by
GetSubscriptionRelState means that the subscription rel state could
not found at the time. So we can emit the error with appropriate
message.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



On Tue, Jun 06, 2017 at 03:21:34PM -0400, Peter Eisentraut wrote:
> On 6/3/17 01:04, Noah Misch wrote:
> > On Wed, May 31, 2017 at 10:15:50PM -0400, Peter Eisentraut wrote:
> >> On 5/30/17 13:25, Masahiko Sawada wrote:
> >>> I think this cause is that the relation status entry could be deleted
> >>> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
> >>> starting. Attached patch fixes issues reported on this thread so far.
> >>
> >> This looks like a reasonable change, but it conflicts with the change
> >> discussed on "logical replication - still unstable after all these
> >> months".  I think I'll deal with that one first.
> > 
> > The above-described topic is currently a PostgreSQL 10 open item.  You own
> > this open item.  Please observe the policy on open item ownership and send a
> > status update within three calendar days of this message.  Include a date for
> > your subsequent status update.
> 
> I'm working on this now and will report back tomorrow.

IMMEDIATE ATTENTION REQUIRED.  This PostgreSQL 10 open item is past due for
your status update.  Please reacquaint yourself with the policy on open item
ownership[1] and then reply immediately.  If I do not hear from you by
2017-06-10 07:00 UTC, I will transfer this item to release management team
ownership without further notice.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



On 6/8/17 03:54, Masahiko Sawada wrote:
> The reproduction step is provided by tushar but I could reproduced it
> with following step.
> 
> X cluster ->
> =# select 'create table t' || generate_series(1,100) || '(c
> int);';\gexec -- create 100 tables
> =# create table t (c int);  -- create one more table
> =# create publication all_pub for all tables;
> =# create publication one_pub for table t;
> 
> Y Cluster ->
> (create the same 101 tables as well)
> =# create subscription hoge_sub connection 'host=localhost  port=5432'
> publication one_pub;
> =# alter subscription hoge_sub set publication all_pub; select
> pg_sleep(1); alter subscription hoge_sub set publication one_pub;
> *Error occurs here*

Thanks for that explanation.  I have committed the rest of your changes.I didn't add the log message, just exit
silently,since it doesn't seem
 
necessary.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On 5/30/17 13:25, Masahiko Sawada wrote:
> However there is one more problem here; if the relation status entry
> is deleted while corresponding table sync worker is waiting to be
> changed its status, the table sync worker can be orphaned in waiting
> status. In this case, should table sync worker check the relation
> status and exits if the relation status record gets removed? Or should
> ALTER SUBSCRIPTION update status of table sync worker to UNKNOWN?

I think this would be fixed with the attached patch.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения
On 6/9/17 02:07, Noah Misch wrote:
> On Tue, Jun 06, 2017 at 03:21:34PM -0400, Peter Eisentraut wrote:
>> On 6/3/17 01:04, Noah Misch wrote:
>>> On Wed, May 31, 2017 at 10:15:50PM -0400, Peter Eisentraut wrote:
>>>> On 5/30/17 13:25, Masahiko Sawada wrote:
>>>>> I think this cause is that the relation status entry could be deleted
>>>>> by ALTER SUBSCRIPTION REFRESH before corresponding table sync worker
>>>>> starting. Attached patch fixes issues reported on this thread so far.
>>>>
>>>> This looks like a reasonable change, but it conflicts with the change
>>>> discussed on "logical replication - still unstable after all these
>>>> months".  I think I'll deal with that one first.
>>>
>>> The above-described topic is currently a PostgreSQL 10 open item.  You own
>>> this open item.  Please observe the policy on open item ownership and send a
>>> status update within three calendar days of this message.  Include a date for
>>> your subsequent status update.
>>
>> I'm working on this now and will report back tomorrow.
> 
> IMMEDIATE ATTENTION REQUIRED.

Work is ongoing and progress is being made.  I will check back in on Monday.


-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, Jun 9, 2017 at 10:50 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 5/30/17 13:25, Masahiko Sawada wrote:
>> However there is one more problem here; if the relation status entry
>> is deleted while corresponding table sync worker is waiting to be
>> changed its status, the table sync worker can be orphaned in waiting
>> status. In this case, should table sync worker check the relation
>> status and exits if the relation status record gets removed? Or should
>> ALTER SUBSCRIPTION update status of table sync worker to UNKNOWN?
>
> I think this would be fixed with the attached patch.
>

Thank you for the patch. The patch fixes this issue but it takes a
long time to done ALTER SUBSCRIPTION SET PUBLICATION when
max_sync_workers_per_subscription is set high value. Because the
removing entry from pg_subscription_rel and launching a new table sync
worker select a subscription relation state in the same order, the
former doesn't catch up with latter.
For example in my environment, when I test the following step with
max_sync_workers_per_subscription = 15, all table sync workers were
launched once and then killed. How about removing the entry from
pg_subscription_rel in the inverse order?

X cluster ->
=# select 'create table t' || generate_series(1,100) || '(c
int);';\gexec -- create 100 tables
=# create table t (c int);  -- create one more table
=# create publication all_pub for all tables;
=# create publication one_pub for table t;

Y Cluster ->
(create the same 101 tables as well)
=# create subscription hoge_sub connection 'host=localhost  port=5432'
publication all_pub;
=# alter subscription hoge_sub set publication one_pub;  -- execute
this during synchronizing the table, it takes a long time

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



On 6/11/17 21:40, Masahiko Sawada wrote:
> Thank you for the patch. The patch fixes this issue but it takes a
> long time to done ALTER SUBSCRIPTION SET PUBLICATION when
> max_sync_workers_per_subscription is set high value. Because the
> removing entry from pg_subscription_rel and launching a new table sync
> worker select a subscription relation state in the same order, the
> former doesn't catch up with latter.
> For example in my environment, when I test the following step with
> max_sync_workers_per_subscription = 15, all table sync workers were
> launched once and then killed. How about removing the entry from
> pg_subscription_rel in the inverse order?

I have committed the patch as is.  Optimizations might be possible, but
let's keep in mind that the use case of changing the subscription right
after it was created is a pretty marginal case to begin with.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Mon, Jun 12, 2017 at 9:56 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 6/11/17 21:40, Masahiko Sawada wrote:
>> Thank you for the patch. The patch fixes this issue but it takes a
>> long time to done ALTER SUBSCRIPTION SET PUBLICATION when
>> max_sync_workers_per_subscription is set high value. Because the
>> removing entry from pg_subscription_rel and launching a new table sync
>> worker select a subscription relation state in the same order, the
>> former doesn't catch up with latter.
>> For example in my environment, when I test the following step with
>> max_sync_workers_per_subscription = 15, all table sync workers were
>> launched once and then killed. How about removing the entry from
>> pg_subscription_rel in the inverse order?
>
> I have committed the patch as is.  Optimizations might be possible, but
> let's keep in mind that the use case of changing the subscription right
> after it was created is a pretty marginal case to begin with.
>

Thank you for committing the patch. Yes, I understood.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center