Обсуждение: AtEOXact_ApplyLauncher() and subtransactions

Поиск
Список
Период
Сортировка

AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
Hi,

When a SUBSCRIPTION is altered, then the currently running
table-synchronization workers that are no longer needed for the
altered subscription, are terminated. This is done by the function
AtEOXact_ApplyLauncher() inside CommitTransaction(). So during each
ALTER-SUBSCRIPTION command, the on_commit_stop_workers list is
appended with new set of workers to be stopped. And then at commit,
AtEOXact_ApplyLauncher() stops the workers in the list.

But there is no handling for sub-transaction abort. Consider this :

-- On secondary, create a subscription that will initiate the table sync
CREATE SUBSCRIPTION mysub CONNECTION
'dbname=postgres host=localhost user=rep password=Password port=5432'
PUBLICATION mypub;

-- While the sync is going on, change the publication of the
-- subscription in a subtransaction, so that it adds up the synchronization
-- workers for the earlier publication into the 'on_commit_stop_workers' list
select pg_sleep(1);
begin;
savepoint a;
ALTER SUBSCRIPTION mysub SET PUBLICATION mypub_2;
rollback to a;
-- Commit will stop the above sync.
commit;

So the ALTER-SUBSCRIPTION has been rolled back. But the
on_commit_stop_workers is not reverted back to its earlier state. And
then at commit, the workers will be killed unnecessarily. I have
observed that when the workers are killed, they are restarted. But
consider the case where the original subscription was synchronizing
hundreds of tables. And just when it is about to finish, someone
changes the subscription inside a subtransaction and rolls it back,
and then commits the transaction. The whole synchronization gets
re-started from scratch.

Below log snippets show this behaviour :

< CREATE-SUBSCRIPTION command spawns workers >
2018-06-05 15:04:07.841 IST [39951] LOG:  logical replication apply
worker for subscription "mysub" has started
2018-06-05 15:04:07.848 IST [39953] LOG:  logical replication table
synchronization worker for subscription "mysub", table "article" has
started

< After some time, ALTER-SUBSCRIPTION rollbacks inside subtransaction,
and commits transaction. Workers are killed >
2018-06-05 15:04:32.903 IST [39953] FATAL:  terminating logical
replication worker due to administrator command
2018-06-05 15:04:32.903 IST [39953] CONTEXT:  COPY article, line 37116293
2018-06-05 15:04:32.904 IST [39666] LOG:  background worker "logical
replication worker" (PID 39953) exited with exit code 1

< Workers are restarted >
2018-06-05 15:04:32.909 IST [40003] LOG:  logical replication table
synchronization worker for subscription "mysub", table "article" has
started
< Synchronization done after some time >
2018-06-05 15:05:10.042 IST [40003] LOG:  logical replication table
synchronization worker for subscription "mysub", table "article" has
finished

-----------

To reproduce the issue :
1. On master server, create the tables and publications using attached
setup_master.sql.
2. On secondary server, run the attached run_on_secondary.sql. This
will reproduce the issue as can be seen from the log output.

-----------

Fix :

I haven't written a patch for it, but I think we should have a
separate on_commit_stop_workers for each subtransaction. At
subtransaction commit, we replace the on_commit_stop_workers list of
the parent subtransaction with the one from the committed
subtransaction; and on abort, discard the list of the current
subtransaction. So have a stack of the lists.

-----------

Furthermore, before fixing this, we may have to fix another issue
which, I suspect, would surface even without subtransactions, as
explained below :

Suppose publication pubx is set for tables tx1 and and tx2.
And publication puby is for tables ty1 and ty2.

Subscription mysub is set to synchronise tables tx1 and tx2 :
CREATE SUBSCRIPTION mysub ... PUBLICATION pubx;

Now suppose the subscription is altered to synchronise ty1 and ty2,
and then again altered back to synchronise tx1 and tx2 in the same
transaction.
begin;
ALTER SUBSCRIPTION mysub set publication puby;
ALTER SUBSCRIPTION mysub set publication pubx;
commit;

Here, workers for tx1 and tx2 are added to on_commit_stop_workers
after the publication is set to puby. And then workers for ty1 and ty2
are further added to that list after the 2nd ALTER command where
publication is set to pubx, because the earlier ALTER command has
already changed the catalogs to denote that ty1 and ty2 are being
synchronised. Effectively, at commit, all the workers are targetted to
be stopped, when actually at commit time there is no final change in
the tables to be synchronised. What actually we should do is : for
each ALTER command we should compare the very first set of tables
being synchronised since the last committed ALTER command, rather than
checking the catalogs.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Alvaro Herrera
Дата:
On 2018-Jun-05, Amit Khandekar wrote:

> When a SUBSCRIPTION is altered, then the currently running
> table-synchronization workers that are no longer needed for the
> altered subscription, are terminated. This is done by the function
> AtEOXact_ApplyLauncher() inside CommitTransaction(). So during each
> ALTER-SUBSCRIPTION command, the on_commit_stop_workers list is
> appended with new set of workers to be stopped. And then at commit,
> AtEOXact_ApplyLauncher() stops the workers in the list.
> 
> But there is no handling for sub-transaction abort.

Peter, any comments here?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: AtEOXact_ApplyLauncher() and subtransactions

От
Peter Eisentraut
Дата:
On 6/5/18 07:02, Amit Khandekar wrote:
> I haven't written a patch for it, but I think we should have a
> separate on_commit_stop_workers for eachyou get subtransaction. At
> subtransaction commit, we replace the on_commit_stop_workers list of
> the parent subtransaction with the one from the committed
> subtransaction; and on abort, discard the list of the current
> subtransaction. So have a stack of the lists.

Is there maybe a more general mechanism we could attach this to?  Maybe
resource owners?

> Subscription mysub is set to synchronise tables tx1 and tx2 :
> CREATE SUBSCRIPTION mysub ... PUBLICATION pubx;
> 
> Now suppose the subscription is altered to synchronise ty1 and ty2,
> and then again altered back to synchronise tx1 and tx2 in the same
> transaction.
> begin;
> ALTER SUBSCRIPTION mysub set publication puby;
> ALTER SUBSCRIPTION mysub set publication pubx;
> commit;
> 
> Here, workers for tx1 and tx2 are added to on_commit_stop_workers
> after the publication is set to puby. And then workers for ty1 and ty2
> are further added to that list after the 2nd ALTER command where
> publication is set to pubx, because the earlier ALTER command has
> already changed the catalogs to denote that ty1 and ty2 are being
> synchronised. Effectively, at commit, all the workers are targetted to
> be stopped, when actually at commit time there is no final change in
> the tables to be synchronised.

I'm not so bothered about this scenario.  When you drop and then
recreate a table in the same transaction, that doesn't mean you keep the
data that was previously in the table.  If you want to *undo* a change,
you need to do rollback, not commit further changes that try to recreate
the previous state.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On 15 June 2018 at 09:46, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 6/5/18 07:02, Amit Khandekar wrote:
>> I haven't written a patch for it, but I think we should have a
>> separate on_commit_stop_workers for eachyou get subtransaction. At
>> subtransaction commit, we replace the on_commit_stop_workers list of
>> the parent subtransaction with the one from the committed
>> subtransaction; and on abort, discard the list of the current
>> subtransaction. So have a stack of the lists.
>
> Is there maybe a more general mechanism we could attach this to?  Maybe
> resource owners?

You mean using something like ResourceOwnerRelease() ? We need to
merge the on_commit_stop_workers list into the parent transaction's
list. So we can't release the whole list on commit.

The way I am implementing this can be seen in attached
apply_launcher_subtrans_WIP.patch. (check launcher.c changes). I
haven't started testing it yet though.
on_commit_stop_workers denotes a list of subscriptions, and each
element also contains a list of relations for that subscription.
This list is built by ALTER-SUBSCRIPTION commands that have run in the
current (sub)transaction.

At sub-transaction start, the on_commit_stop_workers is pushed into a
stack. Then on_commit_stop_workers starts with empty list. This list
is then populated by ALTER-SUBCSRIPTION commands of the current
sub-transaction.

At sub-transaction commit, this list is merged into the list of parent
subtransaction, the parent transaction stack element is popped out,
and the merged list becomes the new on_commit_stop_workers.
So say, parent has sub1(tab1, tab2), sub2(tab2, tab3), where sub means
subscription.
and current on_commit_workers has sub2(tab4) and sub3(tab1)
At commit, for subscription sub2, we should replace the outer
sub2(tab2, tab3) with the newer sub2(tab4).
So, the merged list will have :
sub1(tab1, tab2), sub2(tab4), sub3(tab1)

At sub-transaction abort, the on_commit_stop_workers is discarded. The
parent transaction worker list is assigned back to
on_commit_stop_workers.


>
>> Subscription mysub is set to synchronise tables tx1 and tx2 :
>> CREATE SUBSCRIPTION mysub ... PUBLICATION pubx;
>>
>> Now suppose the subscription is altered to synchronise ty1 and ty2,
>> and then again altered back to synchronise tx1 and tx2 in the same
>> transaction.
>> begin;
>> ALTER SUBSCRIPTION mysub set publication puby;
>> ALTER SUBSCRIPTION mysub set publication pubx;
>> commit;
>>
>> Here, workers for tx1 and tx2 are added to on_commit_stop_workers
>> after the publication is set to puby. And then workers for ty1 and ty2
>> are further added to that list after the 2nd ALTER command where
>> publication is set to pubx, because the earlier ALTER command has
>> already changed the catalogs to denote that ty1 and ty2 are being
>> synchronised. Effectively, at commit, all the workers are targetted to
>> be stopped, when actually at commit time there is no final change in
>> the tables to be synchronised.
>
> I'm not so bothered about this scenario.  When you drop and then
> recreate a table in the same transaction, that doesn't mean you keep the
> data that was previously in the table.  If you want to *undo* a change,
> you need to do rollback, not commit further changes that try to recreate
> the previous state.

May be the example I gave is not practical. But a user can as well do
something like :

CREATE SUBSCRIPTION mysub ... PUBLICATION pub1;
...
begin;
ALTER SUBSCRIPTION mysub set publication pub2;
....
ALTER SUBSCRIPTION mysub set publication pub3;
commit;
So pub3 is yet another publication. So here the old one is not set back again.

Or may there can be ALTER SUBSCRIPTION REFRESH.

I believe on_commit_stop_workers was designed keeping in mind that all
actions are to be done only at commit; we should not immediately stop
workers. So it implies that the workers it determines in the end of
commit, also should be accurate. I have anyways included the changes
for this in the same attached patch (changes are in
subscriptioncmds.c)

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On 16 June 2018 at 00:03, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> The way I am implementing this can be seen in attached
> apply_launcher_subtrans_WIP.patch. (check launcher.c changes). I
> haven't started testing it yet though.

Attached patch passes some basic testing I did. Will do some more
testing, and some self-review and code organising, if required. I will
also split the patch into two : one containing the main issue
regarding subtransaction, and the other containing the other issue I
mentioned earlier that shows up without subtransaction as well.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On 18 June 2018 at 15:02, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 16 June 2018 at 00:03, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>> The way I am implementing this can be seen in attached
>> apply_launcher_subtrans_WIP.patch. (check launcher.c changes). I
>> haven't started testing it yet though.
>
> Attached patch passes some basic testing I did. Will do some more
> testing, and some self-review and code organising, if required.

Done. Attached is v2 version of the patch. Comments welcome.

Changed GetSubscriptionRelations() to GetSubscriptionRelids(), which
now returns only the oids, not the subrel states. This was convenient
for storing the exact returned list into the committed subscription
rels. And anyways the subrel states were not used anywhere.

> I will also split the patch into two : one containing the main issue
> regarding subtransaction, and the other containing the other issue I
> mentioned earlier that shows up without subtransaction as well.

Did not split the patch. The changes for the other issue that shows up
without subtransaction are all in subscriptioncmds.c , and that file
contains mostly the changes for this issue. So kept it as a single
patch. But if it gets inconvenient for someone while reviewing, I will
be happy to split it.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
Added this into the July 2018 commitfest :

https://commitfest.postgresql.org/18/1696/

On 20 June 2018 at 22:22, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 18 June 2018 at 15:02, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>> On 16 June 2018 at 00:03, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>>> The way I am implementing this can be seen in attached
>>> apply_launcher_subtrans_WIP.patch. (check launcher.c changes). I
>>> haven't started testing it yet though.
>>
>> Attached patch passes some basic testing I did. Will do some more
>> testing, and some self-review and code organising, if required.
>
> Done. Attached is v2 version of the patch. Comments welcome.
>
> Changed GetSubscriptionRelations() to GetSubscriptionRelids(), which
> now returns only the oids, not the subrel states. This was convenient
> for storing the exact returned list into the committed subscription
> rels. And anyways the subrel states were not used anywhere.
>
>> I will also split the patch into two : one containing the main issue
>> regarding subtransaction, and the other containing the other issue I
>> mentioned earlier that shows up without subtransaction as well.
>
> Did not split the patch. The changes for the other issue that shows up
> without subtransaction are all in subscriptioncmds.c , and that file
> contains mostly the changes for this issue. So kept it as a single
> patch. But if it gets inconvenient for someone while reviewing, I will
> be happy to split it.
>
> --
> Thanks,
> -Amit Khandekar
> EnterpriseDB Corporation
> The Postgres Database Company



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


Re: AtEOXact_ApplyLauncher() and subtransactions

От
Robert Haas
Дата:
On Tue, Jun 26, 2018 at 6:25 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> Added this into the July 2018 commitfest :
>
> https://commitfest.postgresql.org/18/1696/

It seems to me that it would probably be better to separate this into
two patches, because I think there are really two separate issues.
With regard to the lack of proper subtransaction handling, I think it
would be best if we could avoid introducing a new AtSubStart function.
I wrote a patch for this issue that works that uses a slightly
different kind of stack than yours, which I have attached to this
email, and it creates stack levels lazily so that it doesn't need an
AtSubStart function. It would probably also be possible to adapt your
patch to create new stack levels on demand instead of at
subtransaction start.  I'm not sure which approach is better, but I do
think it would be best not to use your patch as you have it now,
because that does unnecessary work at the beginning and end of every
subtransaction if there is an ALTER SUBSCRIPTION command pending at an
outer level, even though the subtransaction may never touch the
subscription state.

As for the other part of your fix, which I think basically boils down
to comparing the final states instead of just looking at what got
changed, the logic looks complicated and I don't think I fully
understand it, but here are a few comments.

+       subrelids = GetSubscriptionRelids(sub->oid,
+                                               committed_subrels ?
+                                               CurrentMemoryContext :
TopTransactionContext);

This looks ugly and dangerous.  In the first place, if
GetSubscriptionRelids() needs to work in one of several memory
contexts, the best thing would probably be for the caller to be
responsible for saving and restoring the memory context as needed,
rather than passing it as an argument.  Secondly, it's not very clear
why we need to do this.  The comment says we have to do it, but it
doesn't give a reason.

+                        * Merge the current list into the immediate parent.
+                        * So say, parent has sub1(tab1, tab2),
sub2(tab2, tab3),
+                        * and current on_commit_workers has
sub2(tab4) and sub3(tab1),
+                        * then the merged list will have :
+                        * sub1(tab1, tab2), sub2(tab4), sub3(tab1)

I don't think this is very clear.  Also, why is that the correct
answer?  Why not sub2(tab2, tab3, tab4)?

+                       foreach(lc, on_commit_stop_workers)
+                       {
+                               SubscriptionRels *subrels = lfirst(lc);
+                               ListCell *lc1;
+
+                               /* Search this subrel in the subrels
of the top of stack. */
+                               foreach(lc1,
subtrans_stop_workers->stop_workers)

This might be very expensive if both lists are long.  I guess that's
probably not very likely.  You would need to have modify a lot of
subscriptions and then, within a subtransaction, modify a lot of
subscriptions again.

+       foreach(lc, committed_subrels_list)
+       {
+               SubscriptionRels *subrels = (SubscriptionRels *) lfirst(lc);
+
+               if (sub->oid == subrels->subid)
+               {
+                       committed_subrels = subrels;
+                       break;
+               }
+       }

This looks to be O(n^2) in the number of subscriptions modified in a
single transaction.

Like Peter, I'm not entirely sure how much we should worry about this
problem.  However, if we're going to worry about it, I wonder if we
should worry harder and try to come up with a better data structure
than a bunch of linked lists.  Maybe a hash table indexed by subid?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On 4 July 2018 at 00:27, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 26, 2018 at 6:25 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>> Added this into the July 2018 commitfest :
>>
>> https://commitfest.postgresql.org/18/1696/
>
> It seems to me that it would probably be better to separate this into
> two patches, because I think there are really two separate issues.

Ok, will do that.

> With regard to the lack of proper subtransaction handling, I think it
> would be best if we could avoid introducing a new AtSubStart function.
> I wrote a patch for this issue that works that uses a slightly
> different kind of stack than yours, which I have attached to this
> email, and it creates stack levels lazily so that it doesn't need an
> AtSubStart function.

Yes, I agree that if we can avoid this function, that would be good.
Couldn't find a proper way to do this. Will have a look at your patch.

> As for the other part of your fix, which I think basically boils down
> to comparing the final states instead of just looking at what got
> changed, the logic looks complicated and I don't think I fully
> understand it.

Once I split the patch, let me try to add up some comments to make it clearer.

> but here are a few comments.
>
> +       subrelids = GetSubscriptionRelids(sub->oid,
> +                                               committed_subrels ?
> +                                               CurrentMemoryContext :
> TopTransactionContext);
>
> This looks ugly and dangerous.  In the first place, if
> GetSubscriptionRelids() needs to work in one of several memory
> contexts, the best thing would probably be for the caller to be
> responsible for saving and restoring the memory context as needed,
> rather than passing it as an argument.

I wanted to make the context switch only around the code which
allocates the list. I understand that all the other code in
GetSubscriptionRelids() such as heap_open() , systable_beginscan() etc
would not be affected if we run that in TopTransactionContext, because
that all gets freed in the cleanup functions at the bottom of the
function, but still I thought this way would be more robust for
future, in case there is some new code that allocates some temporary
memory which is not freed.

> Secondly, it's not very clear
> why we need to do this.  The comment says we have to do it, but it
> doesn't give a reason.

When committed_subrels is NULL, it means this is the first time we are
running the ALTER command since the main transaction started, and we
want to store the committed subrels list in the transaction context,
so that the subsequent ALTER commands compare with this list. May be
when I split the patch, I will come up with an example.

>
> +                        * Merge the current list into the immediate parent.
> +                        * So say, parent has sub1(tab1, tab2),
> sub2(tab2, tab3),
> +                        * and current on_commit_workers has
> sub2(tab4) and sub3(tab1),
> +                        * then the merged list will have :
> +                        * sub1(tab1, tab2), sub2(tab4), sub3(tab1)
>
> I don't think this is very clear.  Also, why is that the correct
> answer?  Why not sub2(tab2, tab3, tab4)?

Parent sub-transaction has sub2(tab2, tab3), meaning that when the
subtransaction sub2 commits (and when the COMMIT happens), we should
stop the worker for (subscription sub2, table tab2) , and the worker
for (subscription sub2, table tab3).

And the current sub-transaction has modified the subscription sub2
such that the workers to be stopped are sub2(tab4). Note that this the
final list of workers to be stopped, regardless of the earlier ALTER
SUBSCRIPTION commands having run in the upper subtransaction. So this
list should completely replace the parent's sub2(tab2, tab3). We
always compare the subscription tables with the ones that are
committed, not with the ones that were last altered in the same
transaction. Hence the other patch.

>
> +                       foreach(lc, on_commit_stop_workers)
> +                       {
> +                               SubscriptionRels *subrels = lfirst(lc);
> +                               ListCell *lc1;
> +
> +                               /* Search this subrel in the subrels
> of the top of stack. */
> +                               foreach(lc1,
> subtrans_stop_workers->stop_workers)
>
> This might be very expensive if both lists are long.  I guess that's
> probably not very likely.  You would need to have modify a lot of
> subscriptions and then, within a subtransaction, modify a lot of
> subscriptions again.

Yes that's what I also think.

>
> +       foreach(lc, committed_subrels_list)
> +       {
> +               SubscriptionRels *subrels = (SubscriptionRels *) lfirst(lc);
> +
> +               if (sub->oid == subrels->subid)
> +               {
> +                       committed_subrels = subrels;
> +                       break;
> +               }
> +       }
>
> This looks to be O(n^2) in the number of subscriptions modified in a
> single transaction.
>
> Like Peter, I'm not entirely sure how much we should worry about this
> problem.

At the upper sections, I have explained it a bit. Basically, without
this second patch, the main patch won't work if we have the same
subscription altered more than once in the sub-transaction stack.

> However, if we're going to worry about it, I wonder if we
> should worry harder and try to come up with a better data structure
> than a bunch of linked lists.  Maybe a hash table indexed by subid?

I had thought about using hash to search a subscription id, but then
thought in practice, there would not be too many subscriptions in the
same transaction. Hence continued to use the existing linked list data
structure; just that instead of a single linked list having
everything, I thought let's keep a list of tables belonging to the
same subscription in one linked list. The reason for this was that it
is easy to abandon or merge a subscription table list when a
sub-transaction rolls back or commits.


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On Thu, 5 Jul 2018 at 3:37 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>
> On 4 July 2018 at 00:27, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Tue, Jun 26, 2018 at 6:25 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> >> Added this into the July 2018 commitfest :
> >>
> >> https://commitfest.postgresql.org/18/1696/
> >
> > It seems to me that it would probably be better to separate this into
> > two patches, because I think there are really two separate issues.
>
> Ok, will do that.
>
> > With regard to the lack of proper subtransaction handling, I think it
> > would be best if we could avoid introducing a new AtSubStart function.
> > I wrote a patch for this issue that works that uses a slightly
> > different kind of stack than yours, which I have attached to this
> > email, and it creates stack levels lazily so that it doesn't need an
> > AtSubStart function.
>
> Yes, I agree that if we can avoid this function, that would be good.
> Couldn't find a proper way to do this. Will have a look at your patch.

I have split it into two patches.

0001 patch contains the main fix. In this patch I have used some
naming conventions and some comments that you used in your patch,
plus, I used your method of lazily allocating new stack level. The
stack is initially Null.

The fix for the other issue is in 0002 patch. Having separate rel oids
list for each subids is essential only for this issue. So the changes
for using this structure are in this patch, not the 0001 one. As you
suggested, I have kept the subids in hash table as against linked
list.


> > As for the other part of your fix, which I think basically boils down
> > to comparing the final states instead of just looking at what got
> > changed, the logic looks complicated and I don't think I fully
> > understand it.
>
> Once I split the patch, let me try to add up some comments to make it clearer.

When a subscription is altered for the *first* time in a transaction,
an entry is created for that sub, in committed_subrels_table hash
table. That entry represents a cached list of tables belonging to that
subscription since the last committed change. For each ALTER
SUBSCRIPTION command, if we create the stop workers by comparing with
this cached list, we have the final list of stop-workers if committed
in this state. So if there are two ALTER commands for the same
subscription, the second one replaces the earlier stop-worker list by
its own list.

I have added some more comments in the below snippet as shown. Hope
that this helps :

@@ -594,7 +619,16 @@ AlterSubscription_refresh(Subscription *sub, bool
copy_data)
{
RemoveSubscriptionRel(sub->oid, relid);

- logicalrep_worker_stop_at_commit(sub->oid, relid);
+ /*
+ * If we found an entry in committed_subrels for this subid, that
+ * means subrelids represents a modified version of the
+ * committed_subrels_entry->relids. If we didn't find an entry, it
+ * means this is the first time we are altering the sub, so they
+ * both have the same committed list; so in that case, we avoid
+ * another iteration below, and create the stop workers here itself.
+ */
+ if (!sub_found)
+     stop_relids = lappend_oid(stop_relids, relid);

  ereport(DEBUG1,
          (errmsg("table \"%s.%s\" removed from subscription \"%s\"",

Вложения

Re: AtEOXact_ApplyLauncher() and subtransactions

От
Robert Haas
Дата:
On Mon, Jul 16, 2018 at 2:36 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> 0001 patch contains the main fix. In this patch I have used some
> naming conventions and some comments that you used in your patch,
> plus, I used your method of lazily allocating new stack level. The
> stack is initially Null.

Committed and back-patched to v11 and v10.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: AtEOXact_ApplyLauncher() and subtransactions

От
Amit Khandekar
Дата:
On 17 July 2018 at 03:29, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jul 16, 2018 at 2:36 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>> 0001 patch contains the main fix. In this patch I have used some
>> naming conventions and some comments that you used in your patch,
>> plus, I used your method of lazily allocating new stack level. The
>> stack is initially Null.
>
> Committed and back-patched to v11 and v10.

Thanks Robert.