Hi Amit,
Sorry for the delayed response.
On Fri, Jan 26, 2018 at 11:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jan 24, 2018 at 12:44 PM, amul sul <sulamul@gmail.com> wrote:
>> On Tue, Jan 23, 2018 at 7:01 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Fri, Jan 12, 2018 at 11:43 AM, amul sul <sulamul@gmail.com> wrote:
[....]
> I think you can manually (via debugger) hit this by using
> PUBLICATION/SUBSCRIPTION syntax for logical replication. I think what
> you need to do is in node-1, create a partitioned table and subscribe
> it on node-2. Now, perform an Update on node-1, then stop the logical
> replication worker before it calls heap_lock_tuple. Now, in node-2,
> update the same row such that it moves the row. Now, continue the
> logical replication worker. I think it should hit your new code, if
> not then we need to think of some other way.
>
I am able to hit the change log using above steps. Thanks a lot for the
step by step guide, I really needed that.
One strange behavior I found in the logical replication which is reproducible
without attached patch as well -- when I have updated on node2 by keeping
breakpoint before the heap_lock_tuple call in replication worker, I can see
a duplicate row was inserted on the node2, see this:
== NODE 1 ==
postgres=# insert into foo values(1, 'initial insert');
INSERT 0 1
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+----------------
foo1 | 1 | initial insert
(1 row)
=== NODE 2 ==
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+----------------
foo1 | 1 | initial insert
(1 row)
== NODE 1 ==
postgres=# update foo set a=2, b='node1_update' where a=1;
UPDATE 1
<---- BREAK POINT BEFORE heap_lock_tuple IN replication worker --->
== NODE 2 ==
postgres=# update foo set a=2, b='node2_update' where a=1;
<---- RELEASE BREAK POINT --->
postgres=# 2018-02-02 12:35:45.050 IST [91786] LOG: tuple to be
locked was already moved to another partition due to concurrent
update, retrying
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+--------------
foo2 | 2 | node2_update
foo2 | 2 | node1_update
(2 rows)
== NODE 1 ==
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+--------------
foo2 | 2 | node1_update
(1 row)
I am thinking to report this in a separate thread, but not sure if
this is already known behaviour or not.
== schema to reproduce above case ==
-- node1
create table foo (a int2, b text) partition by list (a);
create table foo1 partition of foo for values IN (1);
create table foo2 partition of foo for values IN (2);
insert into foo values(1, 'initial insert');
CREATE PUBLICATION update_row_mov_pub FOR ALL TABLES;
ALTER TABLE foo REPLICA IDENTITY FULL;
ALTER TABLE foo1 REPLICA IDENTITY FULL;
ALTER TABLE foo2 REPLICA IDENTITY FULL;
-- node2
create table foo (a int2, b text) partition by list (a);
create table foo1 partition of foo for values IN (1);
create table foo2 partition of foo for values IN (2);
CREATE SUBSCRIPTION update_row_mov_sub CONNECTION 'host=localhost
dbname=postgres' PUBLICATION update_row_mov_pub;
== END==
Updated patch attached -- correct changes in execReplication.c.
Regards,
Amul Sul