Обсуждение: bug in update tuple routing with foreign partitions

Поиск
Список
Период
Сортировка

bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Hi,

(added Fujita-san)

I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
to use for partition routing targets.  Specifically, the bug occurs when
UPDATE targets include a foreign partition that is locally modified (as
opposed to being modified directly on the remove server) and its
ResultRelInfo (called subplan result rel in the source code) is reused for
tuple routing:

-- setup
create extension postgres_fdw ;
create server loopback foreign data wrapper postgres_fdw;
create user mapping for current_user server loopback;
create table p (a int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2base (a int check (a = 2));
create foreign table p2 partition of p for values in (2) server loopback
options (table_name 'p2base');
insert into p values (1), (2);

-- see in the plan that foreign partition p2 is locally modified
explain verbose update p set a = 2 from (values (1), (2)) s(x) where a =
s.x returning *;
                                   QUERY PLAN

─────────────────────────────────────────────────────────────────────────────────
 Update on public.p  (cost=0.05..236.97 rows=50 width=42)
   Output: p1.a, "*VALUES*".column1
   Update on public.p1
   Foreign Update on public.p2
     Remote SQL: UPDATE public.p2base SET a = $2 WHERE ctid = $1 RETURNING a
   ->  Hash Join  (cost=0.05..45.37 rows=26 width=42)
         Output: 2, p1.ctid, "*VALUES*".*, "*VALUES*".column1
         Hash Cond: (p1.a = "*VALUES*".column1)
         ->  Seq Scan on public.p1  (cost=0.00..35.50 rows=2550 width=10)
               Output: p1.ctid, p1.a
         ->  Hash  (cost=0.03..0.03 rows=2 width=32)
               Output: "*VALUES*".*, "*VALUES*".column1
               ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2
width=32)
                     Output: "*VALUES*".*, "*VALUES*".column1
   ->  Hash Join  (cost=100.05..191.59 rows=24 width=42)
         Output: 2, p2.ctid, "*VALUES*".*, "*VALUES*".column1
         Hash Cond: (p2.a = "*VALUES*".column1)
         ->  Foreign Scan on public.p2  (cost=100.00..182.27 rows=2409
width=10)
               Output: p2.ctid, p2.a
               Remote SQL: SELECT a, ctid FROM public.p2base FOR UPDATE
         ->  Hash  (cost=0.03..0.03 rows=2 width=32)
               Output: "*VALUES*".*, "*VALUES*".column1
               ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2
width=32)
                     Output: "*VALUES*".*, "*VALUES*".column1


-- as opposed to directly on remote side (because there's no local join)
explain verbose update p set a = 2 returning *;
                                 QUERY PLAN
─────────────────────────────────────────────────────────────────────────────
 Update on public.p  (cost=0.00..227.40 rows=5280 width=10)
   Output: p1.a
   Update on public.p1
   Foreign Update on public.p2
   ->  Seq Scan on public.p1  (cost=0.00..35.50 rows=2550 width=10)
         Output: 2, p1.ctid
   ->  Foreign Update on public.p2  (cost=100.00..191.90 rows=2730 width=10)
         Remote SQL: UPDATE public.p2base SET a = 2 RETURNING a
(8 rows)

Running the first update query crashes:

update p set a = 2 from (values (1), (2)) s(x) where a = s.x returning
tableoid::regclass, *;
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The problem seems to occur because ExecSetupPartitionTupleRouting thinks
it can reuse p2's ResultRelInfo for tuple routing.  In this case, it can't
be used, because its ri_FdwState contains information that will be needed
for postgresExecForeignUpdate to work, but it's still used today.  Because
it's assigned to be used for tuple routing, its ri_FdwState will be
overwritten by postgresBeginForeignInsert that's invoked by the tuple
routing code using the aforementioned ResultRelInfo.  Crash occurs when
postgresExecForeignUpdate () can't find the information it's expecting in
the ri_FdwState.

The solution is to teach ExecSetupPartitionTupleRouting to avoid using a
subplan result rel if its ri_FdwState is already set.  I was wondering if
it should also check ri_usesFdwDirectModify is true, but in that case,
ri_FdwState is unused, so it's perhaps safe for tuple routing code to
scribble on it.

I have attached 2 patches, one for PG 11 where the problem first appeared
and another for HEAD.  The patch for PG 11 is significantly bigger due to
having to handle the complexities of mapping of subplan result rel indexes
to leaf partition indexes.  Most of that complexity is gone in HEAD due to
3f2393edefa5, so the patch for HEAD is much simpler.  I've also added a
test in postgres_fdw.sql to exercise this test case.

Thanks,
Amit

Вложения

Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
Hi Amit,

(2019/03/06 18:33), Amit Langote wrote:
> I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
> to use for partition routing targets.  Specifically, the bug occurs when
> UPDATE targets include a foreign partition that is locally modified (as
> opposed to being modified directly on the remove server) and its
> ResultRelInfo (called subplan result rel in the source code) is reused for
> tuple routing:

Will look into this.

Best regards,
Etsuro Fujita



Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/03/06 18:33), Amit Langote wrote:
> I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
> to use for partition routing targets.  Specifically, the bug occurs when
> UPDATE targets include a foreign partition that is locally modified (as
> opposed to being modified directly on the remove server) and its
> ResultRelInfo (called subplan result rel in the source code) is reused for
> tuple routing:
>
> -- setup
> create extension postgres_fdw ;
> create server loopback foreign data wrapper postgres_fdw;
> create user mapping for current_user server loopback;
> create table p (a int) partition by list (a);
> create table p1 partition of p for values in (1);
> create table p2base (a int check (a = 2));
> create foreign table p2 partition of p for values in (2) server loopback
> options (table_name 'p2base');
> insert into p values (1), (2);
>
> -- see in the plan that foreign partition p2 is locally modified
> explain verbose update p set a = 2 from (values (1), (2)) s(x) where a =
> s.x returning *;
>                                     QUERY PLAN
>
> ─────────────────────────────────────────────────────────────────────────────────
>   Update on public.p  (cost=0.05..236.97 rows=50 width=42)
>     Output: p1.a, "*VALUES*".column1
>     Update on public.p1
>     Foreign Update on public.p2
>       Remote SQL: UPDATE public.p2base SET a = $2 WHERE ctid = $1 RETURNING a
>     ->   Hash Join  (cost=0.05..45.37 rows=26 width=42)
>           Output: 2, p1.ctid, "*VALUES*".*, "*VALUES*".column1
>           Hash Cond: (p1.a = "*VALUES*".column1)
>           ->   Seq Scan on public.p1  (cost=0.00..35.50 rows=2550 width=10)
>                 Output: p1.ctid, p1.a
>           ->   Hash  (cost=0.03..0.03 rows=2 width=32)
>                 Output: "*VALUES*".*, "*VALUES*".column1
>                 ->   Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2
> width=32)
>                       Output: "*VALUES*".*, "*VALUES*".column1
>     ->   Hash Join  (cost=100.05..191.59 rows=24 width=42)
>           Output: 2, p2.ctid, "*VALUES*".*, "*VALUES*".column1
>           Hash Cond: (p2.a = "*VALUES*".column1)
>           ->   Foreign Scan on public.p2  (cost=100.00..182.27 rows=2409
> width=10)
>                 Output: p2.ctid, p2.a
>                 Remote SQL: SELECT a, ctid FROM public.p2base FOR UPDATE
>           ->   Hash  (cost=0.03..0.03 rows=2 width=32)
>                 Output: "*VALUES*".*, "*VALUES*".column1
>                 ->   Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2
> width=32)
>                       Output: "*VALUES*".*, "*VALUES*".column1
>
>
> -- as opposed to directly on remote side (because there's no local join)
> explain verbose update p set a = 2 returning *;
>                                   QUERY PLAN
> ─────────────────────────────────────────────────────────────────────────────
>   Update on public.p  (cost=0.00..227.40 rows=5280 width=10)
>     Output: p1.a
>     Update on public.p1
>     Foreign Update on public.p2
>     ->   Seq Scan on public.p1  (cost=0.00..35.50 rows=2550 width=10)
>           Output: 2, p1.ctid
>     ->   Foreign Update on public.p2  (cost=100.00..191.90 rows=2730 width=10)
>           Remote SQL: UPDATE public.p2base SET a = 2 RETURNING a
> (8 rows)
>
> Running the first update query crashes:
>
> update p set a = 2 from (values (1), (2)) s(x) where a = s.x returning
> tableoid::regclass, *;
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> The problem seems to occur because ExecSetupPartitionTupleRouting thinks
> it can reuse p2's ResultRelInfo for tuple routing.  In this case, it can't
> be used, because its ri_FdwState contains information that will be needed
> for postgresExecForeignUpdate to work, but it's still used today.  Because
> it's assigned to be used for tuple routing, its ri_FdwState will be
> overwritten by postgresBeginForeignInsert that's invoked by the tuple
> routing code using the aforementioned ResultRelInfo.  Crash occurs when
> postgresExecForeignUpdate () can't find the information it's expecting in
> the ri_FdwState.

Agreed, as I said before in another thread.

> The solution is to teach ExecSetupPartitionTupleRouting to avoid using a
> subplan result rel if its ri_FdwState is already set.

I'm not sure that is a good idea, because that requires to create a new 
ResultRelInfo, which is not free; I think it requires a lot of work. 
Another solution to avoid that would be to store the fmstate created by 
postgresBeginForeignInsert() into the ri_FdwState, not overwrite the 
ri_FdwState, like the attached.  This would not need any changes to the 
core, and this would not cause any overheads either, IIUC.  What do you 
think about that?

> I have attached 2 patches, one for PG 11 where the problem first appeared
> and another for HEAD.  The patch for PG 11 is significantly bigger due to
> having to handle the complexities of mapping of subplan result rel indexes
> to leaf partition indexes.  Most of that complexity is gone in HEAD due to
> 3f2393edefa5, so the patch for HEAD is much simpler.

Thanks for the patches!

> I've also added a
> test in postgres_fdw.sql to exercise this test case.

Thanks again, but the test case you added works well without any fix:

+-- Check case where the foreign partition is a subplan target rel and
+-- foreign parttion is locally modified (target table being joined
+-- locally prevents a direct/remote modification plan).
+explain (verbose, costs off)
+update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x 
returning *;
+                                  QUERY PLAN 

+------------------------------------------------------------------------------
+ Update on public.utrtest
+   Output: remp.a, remp.b, "*VALUES*".column1
+   Foreign Update on public.remp
+     Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 
RETURNING a, b
+   Update on public.locp
+   ->  Hash Join
+         Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
+         Hash Cond: (remp.a = "*VALUES*".column1)
+         ->  Foreign Scan on public.remp
+               Output: remp.b, remp.ctid, remp.a
+               Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
+         ->  Hash
+               Output: "*VALUES*".*, "*VALUES*".column1
+               ->  Values Scan on "*VALUES*"
+                     Output: "*VALUES*".*, "*VALUES*".column1
+   ->  Hash Join
+         Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
+         Hash Cond: (locp.a = "*VALUES*".column1)
+         ->  Seq Scan on public.locp
+               Output: locp.b, locp.ctid, locp.a
+         ->  Hash
+               Output: "*VALUES*".*, "*VALUES*".column1
+               ->  Values Scan on "*VALUES*"
+                     Output: "*VALUES*".*, "*VALUES*".column1
+(24 rows)

In this test case, the foreign partition is updated before ri_FdwState 
is overwritten by postgresBeginForeignInsert() that's invoked by the 
tuple routing code, so it would work without any fix.  So I modified 
this test case as such.

Sorry for the long delay, again.

Best regards,
Etsuro Fujita

Вложения

Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Fujita-san,

Thanks for the review.

On 2019/04/10 17:38, Etsuro Fujita wrote:
> (2019/03/06 18:33), Amit Langote wrote:
>> I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
>> to use for partition routing targets.  Specifically, the bug occurs when
>> UPDATE targets include a foreign partition that is locally modified (as
>> opposed to being modified directly on the remove server) and its
>> ResultRelInfo (called subplan result rel in the source code) is reused for
>> tuple routing:
> 
>> The solution is to teach ExecSetupPartitionTupleRouting to avoid using a
>> subplan result rel if its ri_FdwState is already set.
> 
> I'm not sure that is a good idea, because that requires to create a new
> ResultRelInfo, which is not free; I think it requires a lot of work.

After considering this a bit more and studying your proposed solution, I
tend to agree.  Beside the performance considerations you point out that I
too think are valid, I realize that going with my approach would mean
embedding some assumptions in the core code about ri_FdwState; in this
case, assuming that a subplan result rel's ri_FdwState is not to be
re-purposed for tuple routing insert, which is only based on looking at
postgres_fdw's implementation.  Not to mention the ensuing complexity in
the core tuple routing code -- it now needs to account for the fact that
not all subplan result rels may have been re-purposed for tuple-routing,
something that's too complicated to handle in PG 11.

IOW, let's call this a postgres_fdw bug and fix it there as you propose.

> Another solution to avoid that would be to store the fmstate created by
> postgresBeginForeignInsert() into the ri_FdwState, not overwrite the
> ri_FdwState, like the attached.  This would not need any changes to the
> core, and this would not cause any overheads either, IIUC.  What do you
> think about that?

+1.  Just one comment:

+    /*
+     * If the given resultRelInfo already has PgFdwModifyState set, it means
+     * the foreign table is an UPDATE subplan resultrel; in which case, store
+     * the resulting state into the PgFdwModifyState.
+     */

The last "PgFdwModifyState" should be something else?  Maybe,
sub-PgModifyState or sub_fmstate?


>> I've also added a
>> test in postgres_fdw.sql to exercise this test case.
> 
> Thanks again, but the test case you added works well without any fix:

Oops, I should really adopt a habit of adding a test case before code.

> +-- Check case where the foreign partition is a subplan target rel and
> +-- foreign parttion is locally modified (target table being joined
> +-- locally prevents a direct/remote modification plan).
> +explain (verbose, costs off)
> +update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x
> returning *;
> +                                  QUERY PLAN
> +------------------------------------------------------------------------------
> 
> + Update on public.utrtest
> +   Output: remp.a, remp.b, "*VALUES*".column1
> +   Foreign Update on public.remp
> +     Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING
> a, b
> +   Update on public.locp
> +   ->  Hash Join
> +         Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
> +         Hash Cond: (remp.a = "*VALUES*".column1)
> +         ->  Foreign Scan on public.remp
> +               Output: remp.b, remp.ctid, remp.a
> +               Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
> +         ->  Hash
> +               Output: "*VALUES*".*, "*VALUES*".column1
> +               ->  Values Scan on "*VALUES*"
> +                     Output: "*VALUES*".*, "*VALUES*".column1
> +   ->  Hash Join
> +         Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
> +         Hash Cond: (locp.a = "*VALUES*".column1)
> +         ->  Seq Scan on public.locp
> +               Output: locp.b, locp.ctid, locp.a
> +         ->  Hash
> +               Output: "*VALUES*".*, "*VALUES*".column1
> +               ->  Values Scan on "*VALUES*"
> +                     Output: "*VALUES*".*, "*VALUES*".column1
> +(24 rows)
> 
> In this test case, the foreign partition is updated before ri_FdwState is
> overwritten by postgresBeginForeignInsert() that's invoked by the tuple
> routing code, so it would work without any fix.  So I modified this test
> case as such.

Thanks for fixing that.   I hadn't noticed that foreign partition remp is
updated before local partition locp; should've added a locp0 for values
(0) so that remp appears second in the ModifyTable's subplans.  Or, well,
make the foreign table remp appear later by making it a partition for
values in (3) as your patch does.

BTW, have you noticed that the RETURNING clause returns the same row twice?

+update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
returning *;
+ a |         b         | x
+---+-------------------+---
+ 3 | qux triggered !   | 2
+ 3 | xyzzy triggered ! | 3
+ 3 | qux triggered !   | 3
+(3 rows)

You can see that the row that's moved into remp is returned twice (one
with "qux triggered!" in b column), whereas it should've been only once?
I checked the behavior with moving into a local partition just to be sure
and the changed row is returned only once.

create table p (a int, b text) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);
insert into p values (1, 'p1'), (2, 'p2');

select tableoid::regclass, * from p;
 tableoid │ a │ b
──────────┼───┼────
 p1       │ 1 │ p1
 p2       │ 2 │ p2
(2 rows)

update p set a = 2 returning tableoid::regclass, *;
 tableoid │ a │ b
──────────┼───┼────
 p2       │ 2 │ p1
 p2       │ 2 │ p2
(2 rows)

Add a foreign table partition in the mix and that's no longer true.

create extension postgres_fdw ;
create server loopback foreign data wrapper postgres_fdw;
create user mapping for current_user server loopback;
create table p3base (a int check (a = 3), b text);
create foreign table p3 partition of p for values in (3) server loopback
options (table_name 'p3base');
delete from p;
insert into p values (1, 'p1'), (2, 'p2'), (3, 'p3');


select tableoid::regclass, * from p;
 tableoid │ a │ b
──────────┼───┼────
 p1       │ 1 │ p1
 p2       │ 2 │ p2
 p3       │ 3 │ p3
(3 rows)

update p set a = 3 returning tableoid::regclass, *;
 tableoid │ a │ b
──────────┼───┼────
 p3       │ 3 │ p1
 p3       │ 3 │ p2
 p3       │ 3 │ p3
 p3       │ 3 │ p1
 p3       │ 3 │ p2
(5 rows)

select tableoid::regclass, * from p;
 tableoid │ a │ b
──────────┼───┼────
 p3       │ 3 │ p3
 p3       │ 3 │ p1
 p3       │ 3 │ p2
(3 rows)

I'm not sure if it's a bug that ought to be fixed, but thought I should
highlight it just in case.

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/11 14:33), Amit Langote wrote:
> On 2019/04/10 17:38, Etsuro Fujita wrote:
>> (2019/03/06 18:33), Amit Langote wrote:
>>> I noticed a bug with how UPDATE tuple routing initializes ResultRelInfos
>>> to use for partition routing targets.  Specifically, the bug occurs when
>>> UPDATE targets include a foreign partition that is locally modified (as
>>> opposed to being modified directly on the remove server) and its
>>> ResultRelInfo (called subplan result rel in the source code) is reused for
>>> tuple routing:
>>
>>> The solution is to teach ExecSetupPartitionTupleRouting to avoid using a
>>> subplan result rel if its ri_FdwState is already set.
>>
>> I'm not sure that is a good idea, because that requires to create a new
>> ResultRelInfo, which is not free; I think it requires a lot of work.
>
> After considering this a bit more and studying your proposed solution, I
> tend to agree.  Beside the performance considerations you point out that I
> too think are valid, I realize that going with my approach would mean
> embedding some assumptions in the core code about ri_FdwState; in this
> case, assuming that a subplan result rel's ri_FdwState is not to be
> re-purposed for tuple routing insert, which is only based on looking at
> postgres_fdw's implementation.

Yeah, that assumption might not apply to some other FDWs.

> Not to mention the ensuing complexity in
> the core tuple routing code -- it now needs to account for the fact that
> not all subplan result rels may have been re-purposed for tuple-routing,
> something that's too complicated to handle in PG 11.

I think so too.

> IOW, let's call this a postgres_fdw bug and fix it there as you propose.

Agreed.

>> Another solution to avoid that would be to store the fmstate created by
>> postgresBeginForeignInsert() into the ri_FdwState, not overwrite the
>> ri_FdwState, like the attached.  This would not need any changes to the
>> core, and this would not cause any overheads either, IIUC.  What do you
>> think about that?
>
> +1.  Just one comment:
>
> +    /*
> +     * If the given resultRelInfo already has PgFdwModifyState set, it means
> +     * the foreign table is an UPDATE subplan resultrel; in which case, store
> +     * the resulting state into the PgFdwModifyState.
> +     */
>
> The last "PgFdwModifyState" should be something else?  Maybe,
> sub-PgModifyState or sub_fmstate?

OK, will revise.  My favorite would be the latter.

>>> I've also added a
>>> test in postgres_fdw.sql to exercise this test case.
>>
>> Thanks again, but the test case you added works well without any fix:
>
> Oops, I should really adopt a habit of adding a test case before code.
>
>> +-- Check case where the foreign partition is a subplan target rel and
>> +-- foreign parttion is locally modified (target table being joined
>> +-- locally prevents a direct/remote modification plan).
>> +explain (verbose, costs off)
>> +update utrtest set a = 1 from (values (1), (2)) s(x) where a = s.x
>> returning *;
>> +                                  QUERY PLAN
>> +------------------------------------------------------------------------------
>>
>> + Update on public.utrtest
>> +   Output: remp.a, remp.b, "*VALUES*".column1
>> +   Foreign Update on public.remp
>> +     Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1 RETURNING
>> a, b
>> +   Update on public.locp
>> +   ->   Hash Join
>> +         Output: 1, remp.b, remp.ctid, "*VALUES*".*, "*VALUES*".column1
>> +         Hash Cond: (remp.a = "*VALUES*".column1)
>> +         ->   Foreign Scan on public.remp
>> +               Output: remp.b, remp.ctid, remp.a
>> +               Remote SQL: SELECT a, b, ctid FROM public.loct FOR UPDATE
>> +         ->   Hash
>> +               Output: "*VALUES*".*, "*VALUES*".column1
>> +               ->   Values Scan on "*VALUES*"
>> +                     Output: "*VALUES*".*, "*VALUES*".column1
>> +   ->   Hash Join
>> +         Output: 1, locp.b, locp.ctid, "*VALUES*".*, "*VALUES*".column1
>> +         Hash Cond: (locp.a = "*VALUES*".column1)
>> +         ->   Seq Scan on public.locp
>> +               Output: locp.b, locp.ctid, locp.a
>> +         ->   Hash
>> +               Output: "*VALUES*".*, "*VALUES*".column1
>> +               ->   Values Scan on "*VALUES*"
>> +                     Output: "*VALUES*".*, "*VALUES*".column1
>> +(24 rows)
>>
>> In this test case, the foreign partition is updated before ri_FdwState is
>> overwritten by postgresBeginForeignInsert() that's invoked by the tuple
>> routing code, so it would work without any fix.  So I modified this test
>> case as such.
>
> Thanks for fixing that.   I hadn't noticed that foreign partition remp is
> updated before local partition locp; should've added a locp0 for values
> (0) so that remp appears second in the ModifyTable's subplans.  Or, well,
> make the foreign table remp appear later by making it a partition for
> values in (3) as your patch does.
>
> BTW, have you noticed that the RETURNING clause returns the same row twice?

I noticed this, but I didn't think it hard.  :(

> +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
> returning *;
> + a |         b         | x
> +---+-------------------+---
> + 3 | qux triggered !   | 2
> + 3 | xyzzy triggered ! | 3
> + 3 | qux triggered !   | 3
> +(3 rows)
>
> You can see that the row that's moved into remp is returned twice (one
> with "qux triggered!" in b column), whereas it should've been only once?

Yeah, this is unexpected behavior, so will look into this.  Thanks for 
pointing that out!

Best regards,
Etsuro Fujita




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/11 20:31), Etsuro Fujita wrote:
> (2019/04/11 14:33), Amit Langote wrote:
>> BTW, have you noticed that the RETURNING clause returns the same row
>> twice?
>
> I noticed this, but I didn't think it hard. :(
>
>> +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
>> returning *;
>> + a | b | x
>> +---+-------------------+---
>> + 3 | qux triggered ! | 2
>> + 3 | xyzzy triggered ! | 3
>> + 3 | qux triggered ! | 3
>> +(3 rows)
>>
>> You can see that the row that's moved into remp is returned twice (one
>> with "qux triggered!" in b column), whereas it should've been only once?
>
> Yeah, this is unexpected behavior, so will look into this.

I think the reason for that is: the row routed to remp is incorrectly 
fetched as a to-be-updated row when updating remp as a subplan 
targetrel.  The right way to fix this would be to have some way in 
postgres_fdw in which we don't fetch such rows when updating such 
subplan targetrels.  I tried to figure out a (simple) way to do that, 
but I couldn't.  One probably-simple solution I came up with is to sort 
subplan targetrels into the order in which foreign-table subplan 
targetrels get processed first in ExecModifyTable().  (Note: currently, 
rows can be moved from local partitions to a foreign-table partition, 
but rows cannot be moved from foreign-table partitions to another 
partition, so we wouldn't encounter this situation once we sort like 
that.)  But I think that's ugly, and I don't think it's a good idea to 
change the core, just for postgres_fdw.  So what I'm thinking is to 
throw an error for cases like this.  (Though, I think we should keep to 
allow rows to be moved from local partitions to a foreign-table subplan 
targetrel that has been updated already.)

What do you think about that?

Best regards,
Etsuro Fujita




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Fujita-san,

On 2019/04/17 21:49, Etsuro Fujita wrote:
> (2019/04/11 20:31), Etsuro Fujita wrote:
>> (2019/04/11 14:33), Amit Langote wrote:
>>> BTW, have you noticed that the RETURNING clause returns the same row
>>> twice?
>>
>> I noticed this, but I didn't think it hard. :(
>>
>>> +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
>>> returning *;
>>> + a | b | x
>>> +---+-------------------+---
>>> + 3 | qux triggered ! | 2
>>> + 3 | xyzzy triggered ! | 3
>>> + 3 | qux triggered ! | 3
>>> +(3 rows)
>>>
>>> You can see that the row that's moved into remp is returned twice (one
>>> with "qux triggered!" in b column), whereas it should've been only once?
>>
>> Yeah, this is unexpected behavior, so will look into this.

Thanks for investigating.

> I think the reason for that is: the row routed to remp is incorrectly
> fetched as a to-be-updated row when updating remp as a subplan targetrel.

Yeah.  In the fully-local case, that is, where both the source and the
target partition of a row movement operation are local tables, heap AM
ensures that tuples that's moved into a given relation in the same command
(by way of row movement) are not returned as to-be-updated, because it
deems such tuples invisible.  The "same command" part being crucial for
that to work.

In the case where the target of a row movement operation is a foreign
table partition, the INSERT used as part of row movement and subsequent
UPDATE of the same foreign table are distinct commands for the remote
server.  So, the rows inserted by the 1st command (as part of the row
movement) are deemed visible by the 2nd command (UPDATE) even if both are
operating within the same transaction.

I guess there's no easy way for postgres_fdw to make the remote server
consider them as a single command.  IOW, no way to make the remote server
not touch those "moved-in" rows during the UPDATE part of the local query.
 
> The right way to fix this would be to have some way in postgres_fdw in
> which we don't fetch such rows when updating such subplan targetrels.  I
> tried to figure out a (simple) way to do that, but I couldn't.

Yeah, that seems a bit hard to ensure with our current infrastructure.

> One
> probably-simple solution I came up with is to sort subplan targetrels into
> the order in which foreign-table subplan targetrels get processed first in
> ExecModifyTable().  (Note: currently, rows can be moved from local
> partitions to a foreign-table partition, but rows cannot be moved from
> foreign-table partitions to another partition, so we wouldn't encounter
> this situation once we sort like that.)  But I think that's ugly, and I
> don't think it's a good idea to change the core, just for postgres_fdw.

Agreed that it seems like contorting the core code to accommodate
limitations of postgres_fdw.

> So what I'm thinking is to throw an error for cases like this.  (Though, I
> think we should keep to allow rows to be moved from local partitions to a
> foreign-table subplan targetrel that has been updated already.)

Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
two cases?

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
On 2019/04/18 14:06, Amit Langote wrote:
> Fujita-san,
> 
> On 2019/04/17 21:49, Etsuro Fujita wrote:
>> (2019/04/11 20:31), Etsuro Fujita wrote:
>>> (2019/04/11 14:33), Amit Langote wrote:
>>>> BTW, have you noticed that the RETURNING clause returns the same row
>>>> twice?
>>>
>>> I noticed this, but I didn't think it hard. :(
>>>
>>>> +update utrtest set a = 3 from (values (2), (3)) s(x) where a = s.x
>>>> returning *;
>>>> + a | b | x
>>>> +---+-------------------+---
>>>> + 3 | qux triggered ! | 2
>>>> + 3 | xyzzy triggered ! | 3
>>>> + 3 | qux triggered ! | 3
>>>> +(3 rows)
>>>>
>>>> You can see that the row that's moved into remp is returned twice (one
>>>> with "qux triggered!" in b column), whereas it should've been only once?
>>>
>>> Yeah, this is unexpected behavior, so will look into this.
> 
> Thanks for investigating.
> 
>> I think the reason for that is: the row routed to remp is incorrectly
>> fetched as a to-be-updated row when updating remp as a subplan targetrel.
> 
> Yeah.  In the fully-local case, that is, where both the source and the
> target partition of a row movement operation are local tables, heap AM
> ensures that tuples that's moved into a given relation in the same command
> (by way of row movement) are not returned as to-be-updated, because it
> deems such tuples invisible.  The "same command" part being crucial for
> that to work.
> 
> In the case where the target of a row movement operation is a foreign
> table partition, the INSERT used as part of row movement and subsequent
> UPDATE of the same foreign table are distinct commands for the remote
> server.  So, the rows inserted by the 1st command (as part of the row
> movement) are deemed visible by the 2nd command (UPDATE) even if both are
> operating within the same transaction.
> 
> I guess there's no easy way for postgres_fdw to make the remote server
> consider them as a single command.  IOW, no way to make the remote server
> not touch those "moved-in" rows during the UPDATE part of the local query.
>  
>> The right way to fix this would be to have some way in postgres_fdw in
>> which we don't fetch such rows when updating such subplan targetrels.  I
>> tried to figure out a (simple) way to do that, but I couldn't.
> 
> Yeah, that seems a bit hard to ensure with our current infrastructure.
> 
>> One
>> probably-simple solution I came up with is to sort subplan targetrels into
>> the order in which foreign-table subplan targetrels get processed first in
>> ExecModifyTable().  (Note: currently, rows can be moved from local
>> partitions to a foreign-table partition, but rows cannot be moved from
>> foreign-table partitions to another partition, so we wouldn't encounter
>> this situation once we sort like that.)  But I think that's ugly, and I
>> don't think it's a good idea to change the core, just for postgres_fdw.
> 
> Agreed that it seems like contorting the core code to accommodate
> limitations of postgres_fdw.
> 
>> So what I'm thinking is to throw an error for cases like this.  (Though, I
>> think we should keep to allow rows to be moved from local partitions to a
>> foreign-table subplan targetrel that has been updated already.)
> 
> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
> two cases?

Forgot to say that since this is a separate issue from the original bug
report, maybe we can first finish fixing the latter.  What to do you think?

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
Amit-san,

Thanks for the comments!

(2019/04/18 14:08), Amit Langote wrote:
> On 2019/04/18 14:06, Amit Langote wrote:
>> On 2019/04/17 21:49, Etsuro Fujita wrote:

>>> I think the reason for that is: the row routed to remp is incorrectly
>>> fetched as a to-be-updated row when updating remp as a subplan targetrel.
>>
>> Yeah.  In the fully-local case, that is, where both the source and the
>> target partition of a row movement operation are local tables, heap AM
>> ensures that tuples that's moved into a given relation in the same command
>> (by way of row movement) are not returned as to-be-updated, because it
>> deems such tuples invisible.  The "same command" part being crucial for
>> that to work.
>>
>> In the case where the target of a row movement operation is a foreign
>> table partition, the INSERT used as part of row movement and subsequent
>> UPDATE of the same foreign table are distinct commands for the remote
>> server.  So, the rows inserted by the 1st command (as part of the row
>> movement) are deemed visible by the 2nd command (UPDATE) even if both are
>> operating within the same transaction.

Yeah, I think so too.

>> I guess there's no easy way for postgres_fdw to make the remote server
>> consider them as a single command.  IOW, no way to make the remote server
>> not touch those "moved-in" rows during the UPDATE part of the local query.

I agree.

>>> The right way to fix this would be to have some way in postgres_fdw in
>>> which we don't fetch such rows when updating such subplan targetrels.  I
>>> tried to figure out a (simple) way to do that, but I couldn't.
>>
>> Yeah, that seems a bit hard to ensure with our current infrastructure.

Yeah, I think we should leave that for future work.

>>> So what I'm thinking is to throw an error for cases like this.  (Though, I
>>> think we should keep to allow rows to be moved from local partitions to a
>>> foreign-table subplan targetrel that has been updated already.)
>>
>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
>> two cases?

One thing I came up with to do that is this:

@@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,

         initStringInfo(&sql);

+       /*
+        * If the foreign table is an UPDATE subplan resultrel that 
hasn't yet
+        * been updated, routing tuples to the table might yield incorrect
+        * results, because if routing tuples, routed tuples will be 
mistakenly
+        * read from the table and updated twice when updating the table 
--- it
+        * would be nice if we could handle this case; but for now, 
throw an error
+        * for safety.
+        */
+       if (plan && plan->operation == CMD_UPDATE &&
+               (resultRelInfo->ri_usesFdwDirectModify ||
+                resultRelInfo->ri_FdwState) &&
+               resultRelInfo > mtstate->resultRelInfo + 
mtstate->mt_whichplan)
+               ereport(ERROR,
+                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                                errmsg("cannot route tuples into 
foreign table to be updated \"%s\"",
+ 
RelationGetRelationName(rel))));

> Forgot to say that since this is a separate issue from the original bug
> report, maybe we can first finish fixing the latter.  What to do you think?

Yeah, I think we can do that, but my favorite would be to fix the two 
issues in a single shot, because they seem to me rather close to each 
other.  So I updated a new version in a single patch, which I'm attaching.

Notes:

* I kept all the changes in the previous patch, because otherwise 
postgres_fdw would fail to release resources for a foreign-insert 
operation created by postgresBeginForeignInsert() for a tuple-routable 
foreign table (ie, a foreign-table subplan resultrel that has been 
updated already) during postgresEndForeignInsert().

* I revised a comment according to your previous comment, though I 
changed a state name: s/sub_fmstate/aux_fmstate/

* DirectModify also has the latter issue.  Here is an example:

postgres=# create table p (a int, b text) partition by list (a);
postgres=# create table p1 partition of p for values in (1);
postgres=# create table p2base (a int check (a = 2 or a = 3), b text);
postgres=# create foreign table p2 partition of p for values in (2, 3) 
server loopback options (table_name 'p2base');

postgres=# insert into p values (1, 'foo');
INSERT 0 1
postgres=# explain verbose update p set a = a + 1;
                                  QUERY PLAN
-----------------------------------------------------------------------------
  Update on public.p  (cost=0.00..176.21 rows=2511 width=42)
    Update on public.p1
    Foreign Update on public.p2
    ->  Seq Scan on public.p1  (cost=0.00..25.88 rows=1270 width=42)
          Output: (p1.a + 1), p1.b, p1.ctid
    ->  Foreign Update on public.p2  (cost=100.00..150.33 rows=1241 
width=42)
          Remote SQL: UPDATE public.p2base SET a = (a + 1)
(7 rows)

postgres=# update p set a = a + 1;
UPDATE 2
postgres=# select * from p;
  a |  b
---+-----
  3 | foo
(1 row)

As shown in the above bit added to postgresBeginForeignInsert(), I 
modified the patch so that we throw an error for cases like this even 
when using a direct modification plan, and I also added regression test 
cases for that.

Best regards,
Etsuro Fujita

Вложения

Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/18 22:10), Etsuro Fujita wrote:
> Notes:
>
> * I kept all the changes in the previous patch, because otherwise
> postgres_fdw would fail to release resources for a foreign-insert
> operation created by postgresBeginForeignInsert() for a tuple-routable
> foreign table (ie, a foreign-table subplan resultrel that has been
> updated already) during postgresEndForeignInsert().

I noticed that this explanation was not right.  Let me correct myself. 
The reason why I kept those changes is: without those changes, we would 
fail to release the resources for a foreign-update operation (ie, 
fmstate) created by postgresBeginForeignModify(), replaced by the 
fmstate for the foreign-insert operation, because when doing 
ExecEndPlan(), we first call postgresEndForeignModify() and then call 
postgresEndForeignInsert(); so, if we didn't keep those changes, we 
would *mistakenly* release the fmstate for the foreign-insert operation 
in postgresEndForeignModify(), and we wouldn't do anything about the 
fmstate for the foreign-update operation in that function and in the 
subsequent postgresEndForeignInsert().

Best regards,
Etsuro Fujita




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Fujita-san,

On 2019/04/18 22:10, Etsuro Fujita wrote:
>> On 2019/04/18 14:06, Amit Langote wrote:
>>> On 2019/04/17 21:49, Etsuro Fujita wrote:
>>>> So what I'm thinking is to throw an error for cases like this. 
>>>> (Though, I
>>>> think we should keep to allow rows to be moved from local partitions to a
>>>> foreign-table subplan targetrel that has been updated already.)
>>>
>>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
>>> two cases?
> 
> One thing I came up with to do that is this:
> 
> @@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
> 
>         initStringInfo(&sql);
> 
> +       /*
> +        * If the foreign table is an UPDATE subplan resultrel that hasn't
> yet
> +        * been updated, routing tuples to the table might yield incorrect
> +        * results, because if routing tuples, routed tuples will be
> mistakenly
> +        * read from the table and updated twice when updating the table
> --- it
> +        * would be nice if we could handle this case; but for now, throw
> an error
> +        * for safety.
> +        */
> +       if (plan && plan->operation == CMD_UPDATE &&
> +               (resultRelInfo->ri_usesFdwDirectModify ||
> +                resultRelInfo->ri_FdwState) &&
> +               resultRelInfo > mtstate->resultRelInfo +
> mtstate->mt_whichplan)
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                                errmsg("cannot route tuples into foreign
> table to be updated \"%s\"",
> + RelationGetRelationName(rel))));

Oh, I see.

So IIUC, you're making postgresBeginForeignInsert() check two things:

1. Whether the foreign table it's about to begin inserting (moved) rows
into is a subplan result rel, checked by
(resultRelInfo->ri_usesFdwDirectModify || resultRelInfo->ri_FdwState)

2. Whether the foreign table it's about to begin inserting (moved) rows
into will be updated later, checked by (resultRelInfo >
mtstate->resultRelInfo + mtstate->mt_whichplan)

This still allows a foreign table to receive rows moved from the local
partitions if it has already finished the UPDATE operation.

Seems reasonable to me.

>> Forgot to say that since this is a separate issue from the original bug
>> report, maybe we can first finish fixing the latter.  What to do you think?
> 
> Yeah, I think we can do that, but my favorite would be to fix the two
> issues in a single shot, because they seem to me rather close to each
> other.  So I updated a new version in a single patch, which I'm attaching.

I agree that we can move to fix the other issue right away as the fix you
outlined above seems reasonable, but I wonder if it wouldn't be better to
commit the two fixes separately?  The two fixes, although small, are
somewhat complicated and combining them in a single commit might be
confusing.  Also, a combined commit might make it harder for the release
note author to list down the exact set of problems being fixed.  But I
guess your commit message will make it clear that two distinct problems
are being solved, so maybe that shouldn't be a problem.

> Notes:
> 
> * I kept all the changes in the previous patch, because otherwise
> postgres_fdw would fail to release resources for a foreign-insert
> operation created by postgresBeginForeignInsert() for a tuple-routable
> foreign table (ie, a foreign-table subplan resultrel that has been updated
> already) during postgresEndForeignInsert().

Hmm are you saying that the cases for which we'll still allow tuple
routing (foreign table receiving moved-in rows has already been updated),
there will be two fmstates to be released -- the original fmstate
(UPDATE's) and aux_fmstate (INSERT's)?

> * I revised a comment according to your previous comment, though I changed
> a state name: s/sub_fmstate/aux_fmstate/

That seems fine to me.

> * DirectModify also has the latter issue.  Here is an example:
> 
> postgres=# create table p (a int, b text) partition by list (a);
> postgres=# create table p1 partition of p for values in (1);
> postgres=# create table p2base (a int check (a = 2 or a = 3), b text);
> postgres=# create foreign table p2 partition of p for values in (2, 3)
> server loopback options (table_name 'p2base');
> 
> postgres=# insert into p values (1, 'foo');
> INSERT 0 1
> postgres=# explain verbose update p set a = a + 1;
>                                  QUERY PLAN
> -----------------------------------------------------------------------------
>  Update on public.p  (cost=0.00..176.21 rows=2511 width=42)
>    Update on public.p1
>    Foreign Update on public.p2
>    ->  Seq Scan on public.p1  (cost=0.00..25.88 rows=1270 width=42)
>          Output: (p1.a + 1), p1.b, p1.ctid
>    ->  Foreign Update on public.p2  (cost=100.00..150.33 rows=1241 width=42)
>          Remote SQL: UPDATE public.p2base SET a = (a + 1)
> (7 rows)
> 
> postgres=# update p set a = a + 1;
> UPDATE 2
> postgres=# select * from p;
>  a |  b
> ---+-----
>  3 | foo
> (1 row)

Ah, the expected out is "(2, foo)".  Also, with RETURNING, you'd get this:

update p set a = a + 1 returning tableoid::regclass, *;
 tableoid │ a │  b
──────────┼───┼─────
 p2       │ 2 │ foo
 p2       │ 3 │ foo
(2 rows)

> As shown in the above bit added to postgresBeginForeignInsert(), I
> modified the patch so that we throw an error for cases like this even when
> using a direct modification plan, and I also added regression test cases
> for that.

Thanks for adding detailed tests.

Some mostly cosmetic comments on the code changes:

* In the following comment:

+    /*
+     * If the foreign table is an UPDATE subplan resultrel that hasn't yet
+     * been updated, routing tuples to the table might yield incorrect
+     * results, because if routing tuples, routed tuples will be mistakenly
+     * read from the table and updated twice when updating the table --- it
+     * would be nice if we could handle this case; but for now, throw an
error
+     * for safety.
+     */

the part that start with "because if routing tuples..." reads a bit
unclear to me. How about writing this as:

    /*
     * If the foreign table we are about to insert routed rows into is
     * also an UPDATE result rel and the UPDATE hasn't been performed yet,
     * proceeding with the INSERT will result in the later UPDATE
     * incorrectly modifying those routed rows, so prevent the INSERT ---
     * it would be nice if we could handle this case, but for now, throw
     * an error for safety.
     */

I see that in all the hunks involving some manipulation of aux_fmstate,
there's a comment explaining what it is, which seems a bit repetitive.  I
can see more or less the same explanation in postgresExecForeignInsert(),
postgresBeginForeignInsert(), and postgresEndForeignInsert().  Maybe just
keep the description in postgresBeginForeignInsert as follows:

@@ -1983,7 +2020,19 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
                                     retrieved_attrs != NIL,
                                     retrieved_attrs);

-    resultRelInfo->ri_FdwState = fmstate;
+    /*
+     * If the given resultRelInfo already has PgFdwModifyState set, it means
+     * the foreign table is an UPDATE subplan resultrel; in which case, store
+     * the resulting state into the aux_fmstate of the PgFdwModifyState.
+     */

and change the other sites to refer to postgresBeginForeingInsert for the
detailed explanation of what aux_fmstate is.

BTW, do you think we should update the docs regarding the newly
established limitation of row movement involving foreign tables?

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
On 2019/04/19 13:00, Amit Langote wrote:
> BTW, do you think we should update the docs regarding the newly
> established limitation of row movement involving foreign tables?

Ah, maybe not, because it's a postgres_fdw limitation, not of the core
tuple routing feature.  OTOH, we don't mention at all in postgres-fdw.sgml
that postgres_fdw supports tuple routing.  Maybe we should and list this
limitation or would it be too much burden to maintain?

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/19 13:00), Amit Langote wrote:
> On 2019/04/18 22:10, Etsuro Fujita wrote:
>>> On 2019/04/18 14:06, Amit Langote wrote:
>>>> On 2019/04/17 21:49, Etsuro Fujita wrote:
>>>>> So what I'm thinking is to throw an error for cases like this.
>>>>> (Though, I
>>>>> think we should keep to allow rows to be moved from local partitions to a
>>>>> foreign-table subplan targetrel that has been updated already.)
>>>>
>>>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
>>>> two cases?
>>
>> One thing I came up with to do that is this:
>>
>> @@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
>>
>>          initStringInfo(&sql);
>>
>> +       /*
>> +        * If the foreign table is an UPDATE subplan resultrel that hasn't
>> yet
>> +        * been updated, routing tuples to the table might yield incorrect
>> +        * results, because if routing tuples, routed tuples will be
>> mistakenly
>> +        * read from the table and updated twice when updating the table
>> --- it
>> +        * would be nice if we could handle this case; but for now, throw
>> an error
>> +        * for safety.
>> +        */
>> +       if (plan&&  plan->operation == CMD_UPDATE&&
>> +               (resultRelInfo->ri_usesFdwDirectModify ||
>> +                resultRelInfo->ri_FdwState)&&
>> +               resultRelInfo>  mtstate->resultRelInfo +
>> mtstate->mt_whichplan)
>> +               ereport(ERROR,
>> +                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> +                                errmsg("cannot route tuples into foreign
>> table to be updated \"%s\"",
>> + RelationGetRelationName(rel))));
>
> Oh, I see.
>
> So IIUC, you're making postgresBeginForeignInsert() check two things:
>
> 1. Whether the foreign table it's about to begin inserting (moved) rows
> into is a subplan result rel, checked by
> (resultRelInfo->ri_usesFdwDirectModify || resultRelInfo->ri_FdwState)
>
> 2. Whether the foreign table it's about to begin inserting (moved) rows
> into will be updated later, checked by (resultRelInfo>
> mtstate->resultRelInfo + mtstate->mt_whichplan)
>
> This still allows a foreign table to receive rows moved from the local
> partitions if it has already finished the UPDATE operation.
>
> Seems reasonable to me.

Great!

>>> Forgot to say that since this is a separate issue from the original bug
>>> report, maybe we can first finish fixing the latter.  What to do you think?
>>
>> Yeah, I think we can do that, but my favorite would be to fix the two
>> issues in a single shot, because they seem to me rather close to each
>> other.  So I updated a new version in a single patch, which I'm attaching.
>
> I agree that we can move to fix the other issue right away as the fix you
> outlined above seems reasonable, but I wonder if it wouldn't be better to
> commit the two fixes separately?  The two fixes, although small, are
> somewhat complicated and combining them in a single commit might be
> confusing.  Also, a combined commit might make it harder for the release
> note author to list down the exact set of problems being fixed.

OK, I'll separate the patch into two.

>> Notes:
>>
>> * I kept all the changes in the previous patch, because otherwise
>> postgres_fdw would fail to release resources for a foreign-insert
>> operation created by postgresBeginForeignInsert() for a tuple-routable
>> foreign table (ie, a foreign-table subplan resultrel that has been updated
>> already) during postgresEndForeignInsert().
>
> Hmm are you saying that the cases for which we'll still allow tuple
> routing (foreign table receiving moved-in rows has already been updated),
> there will be two fmstates to be released -- the original fmstate
> (UPDATE's) and aux_fmstate (INSERT's)?

Yeah, but I noticed that that explanation was not correct.  (I think I 
was really in hurry.)  See the correction in [1].

>> * I revised a comment according to your previous comment, though I changed
>> a state name: s/sub_fmstate/aux_fmstate/
>
> That seems fine to me.

Cool!

> Some mostly cosmetic comments on the code changes:
>
> * In the following comment:
>
> +    /*
> +     * If the foreign table is an UPDATE subplan resultrel that hasn't yet
> +     * been updated, routing tuples to the table might yield incorrect
> +     * results, because if routing tuples, routed tuples will be mistakenly
> +     * read from the table and updated twice when updating the table --- it
> +     * would be nice if we could handle this case; but for now, throw an
> error
> +     * for safety.
> +     */
>
> the part that start with "because if routing tuples..." reads a bit
> unclear to me. How about writing this as:
>
>      /*
>       * If the foreign table we are about to insert routed rows into is
>       * also an UPDATE result rel and the UPDATE hasn't been performed yet,
>       * proceeding with the INSERT will result in the later UPDATE
>       * incorrectly modifying those routed rows, so prevent the INSERT ---
>       * it would be nice if we could handle this case, but for now, throw
>       * an error for safety.
>       */

I think that's better than mine; will use that wording.

> I see that in all the hunks involving some manipulation of aux_fmstate,
> there's a comment explaining what it is, which seems a bit repetitive.  I
> can see more or less the same explanation in postgresExecForeignInsert(),
> postgresBeginForeignInsert(), and postgresEndForeignInsert().  Maybe just
> keep the description in postgresBeginForeignInsert as follows:
>
> @@ -1983,7 +2020,19 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,
>                                       retrieved_attrs != NIL,
>                                       retrieved_attrs);
>
> -    resultRelInfo->ri_FdwState = fmstate;
> +    /*
> +     * If the given resultRelInfo already has PgFdwModifyState set, it means
> +     * the foreign table is an UPDATE subplan resultrel; in which case, store
> +     * the resulting state into the aux_fmstate of the PgFdwModifyState.
> +     */
>
> and change the other sites to refer to postgresBeginForeingInsert for the
> detailed explanation of what aux_fmstate is.

Good idea; will do.

Thanks for the comments!

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/5CB93D3F.6050903%40lab.ntt.co.jp




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/19 13:17), Amit Langote wrote:
> OTOH, we don't mention at all in postgres-fdw.sgml
> that postgres_fdw supports tuple routing.  Maybe we should and list this
> limitation or would it be too much burden to maintain?

I think it's better to add this limitation to postgres-fdw.sgml.  Will 
do.  Thanks for pointing that out!

Best regards,
Etsuro Fujita




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
On 2019/04/19 14:39, Etsuro Fujita wrote:
> (2019/04/19 13:00), Amit Langote wrote:
>> On 2019/04/18 22:10, Etsuro Fujita wrote:
>>> * I kept all the changes in the previous patch, because otherwise
>>> postgres_fdw would fail to release resources for a foreign-insert
>>> operation created by postgresBeginForeignInsert() for a tuple-routable
>>> foreign table (ie, a foreign-table subplan resultrel that has been updated
>>> already) during postgresEndForeignInsert().
>>
>> Hmm are you saying that the cases for which we'll still allow tuple
>> routing (foreign table receiving moved-in rows has already been updated),
>> there will be two fmstates to be released -- the original fmstate
>> (UPDATE's) and aux_fmstate (INSERT's)?
> 
> Yeah, but I noticed that that explanation was not correct.  (I think I was
> really in hurry.)  See the correction in [1].

Ah, I hadn't noticed your corrected description in [1] even though your
message was in my inbox before I sent my email.

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/19 14:39), Etsuro Fujita wrote:
> (2019/04/19 13:00), Amit Langote wrote:
>> On 2019/04/18 22:10, Etsuro Fujita wrote:
>>>> On 2019/04/18 14:06, Amit Langote wrote:
>>>>> On 2019/04/17 21:49, Etsuro Fujita wrote:
>>>>>> So what I'm thinking is to throw an error for cases like this.
>>>>>> (Though, I
>>>>>> think we should keep to allow rows to be moved from local
>>>>>> partitions to a
>>>>>> foreign-table subplan targetrel that has been updated already.)
>>>>>
>>>>> Hmm, how would you distinguish (totally inside postgred_fdw I
>>>>> presume) the
>>>>> two cases?
>>>
>>> One thing I came up with to do that is this:
>>>
>>> @@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState
>>> *mtstate,
>>>
>>> initStringInfo(&sql);
>>>
>>> + /*
>>> + * If the foreign table is an UPDATE subplan resultrel that hasn't
>>> yet
>>> + * been updated, routing tuples to the table might yield incorrect
>>> + * results, because if routing tuples, routed tuples will be
>>> mistakenly
>>> + * read from the table and updated twice when updating the table
>>> --- it
>>> + * would be nice if we could handle this case; but for now, throw
>>> an error
>>> + * for safety.
>>> + */
>>> + if (plan&& plan->operation == CMD_UPDATE&&
>>> + (resultRelInfo->ri_usesFdwDirectModify ||
>>> + resultRelInfo->ri_FdwState)&&
>>> + resultRelInfo> mtstate->resultRelInfo +
>>> mtstate->mt_whichplan)
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>>> + errmsg("cannot route tuples into foreign
>>> table to be updated \"%s\"",
>>> + RelationGetRelationName(rel))));
>>
>> Oh, I see.
>>
>> So IIUC, you're making postgresBeginForeignInsert() check two things:
>>
>> 1. Whether the foreign table it's about to begin inserting (moved) rows
>> into is a subplan result rel, checked by
>> (resultRelInfo->ri_usesFdwDirectModify || resultRelInfo->ri_FdwState)
>>
>> 2. Whether the foreign table it's about to begin inserting (moved) rows
>> into will be updated later, checked by (resultRelInfo>
>> mtstate->resultRelInfo + mtstate->mt_whichplan)
>>
>> This still allows a foreign table to receive rows moved from the local
>> partitions if it has already finished the UPDATE operation.
>>
>> Seems reasonable to me.

>>>> Forgot to say that since this is a separate issue from the original bug
>>>> report, maybe we can first finish fixing the latter. What to do you
>>>> think?
>>>
>>> Yeah, I think we can do that, but my favorite would be to fix the two
>>> issues in a single shot, because they seem to me rather close to each
>>> other. So I updated a new version in a single patch, which I'm
>>> attaching.
>>
>> I agree that we can move to fix the other issue right away as the fix you
>> outlined above seems reasonable, but I wonder if it wouldn't be better to
>> commit the two fixes separately? The two fixes, although small, are
>> somewhat complicated and combining them in a single commit might be
>> confusing. Also, a combined commit might make it harder for the release
>> note author to list down the exact set of problems being fixed.
>
> OK, I'll separate the patch into two.

After I tried to separate the patch a bit, I changed my mind: I think we 
should commit the two issues in a single patch, because the original 
issue that overriding fmstate for the UPDATE operation mistakenly by 
fmstate for the INSERT operation caused backend crash is fixed by what I 
proposed above.  So I add the commit message to the previous single 
patch, as you suggested.

>> Some mostly cosmetic comments on the code changes:
>>
>> * In the following comment:
>>
>> + /*
>> + * If the foreign table is an UPDATE subplan resultrel that hasn't yet
>> + * been updated, routing tuples to the table might yield incorrect
>> + * results, because if routing tuples, routed tuples will be mistakenly
>> + * read from the table and updated twice when updating the table --- it
>> + * would be nice if we could handle this case; but for now, throw an
>> error
>> + * for safety.
>> + */
>>
>> the part that start with "because if routing tuples..." reads a bit
>> unclear to me. How about writing this as:
>>
>> /*
>> * If the foreign table we are about to insert routed rows into is
>> * also an UPDATE result rel and the UPDATE hasn't been performed yet,
>> * proceeding with the INSERT will result in the later UPDATE
>> * incorrectly modifying those routed rows, so prevent the INSERT ---
>> * it would be nice if we could handle this case, but for now, throw
>> * an error for safety.
>> */
>
> I think that's better than mine; will use that wording.

Done.  I simplified your wording a little bit, though.

>> I see that in all the hunks involving some manipulation of aux_fmstate,
>> there's a comment explaining what it is, which seems a bit repetitive. I
>> can see more or less the same explanation in postgresExecForeignInsert(),
>> postgresBeginForeignInsert(), and postgresEndForeignInsert(). Maybe just
>> keep the description in postgresBeginForeignInsert as follows:
>>
>> @@ -1983,7 +2020,19 @@ postgresBeginForeignInsert(ModifyTableState
>> *mtstate,
>> retrieved_attrs != NIL,
>> retrieved_attrs);
>>
>> - resultRelInfo->ri_FdwState = fmstate;
>> + /*
>> + * If the given resultRelInfo already has PgFdwModifyState set, it means
>> + * the foreign table is an UPDATE subplan resultrel; in which case,
>> store
>> + * the resulting state into the aux_fmstate of the PgFdwModifyState.
>> + */
>>
>> and change the other sites to refer to postgresBeginForeingInsert for the
>> detailed explanation of what aux_fmstate is.
>
> Good idea; will do.

Done.

Other changes:
* I updated the docs in postgres-fdw.sgml to mention the limitation.
* I did some cleanups for the regression tests.

Please find attached an updated version of the patch.

Best regards,
Etsuro Fujita

Вложения

Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Fujita-san,

On 2019/04/22 20:00, Etsuro Fujita wrote:
> (2019/04/19 14:39), Etsuro Fujita wrote:
>> (2019/04/19 13:00), Amit Langote wrote:
>>> I agree that we can move to fix the other issue right away as the fix you
>>> outlined above seems reasonable, but I wonder if it wouldn't be better to
>>> commit the two fixes separately? The two fixes, although small, are
>>> somewhat complicated and combining them in a single commit might be
>>> confusing. Also, a combined commit might make it harder for the release
>>> note author to list down the exact set of problems being fixed.
>>
>> OK, I'll separate the patch into two.
> 
> After I tried to separate the patch a bit, I changed my mind: I think we
> should commit the two issues in a single patch, because the original issue
> that overriding fmstate for the UPDATE operation mistakenly by fmstate for
> the INSERT operation caused backend crash is fixed by what I proposed
> above.  So I add the commit message to the previous single patch, as you
> suggested.

Ah, you're right.  The case that would return wrong result, that is now
prevented per the latest patch, is also the case that would crash before.
So, it seems to OK to keep this commit this as one patch.  Sorry for the
noise.

I read your commit message and it seems to sufficiently explain the issues
being fixed.  Thanks for adding me as an author, but I think the latest
patch is mostly your work, so I'm happy to be listed as just a reviewer. :)

>>> Some mostly cosmetic comments on the code changes:
>>>
>>> * In the following comment:
>>>
>>> + /*
>>> + * If the foreign table is an UPDATE subplan resultrel that hasn't yet
>>> + * been updated, routing tuples to the table might yield incorrect
>>> + * results, because if routing tuples, routed tuples will be mistakenly
>>> + * read from the table and updated twice when updating the table --- it
>>> + * would be nice if we could handle this case; but for now, throw an
>>> error
>>> + * for safety.
>>> + */
>>>
>>> the part that start with "because if routing tuples..." reads a bit
>>> unclear to me. How about writing this as:
>>>
>>> /*
>>> * If the foreign table we are about to insert routed rows into is
>>> * also an UPDATE result rel and the UPDATE hasn't been performed yet,
>>> * proceeding with the INSERT will result in the later UPDATE
>>> * incorrectly modifying those routed rows, so prevent the INSERT ---
>>> * it would be nice if we could handle this case, but for now, throw
>>> * an error for safety.
>>> */
>>
>> I think that's better than mine; will use that wording.
> 
> Done.  I simplified your wording a little bit, though.

Thanks, looks fine.

> Other changes:
> * I updated the docs in postgres-fdw.sgml to mention the limitation.

Looks fine.

> * I did some cleanups for the regression tests.

Here too.

> Please find attached an updated version of the patch.

I don't have any more comments.  Thanks for working on this.

Thanks,
Amit




Re: bug in update tuple routing with foreign partitions

От
Etsuro Fujita
Дата:
(2019/04/23 10:03), Amit Langote wrote:
> So, it seems to OK to keep this commit this as one patch.

> I read your commit message and it seems to sufficiently explain the issues
> being fixed.

Cool!

> Thanks for adding me as an author, but I think the latest
> patch is mostly your work, so I'm happy to be listed as just a reviewer. :)

You found this bug, analyzed it, and wrote the first version of the 
patch.  I heavily modified the patch, but I used your test case, so I 
think you deserve the first author of this fix.

> I don't have any more comments.  Thanks for working on this.

Pushed.  Many thanks, Amit-san!

Best regards,
Etsuro Fujita




Re: bug in update tuple routing with foreign partitions

От
Amit Langote
Дата:
Fujita-san,

On 2019/04/24 18:55, Etsuro Fujita wrote:
> (2019/04/23 10:03), Amit Langote wrote:
>> Thanks for adding me as an author, but I think the latest
>> patch is mostly your work, so I'm happy to be listed as just a reviewer. :)
> 
> You found this bug, analyzed it, and wrote the first version of the
> patch.  I heavily modified the patch, but I used your test case, so I
> think you deserve the first author of this fix.

OK, thanks.

Regards,
Amit