Re: Optimization for updating foreign tables in Postgres FDW

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Optimization for updating foreign tables in Postgres FDW
Дата
Msg-id 20150421.171247.127316705.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Optimization for updating foreign tables in Postgres FDW  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Список pgsql-hackers
Hi, thank you. My understanding became a bit clearer.

At Tue, 21 Apr 2015 15:35:41 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5535EFBD.8030006@lab.ntt.co.jp>
> On 2015/04/21 10:07, Kyotaro HORIGUCHI wrote:
> > At Mon, 20 Apr 2015 16:40:52 +0900, Etsuro Fujita
> > <fujita.etsuro@lab.ntt.co.jp> wrote in
> > <5534AD84.3020501@lab.ntt.co.jp>
> >> However, I'd
> >> like to propose to rename "Foreign Update" ("Foreign Delete") of
> >> ModifyTable simply to "Update" ("Delete") not only because (1) that
> >> solves the duplication problem but also because (2) ISTM that is
> >> consistent with the non-inherited updates in both of the
> >> non-pushed-down-update case and the pushed-down-update case.  Here are
> >> examples for (2).
> 
> > Update node without "Foreign" that runs "Remote SQL" looks to me
> > somewhat unusual..
> 
> I think that has a similarity with the existing EXPLAIN outputs for
> non-inherited non-pushed-down updates, as shown in the below exaple.
> 
> >> postgres=# explain verbose update ft1 set c1 = trunc(random() * 9 +
> >> 1)::int;
> >>                                                QUERY PLAN
> >>
> ------------------------------------------------------------------------------------------------------
> >>   Update on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
> >>     Remote SQL: UPDATE public.t1 SET c1 = $2 WHERE ctid = $1
> >>     ->  Foreign Scan on public.ft1  (cost=100.00..226.03 rows=2730 width=6)
> >>           Output: (trunc(((random() * '9'::double precision) +
> >> '1'::double precision)))::integer, ctid
> >>           Remote SQL: SELECT ctid FROM public.t1 FOR UPDATE
> >> (5 rows)

Mmm.. It also looks confusing which needs to be fixed. Now
foreign tables are updated in two ways. One is ModifyTable on
foreign relation and the another is ForeignScan node of update
operation. Though I think that I understand the path to this
form, but I suppose they should confluent into one type of node,
perhaps ForegnScan node. Even if it is hardly archievable for
now, explain representation should be uniform.

Making ModifyTable on foreign relation have the representation
"Foreign Update", the explain results of the queries modifying
foreign tables are looks like,

Foreign Update on public.ft1 (...) Remote SQL: UPDATE public.t1 .... -> Foreign Scan on public.ft1...

Foreign Update on public.ft1 (... Foreign Update on public.ft1 (...   Remote SQL: ...

If Foreign Update has only one internal representation, the two
same Foreign Updates are (ideally) easily eliminated during
planning and explain would naturally shows the following result.

Foreign Update on public.ft1 (... Remote SQL: ...

But if not as is currently so, printing the result needs a bit
complicated calculation.


> > |  Update on public.p  (cost=0.00..360.08 rows=4311 width=14)
> > |    Update on public.p
> > |    ->  Seq Scan on public.p  (cost=0.00..0.00 rows=1 width=14)
> > |          Output: p.a, (p.b + 1), p.ctid
> > |    ->  Foreign Update on public.ft1 (cost=100.00..180.04 rows=2155
> > |    ->  width=14)
> > |          Remote SQL: UPDATE public.t1 SET b = (b + 1)
> > |    ->  Foreign Update on public.ft2 (cost=100.00..180.04 rows=2155
> > |    ->  width=14)
> > |          Remote SQL: UPDATE public.t2 SET b = (b + 1)
> 
> On that point, I agree with Tom that that would cause the problem that
> the user has to guess at which of the child plans goes with which
> target relation of ModifyTable [1].
> 
> [1]
> http://www.postgresql.org/message-id/22505.1426986174@sss.pgh.pa.us

Yeah, that seems to make the plan to be understood
clerer. Combining Tom's suggestion and my suggestion together
would result in the following output of explain.

Update on public.p  (cost=0.00..360.08 rows=4311 width=14) Update on public.p ->  Seq Scan on public.p
(cost=0.00..0.00rows=1 width=14)       Output: p.a, (p.b + 1), p.ctid Foreign Update on public.ft1 (cost=100.00..180.04
rows=2155width=14)    Remote SQL: UPDATE public.t1 SET b = (b + 1) Foreign Update on public.ft2 (cost=100.00..180.04
rows=2155 width=14)    Remote SQL: UPDATE public.t2 SET b = (b + 1)
 

And when not pushed down it would look like,

Update on public.p  (cost=0.00..360.08 rows=4311 width=14) Update on public.p   ->  Seq Scan on public.p
(cost=0.00..0.00rows=1 width=14)         Output: p.a, (p.b + 1), p.ctid Foreign Update on public.ft1
(cost=100.00..180.04rows=2155 width=14)    Remote SQL: UPDATE public.t1 SET b = $2 WHERE ctid = $1    -> Foreign Scan
onpublic.ft1 (cost=....)         Output: a, b, ctid         Remote SQL: SELECT a, ctid FROM public.t1 FOR UPDATE
ForeignUpdate on public.ft2 (cost=100.00..180.04 rows=2155  width=14)    Remote SQL: UPDATE public.t2 SET b = (b + 1)
-> Foreign Scan on public.ft2 (cost=....)         Output: a, b, ctid         Remote SQL: SELECT a, ctid FROM public.t2
FORUPDATE
 

These looks quite reasonable *for me* :)

Of course, the same discussion is applicable on Foreign Delete.

What do you think about this?

Any further thoughts?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Replication identifiers, take 4
Следующее
От: David Rowley
Дата:
Сообщение: Re: Parallel Seq Scan