Обсуждение: [ADMIN] foreign tables query performance using postgres_fdw

Поиск
Список
Период
Сортировка

[ADMIN] foreign tables query performance using postgres_fdw

От
armand pirvu
Дата:
Hi

I apologize beore hand if it is in the wrong list and if please advise which one to post in.

I am facing fhe following issue and trying to understand what is wrong. My setup below

1 -
CREATE EXTENSION postgres_fdw;
2 -
CREATE SERVER birst_levreg
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '172.16.26.4', dbname 'birstdb');

CREATE USER MAPPING FOR csidba
SERVER birst_levreg
OPTIONS (user 'csidba', password 'secretpwd');
CREATE FOREIGN TABLE dim_item
(   show_id character varying(100),   client_id integer,   item_id character varying(100) NOT NULL,   item_type
charactervarying(100),   item_code character varying(100),   item_name character varying(250),   item_group character
varying(100),  location_id character varying(100),   session_start_date timestamp without time zone,   session_end_date
timestampwithout time zone,   credit_hours numeric,   total_allotted integer,   update_datetime timestamp without time
zone,  is_deleted character varying(10),   item_source character varying(1) 
)
SERVER birst_levreg OPTIONS (table_name 'dim_item');
grant select,update,delete,insert on  dim_item_birst to public;

analyze dim_item;

On remote server I already have

CREATE TABLE dim_item (   show_id character varying(100),   client_id integer,   item_id character varying(100) NOT
NULL,  item_type character varying(100),   item_code character varying(100),   item_name character varying(250),
item_groupcharacter varying(100),   location_id character varying(100),   session_start_date timestamp without time
zone,  session_end_date timestamp without time zone,   credit_hours numeric,   total_allotted integer,
update_datetimetimestamp without time zone,   is_deleted character varying(10),   item_source character varying(1) 
);
ALTER TABLE ONLY dim_item   ADD CONSTRAINT dim_item_pkey PRIMARY KEY (item_id);
CREATE INDEX dim_item_idx ON dim_item USING btree (client_id, update_datetime);


3 -
on remote server
explain analyze select * from dim_item where item_id='156GIEPE14CX-B';Index Scan using dim_item_pkey on dim_item
(cost=0.42..2.44rows=1 width=157) (actual time=0.134..0.135 rows=1 loops=1)  Index Cond: ((item_id)::text =
'156GIEPE14CX-B'::text)Planningtime: 1.836 msExecution time: 0.333 ms 

on local server-foreign table

explain analyze select * from dim_item where item_id='156GIEPE14CX-B';Foreign Scan on dim_item_birst
(cost=100.00..105.53rows=1 width=1934) (actual time=1.049..1.050 rows=1 loops=1)Planning time: 0.226 msExecution time:
1.626ms 

Even if I use use_remote_estimate 'true' in the server and foreign table definition the same result

I did also
explain (analyze,verbose)  select * from dim_item where item_id='156GIEPE14CX-B';
                                                                     QUERY PLAN
                                                                                                

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------------------Foreign
Scanon csischema.dim_item_birst  (cost=100.00..2731.33 rows=1 width=157) (actual time=1.053..1.054 rows=1 loops=1)
Output:show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date,
session_end_date,credit_hours, total_allotted, upda 
te_datetime, is_deleted, item_source  Remote SQL: SELECT show_id, client_id, item_id, item_type, item_code, item_name,
item_group,location_id, session_start_date, session_end_date, credit_hours, total_all 
otted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE ((item_id =
'156GIEPE14CX-B'::text))Planningtime: 0.222 msExecution time: 1.842 ms 
(5 rows)

And on remote server

explain analyze
SELECT show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date,
session_end_date, credit_hours, total_allotted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE
((item_id= '156GIEPE14CX-B'::text)) 

--------------------------------------------------------------------------------------------------------------------------Index
Scanusing dim_item_pkey on dim_item  (cost=0.42..2.44 rows=1 width=157) (actual time=0.146..0.147 rows=1 loops=1)
IndexCond: ((item_id)::text = '156GIEPE14CX-B'::text)Planning time: 1.949 msExecution time: 0.456 ms 
(4 rows)


So what am I missing ? Why do I get a scan using foreign table as opposed to a pkey scan ?
Shall I understand that in fact the plan I am seing on the local server I should just ignore it, since the processing
isin fact done on the remote(foreign) server and the difference between costs is nothing else but the overhead in
between? 

From the manual I got this

"When use_remote_estimate is true, postgres_fdw obtains row count and cost estimates from the remote server and then
addsfdw_startup_cost and fdw_tuple_cost to the cost estimates. When use_remote_estimate is false, postgres_fdw performs
localrow count and cost estimation and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. This local
estimationis unlikely to be very accurate unless local copies of the remote table's statistics are available. Running
ANALYZEon the foreign table is the way to update the local statistics; this will perform a scan of the remote table and
thencalculate and store statistics just as though the table were local. Keeping local statistics can be a useful way to
reduceper-query planning overhead for a remote table — but if the remote table is frequently updated, the local
statisticswill soon be obsolete. 
"


Many thanks
Armand




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [ADMIN] foreign tables query performance using postgres_fdw

От
Laurenz Albe
Дата:
armand pirvu wrote:
> I am facing fhe following issue and trying to understand what is wrong. My setup below
> 
> CREATE EXTENSION postgres_fdw;
>   
> CREATE FOREIGN TABLE dim_item
> (
[...]
>     item_id character varying(100) NOT NULL,
[...]
> )
> SERVER birst_levreg OPTIONS (table_name 'dim_item');
> grant select,update,delete,insert on  dim_item_birst to public; 
> 
> analyze dim_item;
> 
> On remote server I already have
> 
> CREATE TABLE dim_item (
>     item_id character varying(100) NOT NULL,
> );
> ALTER TABLE ONLY dim_item
>     ADD CONSTRAINT dim_item_pkey PRIMARY KEY (item_id);
> CREATE INDEX dim_item_idx ON dim_item USING btree (client_id, update_datetime);
> 
> 
> on remote server
> explain analyze select * from dim_item where item_id='156GIEPE14CX-B';
>  Index Scan using dim_item_pkey on dim_item  (cost=0.42..2.44 rows=1 width=157) (actual time=0.134..0.135 rows=1
loops=1)
>    Index Cond: ((item_id)::text = '156GIEPE14CX-B'::text)
>  Planning time: 1.836 ms
>  Execution time: 0.333 ms
> 
> on local server-foreign table
> explain (analyze,verbose)  select * from dim_item where item_id='156GIEPE14CX-B';
>                                                                                                               QUERY
PLAN                      
 
>                                                                                                                    
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------------------------
>  Foreign Scan on csischema.dim_item_birst  (cost=100.00..2731.33 rows=1 width=157) (actual time=1.053..1.054 rows=1
loops=1)
>    Output: show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date,
session_end_date,credit_hours, total_allotted, upda
 
> te_datetime, is_deleted, item_source
>    Remote SQL: SELECT show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id,
session_start_date,session_end_date, credit_hours, total_all
 
> otted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE ((item_id = '156GIEPE14CX-B'::text))
>  Planning time: 0.222 ms
>  Execution time: 1.842 ms
> (5 rows)
[...]
> So what am I missing ? Why do I get a scan using foreign table as opposed to a pkey scan ?

Everything is in perfect order, and the index will be used (you see
that the execution time is low).

A foreign scan is different from a sequential scan.
It does not describe *how* the query is executed on the foreign server, it
only indicates *that* a query is executed on the foreign server.
Since you can see the WHERE condition in remote query inthe EXPLAIN (VERBOSE)
output, it is pushed down to the foreign server, and there is every reason to
assume that an index scan will be used there.

Yours,
Laurenz Albe


-- 
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [ADMIN] foreign tables query performance using postgres_fdw

От
armand pirvu
Дата:
Great explanation. Thank you so much Laurenz


Armand
> On Nov 16, 2017, at 1:13 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> armand pirvu wrote:
>> I am facing fhe following issue and trying to understand what is wrong. My setup below
>>
>> CREATE EXTENSION postgres_fdw;
>>
>> CREATE FOREIGN TABLE dim_item
>> (
> [...]
>>    item_id character varying(100) NOT NULL,
> [...]
>> )
>> SERVER birst_levreg OPTIONS (table_name 'dim_item');
>> grant select,update,delete,insert on  dim_item_birst to public;
>>
>> analyze dim_item;
>>
>> On remote server I already have
>>
>> CREATE TABLE dim_item (
>>    item_id character varying(100) NOT NULL,
>> );
>> ALTER TABLE ONLY dim_item
>>    ADD CONSTRAINT dim_item_pkey PRIMARY KEY (item_id);
>> CREATE INDEX dim_item_idx ON dim_item USING btree (client_id, update_datetime);
>>
>>
>> on remote server
>> explain analyze select * from dim_item where item_id='156GIEPE14CX-B';
>> Index Scan using dim_item_pkey on dim_item  (cost=0.42..2.44 rows=1 width=157) (actual time=0.134..0.135 rows=1
loops=1)
>>   Index Cond: ((item_id)::text = '156GIEPE14CX-B'::text)
>> Planning time: 1.836 ms
>> Execution time: 0.333 ms
>>
>> on local server-foreign table
>> explain (analyze,verbose)  select * from dim_item where item_id='156GIEPE14CX-B';
>>                                                                                                              QUERY
PLAN                       
>>
>>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> -------------------------------------------------------------------------------------------------------------------
>> Foreign Scan on csischema.dim_item_birst  (cost=100.00..2731.33 rows=1 width=157) (actual time=1.053..1.054 rows=1
loops=1)
>>   Output: show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id, session_start_date,
session_end_date,credit_hours, total_allotted, upda 
>> te_datetime, is_deleted, item_source
>>   Remote SQL: SELECT show_id, client_id, item_id, item_type, item_code, item_name, item_group, location_id,
session_start_date,session_end_date, credit_hours, total_all 
>> otted, update_datetime, is_deleted, item_source FROM csischema.dim_item WHERE ((item_id = '156GIEPE14CX-B'::text))
>> Planning time: 0.222 ms
>> Execution time: 1.842 ms
>> (5 rows)
> [...]
>> So what am I missing ? Why do I get a scan using foreign table as opposed to a pkey scan ?
>
> Everything is in perfect order, and the index will be used (you see
> that the execution time is low).
>
> A foreign scan is different from a sequential scan.
> It does not describe *how* the query is executed on the foreign server, it
> only indicates *that* a query is executed on the foreign server.
> Since you can see the WHERE condition in remote query inthe EXPLAIN (VERBOSE)
> output, it is pushed down to the foreign server, and there is every reason to
> assume that an index scan will be used there.
>
> Yours,
> Laurenz Albe



--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin