Обсуждение: [GENERAL] views and fdw usage and performance

Поиск
Список
Период
Сортировка

[GENERAL] views and fdw usage and performance

От
"Armand Pirvu (home)"
Дата:
Hi

I have two schemas jt1, and jt2 in the same db
In both I have the same table tbl3
The idea is to keep in sync jt1.tbl3 from jt2.tbl3 each time I have an insert/update/delete on jt2.tbl3

So I was thinking about the following cases to avoid replication

1) in jt2 rather than have the tbl3 table, have a view named tbl3 based on jt1.tbl3

2) use the postgtres fdw and in jt2 have a foreign table tbl3 to jt1.tbl3

create table tbl3 (col1 integer, col2 integer, col3 integer, primary key (col1));

insert into tbl3
select generate_series(1, 10000000), 111,222;

Questions:

q1 - Any issues with this logic ? Or any other ways to do this better ? jt2 usage in terms of concurrent users and so
onis by far much less than jt1, at least one order of magnitude less 

q2 - query performance (select * from tbl3 where col1=499123;
) . While using views it is clear cut the output of explain, on fdw not so much



explain analyze select * from jt2.tbl3 where col1=874433;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Foreign Scan on tbl3  (cost=100.00..138.66 rows=11 width=12) (actual time=0.204..0.205 rows=1 loops=1)
 Planning time: 0.043 ms
 Execution time: 0.374 ms
(3 rows)


explain analyze select * from jt1.tbl3 where col1=874433;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using tbl3_pkey on tbl3  (cost=0.43..8.45 rows=1 width=12) (actual time=0.010..0.011 rows=1 loops=1)
   Index Cond: (col1 = 874433)
 Planning time: 0.035 ms
 Execution time: 0.021 ms
(4 rows)


Do I understand correctly that the output of (explain analyze select * from jt2.tbl3 where col1=874433) is in essence
(forall practical purposes) the same as the one from (explain analyze select * from jt1.tbl3 where col1=874433;) and
nota sequential scan like the following ? 

explain analyze select * from jt1.tbl3 where col2=874433;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on tbl3  (cost=0.00..179053.25 rows=1 width=12) (actual time=498.020..498.020 rows=0 loops=1)
   Filter: (col2 = 874433)
   Rows Removed by Filter: 9999998
 Planning time: 0.030 ms
 Execution time: 498.034 ms
(5 rows)



Thanks
Armand



Re: [GENERAL] views and fdw usage and performance

От
Adrian Klaver
Дата:
On 05/09/2017 02:36 PM, Armand Pirvu (home) wrote:
> Hi
>
> I have two schemas jt1, and jt2 in the same db
> In both I have the same table tbl3
> The idea is to keep in sync jt1.tbl3 from jt2.tbl3 each time I have an insert/update/delete on jt2.tbl3
>
> So I was thinking about the following cases to avoid replication
>
> 1) in jt2 rather than have the tbl3 table, have a view named tbl3 based on jt1.tbl3
>
> 2) use the postgtres fdw and in jt2 have a foreign table tbl3 to jt1.tbl3

Why?
If it is all on the same database why not just use jt1.tbl3?

>
> create table tbl3 (col1 integer, col2 integer, col3 integer, primary key (col1));
>
> insert into tbl3
> select generate_series(1, 10000000), 111,222;
>
> Questions:
>
> q1 - Any issues with this logic ? Or any other ways to do this better ? jt2 usage in terms of concurrent users and so
onis by far much less than jt1, at least one order of magnitude less 

Connections are to the database not the schema.

>
> q2 - query performance (select * from tbl3 where col1=499123;
> ) . While using views it is clear cut the output of explain, on fdw not so much
>
>
>
> explain analyze select * from jt2.tbl3 where col1=874433;
>                                                 QUERY PLAN
> --------------------------------------------------------------------------------------------------------
>   Foreign Scan on tbl3  (cost=100.00..138.66 rows=11 width=12) (actual time=0.204..0.205 rows=1 loops=1)
>   Planning time: 0.043 ms
>   Execution time: 0.374 ms
> (3 rows)
>
>
> explain analyze select * from jt1.tbl3 where col1=874433;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>   Index Scan using tbl3_pkey on tbl3  (cost=0.43..8.45 rows=1 width=12) (actual time=0.010..0.011 rows=1 loops=1)
>     Index Cond: (col1 = 874433)
>   Planning time: 0.035 ms
>   Execution time: 0.021 ms
> (4 rows)
>
>
> Do I understand correctly that the output of (explain analyze select * from jt2.tbl3 where col1=874433) is in essence
(forall practical purposes) the same as the one from (explain analyze select * from jt1.tbl3 where col1=874433;) and
nota sequential scan like the following ? 
>
> explain analyze select * from jt1.tbl3 where col2=874433;
>                                                 QUERY PLAN
> --------------------------------------------------------------------------------------------------------
>   Seq Scan on tbl3  (cost=0.00..179053.25 rows=1 width=12) (actual time=498.020..498.020 rows=0 loops=1)
>     Filter: (col2 = 874433)
>     Rows Removed by Filter: 9999998
>   Planning time: 0.030 ms
>   Execution time: 498.034 ms
> (5 rows)
>
>
>
> Thanks
> Armand
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] views and fdw usage and performance

От
armand pirvu
Дата:
Well

Jt1 is prod and jt2 is dev
Before someone pushes to prod it does work in dev. The jdbc connection routes to jt2. In the mean time it wad needed that some tables in prod are synced at all times from dev. Hence the view/fdw.
What I meant by connections was more to say the type of load or users doing something in each schema.
So my questions still remain

Sent from my iPhone

On May 9, 2017, at 6:52 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 05/09/2017 02:36 PM, Armand Pirvu (home) wrote:
Hi
I have two schemas jt1, and jt2 in the same db
In both I have the same table tbl3
The idea is to keep in sync jt1.tbl3 from jt2.tbl3 each time I have an insert/update/delete on jt2.tbl3
So I was thinking about the following cases to avoid replication
1) in jt2 rather than have the tbl3 table, have a view named tbl3 based on jt1.tbl3
2) use the postgtres fdw and in jt2 have a foreign table tbl3 to jt1.tbl3

Why?
If it is all on the same database why not just use jt1.tbl3?

create table tbl3 (col1 integer, col2 integer, col3 integer, primary key (col1));
insert into tbl3
select generate_series(1, 10000000), 111,222;
Questions:
q1 - Any issues with this logic ? Or any other ways to do this better ? jt2 usage in terms of concurrent users and so on is by far much less than jt1, at least one order of magnitude less

Connections are to the database not the schema.

q2 - query performance (select * from tbl3 where col1=499123;
) . While using views it is clear cut the output of explain, on fdw not so much
explain analyze select * from jt2.tbl3 where col1=874433;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Foreign Scan on tbl3  (cost=100.00..138.66 rows=11 width=12) (actual time=0.204..0.205 rows=1 loops=1)
 Planning time: 0.043 ms
 Execution time: 0.374 ms
(3 rows)
explain analyze select * from jt1.tbl3 where col1=874433;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Index Scan using tbl3_pkey on tbl3  (cost=0.43..8.45 rows=1 width=12) (actual time=0.010..0.011 rows=1 loops=1)
   Index Cond: (col1 = 874433)
 Planning time: 0.035 ms
 Execution time: 0.021 ms
(4 rows)
Do I understand correctly that the output of (explain analyze select * from jt2.tbl3 where col1=874433) is in essence (for all practical purposes) the same as the one from (explain analyze select * from jt1.tbl3 where col1=874433;) and not a sequential scan like the following ?
explain analyze select * from jt1.tbl3 where col2=874433;
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on tbl3  (cost=0.00..179053.25 rows=1 width=12) (actual time=498.020..498.020 rows=0 loops=1)
   Filter: (col2 = 874433)
   Rows Removed by Filter: 9999998
 Planning time: 0.030 ms
 Execution time: 498.034 ms
(5 rows)
Thanks
Armand


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: [GENERAL] views and fdw usage and performance

От
Adrian Klaver
Дата:
On 05/09/2017 05:02 PM, armand pirvu wrote:
> Well
>
> Jt1 is prod and jt2 is dev

You are talking schemas, not databases, correct?


> Before someone pushes to prod it does work in dev. The jdbc connection

That would concern me, as anything bad that happened in the dev schema
could bring the entire database to its knees, including the prod schema.

How does data get into the prod schema if the connection is to the dev
schema?

> routes to jt2. In the mean time it wad needed that some tables in prod
> are synced at all times from dev. Hence the view/fdw.
> What I meant by connections was more to say the type of load or users
> doing something in each schema.

The issue being that if you are pushing data from jt2 --> jt1 you are
also pushing the load in the same direction.

> So my questions still remain
>
> Sent from my iPhone
>
> On May 9, 2017, at 6:52 PM, Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>

--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] views and fdw usage and performance

От
"Armand Pirvu (home)"
Дата:
On May 9, 2017, at 7:11 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

> On 05/09/2017 05:02 PM, armand pirvu wrote:
>> Well
>> Jt1 is prod and jt2 is dev
>
> You are talking schemas, not databases, correct?
>
>

Correct

>> Before someone pushes to prod it does work in dev. The jdbc connection
>
> That would concern me, as anything bad that happened in the dev schema could bring the entire database to its knees,
includingthe prod schema. 
>
> How does data get into the prod schema if the connection is to the dev schema?

If you are a user in say category  B you get to dev where you do your thing. If you deem okay you push to prod.
If you are a user in say category A you get to prod



>
>> routes to jt2. In the mean time it wad needed that some tables in prod are synced at all times from dev. Hence the
view/fdw.
>> What I meant by connections was more to say the type of load or users doing something in each schema.
>
> The issue being that if you are pushing data from jt2 --> jt1 you are also pushing the load in the same direction.

I see but short of using something like Slony in between the  two schemas I don’t see a pretty simple choice


>
>> So my questions still remain

And about the plan from the fdw am I right or wrong ? I am inclined to say I am right based on the numbers in the
timings


>> Sent from my iPhone
>> On May 9, 2017, at 6:52 PM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com



Re: [GENERAL] views and fdw usage and performance

От
Adrian Klaver
Дата:
On 05/09/2017 07:03 PM, Armand Pirvu (home) wrote:
>
> On May 9, 2017, at 7:11 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
>> On 05/09/2017 05:02 PM, armand pirvu wrote:
>>> Well
>>> Jt1 is prod and jt2 is dev
>>
>> You are talking schemas, not databases, correct?
>>
>>
>
> Correct
>
>>> Before someone pushes to prod it does work in dev. The jdbc connection
>>
>> That would concern me, as anything bad that happened in the dev schema could bring the entire database to its knees,
includingthe prod schema. 
>>
>> How does data get into the prod schema if the connection is to the dev schema?
>
> If you are a user in say category  B you get to dev where you do your thing. If you deem okay you push to prod.
> If you are a user in say category A you get to prod
>
>
>
>>
>>> routes to jt2. In the mean time it wad needed that some tables in prod are synced at all times from dev. Hence the
view/fdw.
>>> What I meant by connections was more to say the type of load or users doing something in each schema.
>>
>> The issue being that if you are pushing data from jt2 --> jt1 you are also pushing the load in the same direction.
>
> I see but short of using something like Slony in between the  two schemas I don’t see a pretty simple choice

Create a separate dev database or cluster?

>
>
>>
>>> So my questions still remain
>
> And about the plan from the fdw am I right or wrong ? I am inclined to say I am right based on the numbers in the
timings

The timings where for a simple select. I am going to say doing something
that INSERTs/UPDATEs is going to be different. Still since a FDW
involves establishing another connection and passing data across it I
would say it is going to be less efficient.

>
>
>>> Sent from my iPhone
>>> On May 9, 2017, at 6:52 PM, Adrian Klaver <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com