Discussion: Performance issues with custom functions
I currently have an infrastructure based around SQL Server 2000. I'm trying to move some of the data over to Postgres, partly to reduce the load on the SQL Server machine and partly because some queries I'd like to run are too slow to be usable on SQL Server. Most likely, more and more data will move to Postgres over time. To help with this transition, I created a Postgres plugin that queries the contents of SQL Server tables via ODBC and returns a recordset. I then create views around the function, so I can read from the SQL Server tables as if they were local to Postgres.

Four tables are involved in this query. The main one is provider_location, which has about 2 million rows and is stored in Postgres. The other three, network, network_state, and xlat_tbl, are stored in SQL Server and accessed via views as described above; they contain about 40, 250, and 500 rows respectively. A simple select * from any of the views takes around 50ms.

The query in question was written for SQL Server. I have no idea why it was written in the form it was, but it ran at a reasonable speed when all the tables were on one machine. Running the original query (after adjusting for syntax differences) on Postgres produced a query that ran for hours, continually allocating more RAM; I eventually had to kill the process as it was devouring swap space. My assumption is that Postgres is executing the ODBC query for each row of a join somewhere, even though the function is marked stable (immutable didn't make a difference). Flattening the query made it run in a few minutes.

I think the flattened query is easier to read, and it runs faster, so I'm not complaining that I can't use the original query. But I'd like to know exactly what causes the bottleneck in the original query, and whether there are other approaches to solving the issue in case I need them in future queries.
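If the planner is indeed re-running the ODBC-backed function, one workaround is to snapshot the small remote tables into local temporary tables before running the big query. This is a sketch only: it assumes the development.* views described above, and it is only safe when slightly stale copies of the remote data are acceptable.

```sql
-- Sketch: materialize each small SQL Server table locally once,
-- so the big join reads plain local rows instead of repeatedly
-- scanning the ODBC-backed function.
create temporary table tmp_network_state as
  select * from development.network_state;

-- Give the planner real row counts for the snapshot.
analyze tmp_network_state;

-- The big query would then reference tmp_network_state in place of
-- development.network_state (and likewise for network and xlat_tbl).
```

Since the three remote tables hold only a few hundred rows each, the one-time copy costs roughly the same as a single select * through the view.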
Below are the original query, its explain output, the modified query, its explain output, and the explain analyze output.

Ed

Original query:

select pl.network_id, n.name as network_name, pl.state_cd,
       count(pl.state_cd) as provider_count
from development.provider_location pl, development.network n
where pl.network_id in (select ns.network_id
                        from development.network_state ns
                        where ns.from_date < current_time
                          and (ns.thru_date > current_time or ns.thru_date is null)
                          and (ns.state_cd = pl.state_cd or ns.state_cd = ''))
  and pl.network_id = n.network_id
  and pl.state_cd is not null
  and pl.state_cd in (select field_value
                      from development.xlat_tbl
                      where field_name = 'State_CD')
group by pl.state_cd, n.name, pl.network_id
order by pl.state_cd, network_name;

Explain:

GroupAggregate  (cost=80548547.83..80549256.80 rows=47265 width=52)
  ->  Sort  (cost=80548547.83..80548665.99 rows=47265 width=52)
        Sort Key: pl.state_cd, odbc_select.name, pl.network_id
        ->  Hash Join  (cost=30.01..80543806.14 rows=47265 width=52)
              Hash Cond: (("outer".network_id)::text = ("inner".network_id)::text)
              ->  Hash IN Join  (cost=15.01..80540931.61 rows=9453 width=20)
                    Hash Cond: (("outer".state_cd)::text = ("inner".field_value)::text)
                    ->  Seq Scan on provider_location pl  (cost=0.00..80535150.29 rows=1890593 width=20)
                          Filter: ((state_cd IS NOT NULL) AND (subplan))
                          SubPlan
                            ->  Function Scan on odbc_select  (cost=0.00..42.50 rows=2 width=32)
                                  Filter: (((from_date)::text < (('now'::text)::time(6) with time zone)::text) AND (((thru_date)::text > (('now'::text)::time(6) with time zone)::text) OR (thru_date IS NULL)) AND (((state_cd)::text = ($0)::text) OR ((state_cd)::text = ''::text)))
                    ->  Hash  (cost=15.00..15.00 rows=5 width=32)
                          ->  Function Scan on odbc_select  (cost=0.00..15.00 rows=5 width=32)
                                Filter: ((field_name)::text = 'State_CD'::text)
              ->  Hash  (cost=12.50..12.50 rows=1000 width=64)
                    ->  Function Scan on odbc_select  (cost=0.00..12.50 rows=1000 width=64)

Flattened query:

select pl.network_id, n.name as network_name, pl.state_cd,
       count(pl.state_cd) as provider_count
from development.network n,
     development.network_state ns,
     development.xlat_tbl xt,
     development.provider_location pl
where xt.field_name = 'State_CD'
  and n.network_id = ns.network_id
  and ns.from_date < current_timestamp
  and (ns.thru_date > current_timestamp or ns.thru_date is null)
  and (ns.state_cd = pl.state_cd or ns.state_cd = '')
  and pl.network_id = n.network_id
  and pl.state_cd is not null
  and pl.state_cd = xt.field_value
group by pl.state_cd, n.name, pl.network_id
order by pl.state_cd, network_name;

Explain:

GroupAggregate  (cost=190089.94..190129.90 rows=2664 width=52)
  ->  Sort  (cost=190089.94..190096.60 rows=2664 width=52)
        Sort Key: pl.state_cd, odbc_select.name, pl.network_id
        ->  Merge Join  (cost=189895.73..189938.37 rows=2664 width=52)
              Merge Cond: ("outer"."?column4?" = "inner"."?column3?")
              ->  Sort  (cost=189833.40..189834.73 rows=533 width=52)
                    Sort Key: (pl.network_id)::text
                    ->  Hash Join  (cost=42.80..189809.26 rows=533 width=52)
                          Hash Cond: (("outer".network_id)::text = ("inner".network_id)::text)
                          Join Filter: ((("inner".state_cd)::text = ("outer".state_cd)::text) OR (("inner".state_cd)::text = ''::text))
                          ->  Hash Join  (cost=15.01..185908.10 rows=94530 width=20)
                                Hash Cond: (("outer".state_cd)::text = ("inner".field_value)::text)
                                ->  Seq Scan on provider_location pl  (cost=0.00..166041.86 rows=3781186 width=20)
                                      Filter: (state_cd IS NOT NULL)
                                ->  Hash  (cost=15.00..15.00 rows=5 width=32)
                                      ->  Function Scan on odbc_select  (cost=0.00..15.00 rows=5 width=32)
                                            Filter: ((field_name)::text = 'State_CD'::text)
                          ->  Hash  (cost=27.50..27.50 rows=113 width=64)
                                ->  Function Scan on odbc_select  (cost=0.00..27.50 rows=113 width=64)
                                      Filter: ((from_date < ('now'::text)::timestamp(6) with time zone) AND ((thru_date > ('now'::text)::timestamp(6) with time zone) OR (thru_date IS NULL)))
              ->  Sort  (cost=62.33..64.83 rows=1000 width=64)
                    Sort Key: (odbc_select.network_id)::text
                    ->  Function Scan on odbc_select  (cost=0.00..12.50 rows=1000 width=64)

Explain Analyze:

GroupAggregate  (cost=190089.94..190129.90 rows=2664 width=52) (actual time=254757.742..261725.786 rows=350 loops=1)
  ->  Sort  (cost=190089.94..190096.60 rows=2664 width=52) (actual time=254757.438..257267.224 rows=1316774 loops=1)
        Sort Key: pl.state_cd, odbc_select.name, pl.network_id
        ->  Merge Join  (cost=189895.73..189938.37 rows=2664 width=52) (actual time=189325.877..203579.050 rows=1316774 loops=1)
              Merge Cond: ("outer"."?column4?" = "inner"."?column3?")
              ->  Sort  (cost=189833.40..189834.73 rows=533 width=52) (actual time=189282.504..192284.766 rows=1316774 loops=1)
                    Sort Key: (pl.network_id)::text
                    ->  Hash Join  (cost=42.80..189809.26 rows=533 width=52) (actual time=1177.758..151180.472 rows=1316774 loops=1)
                          Hash Cond: (("outer".network_id)::text = ("inner".network_id)::text)
                          Join Filter: ((("inner".state_cd)::text = ("outer".state_cd)::text) OR (("inner".state_cd)::text = ''::text))
                          ->  Hash Join  (cost=15.01..185908.10 rows=94530 width=20) (actual time=1095.949..50495.766 rows=1890457 loops=1)
                                Hash Cond: (("outer".state_cd)::text = ("inner".field_value)::text)
                                ->  Seq Scan on provider_location pl  (cost=0.00..166041.86 rows=3781186 width=20) (actual time=1071.011..36224.961 rows=1891183 loops=1)
                                      Filter: (state_cd IS NOT NULL)
                                ->  Hash  (cost=15.00..15.00 rows=5 width=32) (actual time=24.832..24.832 rows=0 loops=1)
                                      ->  Function Scan on odbc_select  (cost=0.00..15.00 rows=5 width=32) (actual time=24.469..24.724 rows=51 loops=1)
                                            Filter: ((field_name)::text = 'State_CD'::text)
                          ->  Hash  (cost=27.50..27.50 rows=113 width=64) (actual time=81.684..81.684 rows=0 loops=1)
                                ->  Function Scan on odbc_select  (cost=0.00..27.50 rows=113 width=64) (actual time=75.288..81.200 rows=211 loops=1)
                                      Filter: ((from_date < ('now'::text)::timestamp(6) with time zone) AND ((thru_date > ('now'::text)::timestamp(6) with time zone) OR (thru_date IS NULL)))
              ->  Sort  (cost=62.33..64.83 rows=1000 width=64) (actual time=43.301..1258.901 rows=1289952 loops=1)
                    Sort Key: (odbc_select.network_id)::text
                    ->  Function Scan on odbc_select  (cost=0.00..12.50 rows=1000 width=64) (actual time=43.010..43.109 rows=34 loops=1)
Total runtime: 261902.966 ms
"Edward Di Geronimo Jr." <edigeronimo@xtracards.com> writes: > ... I'd like to know exactly what causes > the bottleneck in the original query, and if there are other approaches > to solving the issue in case I need them in future queries. This is fairly hard to read ... it would help a lot if you had shown the view definitions that the query relies on, so that we could match up the plan elements with the query a bit better. However, I'm thinking the problem is with this IN clause: > where pl.network_id in (select ns.network_id > from development.network_state ns > where ns.from_date < current_time > and (ns.thru_date > current_time or > ns.thru_date is null) > and (ns.state_cd = pl.state_cd or ns.state_cd='') > ) Because the sub-SELECT references pl.state_cd (an outer variable reference), there's no chance of optimizing this into a join-style IN. So the sub-SELECT has to be re-executed for each row of the outer query. BTW, it's not apparent to me that your "flattened" query gives the same answers as the original. What if a pl row can join to more than one row of the ns output? regards, tom lane
Tom Lane wrote:
> This is fairly hard to read ... it would help a lot if you had shown the view definitions that the query relies on, so that we could match up the plan elements with the query a bit better.

I wasn't sure how helpful it would be. Here they are:
create view development.network as
select * from odbc_select('amsterdam', 'bob.dbo.network') as (
network_id varchar ,
status_cd varchar ,
name varchar ,
network_action varchar ,
physical_type_cd varchar ,
service_type_cd varchar ,
parent_network_id varchar ,
commission_network_id varchar ,
rep_id varchar ,
tax_id varchar ,
url varchar ,
entry_method_cd varchar ,
entry_individual_type_cd varchar ,
entry_individual_id varchar ,
service varchar (30),
cost_routine varchar (150),
commission_rate numeric(5, 5) ,
directory_number varchar (11),
search_url varchar (200),
member_rate numeric(15, 2) ,
free_months numeric(18, 0) ,
eligibility_hound varchar (60)
)
create view development.network_state as
select * from odbc_select('amsterdam', 'bob.dbo.network_state') as (
network_id varchar,
state_cd varchar,
product varchar (100) ,
status_cd varchar,
entry_method_cd varchar,
entry_individual_type_cd varchar,
entry_individual_id varchar,
logo_id int ,
from_date timestamp ,
thru_date timestamp
)
create view development.xlat_tbl as
select * from odbc_select('amsterdam', 'xlat_tbl') as (
field_name varchar ,
field_value varchar ,
status_cd varchar ,
descr varchar ,
descrshort varchar ,
entry_method_cd varchar ,
entry_individual_type_cd varchar ,
entry_individual_id varchar
)
> However, I'm thinking the problem is with this IN clause:
>
> where pl.network_id in (select ns.network_id
>                         from development.network_state ns
>                         where ns.from_date < current_time
>                           and (ns.thru_date > current_time or ns.thru_date is null)
>                           and (ns.state_cd = pl.state_cd or ns.state_cd=''))
>
> Because the sub-SELECT references pl.state_cd (an outer variable reference), there's no chance of optimizing this into a join-style IN. So the sub-SELECT has to be re-executed for each row of the outer query.
>
> BTW, it's not apparent to me that your "flattened" query gives the same answers as the original. What if a pl row can join to more than one row of the ns output?

Well, I guess you are right. As far as the database can tell, the queries aren't the same; in practice, they are. network_state essentially tracks our contract dates with different discount healthcare networks. from_date and thru_date track the timeframe during which we use that network, with thru_date being null for current networks. Some networks cover all states, in which case state_cd is an empty string; otherwise, there is a row per state covered. I can't think of any way to enforce data integrity on this other than maybe via triggers. Is there any way to make things clearer to the database (both in general and on the Postgres end of this)? At the moment, the SQL Server table has its primary key defined as (network_id, state_cd, product), which is fine for now, but I'm realizing it could become an issue if we ever stopped using a network in a state and then went back to it.
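On the integrity question: if network_state ever moves into Postgres as a real table, modern PostgreSQL (9.2 or later, well past the versions discussed in this thread) can state the "no overlapping coverage windows per network/state/product" rule declaratively rather than via triggers. A sketch under that assumption:

```sql
-- Hypothetical sketch: only applies once network_state is a native
-- Postgres table. btree_gist lets the equality columns participate
-- in a GiST exclusion constraint alongside the range-overlap test.
create extension if not exists btree_gist;

alter table network_state
  add constraint network_state_no_overlap
  exclude using gist (
    network_id with =,
    state_cd   with =,
    product    with =,
    tsrange(from_date, coalesce(thru_date, 'infinity'::timestamp)) with &&
  );
```

This also removes the primary-key problem mentioned above: stopping and later resuming a network in a state becomes two non-overlapping rows rather than a key conflict.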
I guess the next question is: is there any way I can give Postgres hints about what constraints exist on the data in these views?
Ed
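[Editor's note: PostgreSQL has no mechanism for declaring constraints on the output of a function scan, but releases from 8.3 onward do let you attach planner estimates to a set-returning function, which addresses the default 1000-row guess visible in the plans above. A sketch, assuming odbc_select takes two text arguments as in the view definitions:]

```sql
-- Sketch: planner hints, not constraints. ROWS replaces the default
-- 1000-row estimate for function scans; COST can reflect the
-- expense of the ODBC round-trip.
alter function odbc_select(text, text) rows 250;
alter function odbc_select(text, text) cost 10000;
```

With realistic row and cost estimates, the planner is far more likely to hash the small remote tables once rather than place them on the inner side of a repeated scan.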