Обсуждение: performance change from 8.3.1 to later releases

Поиск
Список
Период
Сортировка

performance change from 8.3.1 to later releases

От
Roger Ging
Дата:
Hi,

I have access to servers running 8.3.1, 8.3.8, 8.4.2 and 8.4.3.  I have
noticed that on the 8.4.* versions, a lot of our code is either taking
much longer to complete, or never completing.  I think I have isolated
the problem to queries using in(), not in() or not exists().  I've put
together a test case with one particular query that demonstrates the
problem.

select count(*) from traderhank.vendor_catalog = 147,352

select count(*) from traderhank.xc_products = 8,610

The sub query (select vc1.th_sku from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null
group by vc1.th_sku
)  yields 54,390 rows

The sub query (select vc_th_Sku from traderhank.xc_products where
vc_th_sku is not null) yields 5,132 rows

These 2 tables have been loaded from a pg_dump on all servers, vacuum
analyze run after load.

1st case: pg 8.3.1 using left join finishes the query in about 3.5 seconds

explain analyze
select vc.* from traderhank.vendor_catalog vc
left join
(
select vc1.th_sku from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null
group by vc1.th_sku
) vcj on vcj.th_sku = vc.th_sku
left join traderhank.xc_products xc on xc.vc_th_sku = vc.th_sku
where vcj.th_sku is null
and xc.vc_th_sku is null

"Merge Left Join  (cost=71001.53..72899.35 rows=36838 width=310) (actual
time=9190.446..10703.509 rows=78426 loops=1)"
"  Merge Cond: ((vc.th_sku)::text = (vc1.th_sku)::text)"
"  Filter: (vc1.th_sku IS NULL)"
"  ->  Merge Left Join  (cost=19362.72..20201.46 rows=73676 width=310)
(actual time=917.947..1784.593 rows=141962 loops=1)"
"        Merge Cond: ((vc.th_sku)::text = (xc.vc_th_sku)::text)"
"        Filter: (xc.vc_th_sku IS NULL)"
"        ->  Sort  (cost=17630.88..17999.26 rows=147352 width=310)
(actual time=871.130..1114.453 rows=147352 loops=1)"
"              Sort Key: vc.th_sku"
"              Sort Method:  quicksort  Memory: 45285kB"
"              ->  Seq Scan on vendor_catalog vc  (cost=0.00..4981.52
rows=147352 width=310) (actual time=0.020..254.023 rows=147352 loops=1)"
"        ->  Sort  (cost=1731.84..1753.37 rows=8610 width=8) (actual
time=46.783..62.347 rows=9689 loops=1)"
"              Sort Key: xc.vc_th_sku"
"              Sort Method:  quicksort  Memory: 734kB"
"              ->  Seq Scan on xc_products xc  (cost=0.00..1169.10
rows=8610 width=8) (actual time=0.013..25.490 rows=8610 loops=1)"
"  ->  Sort  (cost=51638.80..51814.57 rows=70309 width=32) (actual
time=8272.483..8382.258 rows=66097 loops=1)"
"        Sort Key: vc1.th_sku"
"        Sort Method:  quicksort  Memory: 4086kB"
"        ->  HashAggregate  (cost=44572.25..45275.34 rows=70309 width=8)
(actual time=7978.928..8080.317 rows=54390 loops=1)"
"              ->  Merge Join  (cost=27417.09..42493.30 rows=831580
width=8) (actual time=1317.874..6380.928 rows=810012 loops=1)"
"                    Merge Cond: ((vc1.short_desc_75)::text =
(vc2.short_desc_75)::text)"
"                    Join Filter: ((vc2.th_sku)::text <>
(vc1.th_sku)::text)"
"                    ->  Sort  (cost=13708.55..13970.22 rows=104669
width=27) (actual time=661.319..834.131 rows=104624 loops=1)"
"                          Sort Key: vc1.short_desc_75"
"                          Sort Method:  quicksort  Memory: 11235kB"
"                          ->  Seq Scan on vendor_catalog vc1
(cost=0.00..4981.52 rows=104669 width=27) (actual time=0.010..268.552
rows=104624 loops=1)"
"                                Filter: (cutoff_date IS NULL)"
"                    ->  Sort  (cost=13708.55..13970.22 rows=104669
width=27) (actual time=656.447..2130.290 rows=914636 loops=1)"
"                          Sort Key: vc2.short_desc_75"
"                          Sort Method:  quicksort  Memory: 11235kB"
"                          ->  Seq Scan on vendor_catalog vc2
(cost=0.00..4981.52 rows=104669 width=27) (actual time=0.015..266.926
rows=104624 loops=1)"
"                                Filter: (cutoff_date IS NULL)"
"Total runtime: 10837.005 ms"


This query returns same set of rows, in about 2.8 seconds:

explain analyze
select vc.* from traderhank.vendor_catalog vc
where vc.th_sku not in
(
select vc1.th_sku from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null and vc1.th_sku
is not null
group by vc1.th_sku
)
and vc.th_sku not in
(select vc_th_Sku from traderhank.xc_products where vc_th_sku is not null)


"Seq Scan on vendor_catalog vc  (cost=46633.03..52351.31 rows=36838
width=310) (actual time=8216.197..8506.825 rows=78426 loops=1)"
"  Filter: ((NOT (hashed subplan)) AND (NOT (hashed subplan)))"
"  SubPlan"
"    ->  Seq Scan on xc_products  (cost=0.00..1169.10 rows=5129 width=8)
(actual time=0.026..16.907 rows=5132 loops=1)"
"          Filter: (vc_th_sku IS NOT NULL)"
"    ->  HashAggregate  (cost=44572.25..45275.34 rows=70309 width=8)
(actual time=7973.792..8076.297 rows=54390 loops=1)"
"          ->  Merge Join  (cost=27417.09..42493.30 rows=831580 width=8)
(actual time=1325.988..6377.197 rows=810012 loops=1)"
"                Merge Cond: ((vc1.short_desc_75)::text =
(vc2.short_desc_75)::text)"
"                Join Filter: ((vc2.th_sku)::text <> (vc1.th_sku)::text)"
"                ->  Sort  (cost=13708.55..13970.22 rows=104669
width=27) (actual time=669.237..841.978 rows=104624 loops=1)"
"                      Sort Key: vc1.short_desc_75"
"                      Sort Method:  quicksort  Memory: 11235kB"
"                      ->  Seq Scan on vendor_catalog vc1
(cost=0.00..4981.52 rows=104669 width=27) (actual time=0.014..272.037
rows=104624 loops=1)"
"                            Filter: ((cutoff_date IS NULL) AND (th_sku
IS NOT NULL))"
"                ->  Sort  (cost=13708.55..13970.22 rows=104669
width=27) (actual time=656.638..2130.440 rows=914636 loops=1)"
"                      Sort Key: vc2.short_desc_75"
"                      Sort Method:  quicksort  Memory: 11235kB"
"                      ->  Seq Scan on vendor_catalog vc2
(cost=0.00..4981.52 rows=104669 width=27) (actual time=0.016..266.767
rows=104624 loops=1)"
"                            Filter: (cutoff_date IS NULL)"
"Total runtime: 8631.652 ms"


So far, so good.

Same 2 queries on 8.4.2:

Left join version will return same rows in about 42 seconds

explain analyze
select vc.* from traderhank.vendor_catalog vc
left join
(
select vc1.th_sku from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null
group by vc1.th_sku
) vcj on vcj.th_sku = vc.th_sku
left join traderhank.xc_products xc on xc.vc_th_sku = vc.th_sku
where vcj.th_sku is null
and xc.vc_th_sku is null

"Hash Anti Join  (cost=142357.84..167341.98 rows=140877 width=309)
(actual time=42455.615..44244.251 rows=78426 loops=1)"
"  Hash Cond: ((vc.th_sku)::text = (vc1.th_sku)::text)"
"  ->  Hash Anti Join  (cost=1829.48..12853.64 rows=141143 width=309)
(actual time=62.380..1049.863 rows=141962 loops=1)"
"        Hash Cond: ((vc.th_sku)::text = (xc.vc_th_sku)::text)"
"        ->  Seq Scan on vendor_catalog vc  (cost=0.00..8534.52
rows=147352 width=309) (actual time=0.009..351.005 rows=147352 loops=1)"
"        ->  Hash  (cost=1716.99..1716.99 rows=8999 width=8) (actual
time=62.348..62.348 rows=5132 loops=1)"
"              ->  Seq Scan on xc_products xc  (cost=0.00..1716.99
rows=8999 width=8) (actual time=0.009..45.818 rows=8610 loops=1)"
"  ->  Hash  (cost=139067.10..139067.10 rows=75541 width=32) (actual
time=42393.149..42393.149 rows=54390 loops=1)"
"        ->  Group  (cost=134997.43..138311.69 rows=75541 width=8)
(actual time=35987.418..42264.948 rows=54390 loops=1)"
"              ->  Sort  (cost=134997.43..136654.56 rows=662853 width=8)
(actual time=35987.407..40682.275 rows=810012 loops=1)"
"                    Sort Key: vc1.th_sku"
"                    Sort Method:  external merge  Disk: 14256kB"
"                    ->  Merge Join  (cost=39600.73..52775.08
rows=662853 width=8) (actual time=5762.785..13763.041 rows=810012 loops=1)"
"                          Merge Cond: ((vc1.short_desc_75)::text =
(vc2.short_desc_75)::text)"
"                          Join Filter: ((vc2.th_sku)::text <>
(vc1.th_sku)::text)"
"                          ->  Sort  (cost=19800.37..20062.75
rows=104954 width=27) (actual time=2884.012..3604.405 rows=104624 loops=1)"
"                                Sort Key: vc1.short_desc_75"
"                                Sort Method:  external merge  Disk: 3776kB"
"                                ->  Seq Scan on vendor_catalog vc1
(cost=0.00..8534.52 rows=104954 width=27) (actual time=0.009..395.976
rows=104624 loops=1)"
"                                      Filter: (cutoff_date IS NULL)"
"                          ->  Materialize  (cost=19800.37..21112.29
rows=104954 width=27) (actual time=2878.550..5291.205 rows=914636 loops=1)"
"                                ->  Sort  (cost=19800.37..20062.75
rows=104954 width=27) (actual time=2878.538..3607.201 rows=104624 loops=1)"
"                                      Sort Key: vc2.short_desc_75"
"                                      Sort Method:  external merge
Disk: 3776kB"
"                                      ->  Seq Scan on vendor_catalog
vc2  (cost=0.00..8534.52 rows=104954 width=27) (actual
time=0.018..392.270 rows=104624 loops=1)"
"                                            Filter: (cutoff_date IS NULL)"
"Total runtime: 45145.977 ms"



on any version from 8.3.8 on, this query has never returned, and explain
analyze never returns, so I am only posting explain output

explain --analyze
select vc.* from traderhank.vendor_catalog vc
where vc.th_sku not in
(
select vc1.th_sku from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null and vc1.th_sku
is not null
group by vc1.th_sku
)
and vc.th_sku not in
(select vc_th_Sku from traderhank.xc_products where vc_th_sku is not null)


"Seq Scan on vendor_catalog vc  (cost=140413.05..91527264.28 rows=36838
width=309)"
"  Filter: ((NOT (hashed SubPlan 2)) AND (NOT (SubPlan 1)))"
"  SubPlan 2"
"    ->  Seq Scan on xc_products  (cost=0.00..1716.99 rows=5132 width=8)"
"          Filter: (vc_th_sku IS NOT NULL)"
"  SubPlan 1"
"    ->  Materialize  (cost=138683.23..139734.64 rows=75541 width=8)"
"          ->  Group  (cost=134997.43..138311.69 rows=75541 width=8)"
"                ->  Sort  (cost=134997.43..136654.56 rows=662853 width=8)"
"                      Sort Key: vc1.th_sku"
"                      ->  Merge Join  (cost=39600.73..52775.08
rows=662853 width=8)"
"                            Merge Cond: ((vc1.short_desc_75)::text =
(vc2.short_desc_75)::text)"
"                            Join Filter: ((vc2.th_sku)::text <>
(vc1.th_sku)::text)"
"                            ->  Sort  (cost=19800.37..20062.75
rows=104954 width=27)"
"                                  Sort Key: vc1.short_desc_75"
"                                  ->  Seq Scan on vendor_catalog vc1
(cost=0.00..8534.52 rows=104954 width=27)"
"                                        Filter: ((cutoff_date IS NULL)
AND (th_sku IS NOT NULL))"
"                            ->  Materialize  (cost=19800.37..21112.29
rows=104954 width=27)"
"                                  ->  Sort  (cost=19800.37..20062.75
rows=104954 width=27)"
"                                        Sort Key: vc2.short_desc_75"
"                                        ->  Seq Scan on vendor_catalog
vc2  (cost=0.00..8534.52 rows=104954 width=27)"
"                                              Filter: (cutoff_date IS
NULL)"




I've also tried changing the code to not exists, but that query never
comes back on any version I have available:

explain --analyze
select vc.* from traderhank.vendor_catalog vc
where not exists
(
select 1 from traderhank.vendor_catalog vc1
join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
and vc2.th_sku != vc1.th_sku
where vc1.cutoff_date is null and vc2.cutoff_date is null and vc1.th_sku
= vc.th_sku
group by vc1.th_sku
)
and not exists
(select 1 from traderhank.xc_products where vc_th_sku is not null and
vc_th_sku = vc.th_sku)

"Nested Loop Anti Join  (cost=63650.74..93617.53 rows=1 width=309)"
"  Join Filter: ((xc_products.vc_th_sku)::text = (vc.th_sku)::text)"
"  ->  Hash Anti Join  (cost=63650.74..91836.39 rows=1 width=309)"
"        Hash Cond: ((vc.th_sku)::text = (vc1.th_sku)::text)"
"        ->  Seq Scan on vendor_catalog vc  (cost=0.00..8534.52
rows=147352 width=309)"
"        ->  Hash  (cost=52775.08..52775.08 rows=662853 width=8)"
"              ->  Merge Join  (cost=39600.73..52775.08 rows=662853
width=8)"
"                    Merge Cond: ((vc1.short_desc_75)::text =
(vc2.short_desc_75)::text)"
"                    Join Filter: ((vc2.th_sku)::text <>
(vc1.th_sku)::text)"
"                    ->  Sort  (cost=19800.37..20062.75 rows=104954
width=27)"
"                          Sort Key: vc1.short_desc_75"
"                          ->  Seq Scan on vendor_catalog vc1
(cost=0.00..8534.52 rows=104954 width=27)"
"                                Filter: (cutoff_date IS NULL)"
"                    ->  Materialize  (cost=19800.37..21112.29
rows=104954 width=27)"
"                          ->  Sort  (cost=19800.37..20062.75
rows=104954 width=27)"
"                                Sort Key: vc2.short_desc_75"
"                                ->  Seq Scan on vendor_catalog vc2
(cost=0.00..8534.52 rows=104954 width=27)"
"                                      Filter: (cutoff_date IS NULL)"
"  ->  Seq Scan on xc_products  (cost=0.00..1716.99 rows=5132 width=8)"
"        Filter: (xc_products.vc_th_sku IS NOT NULL)"



So, my question is, do I need  to re-write all of my in() and not in ()
queries to left joins, is this something that might get resolved in
another release in the future?

Thanks for any help.

Roger Ging



Re: performance change from 8.3.1 to later releases

От
Scott Marlowe
Дата:
On Tue, Apr 20, 2010 at 12:38 PM, Roger Ging <rging@musicreports.com> wrote:
> Hi,
>
> I have access to servers running 8.3.1, 8.3.8, 8.4.2 and 8.4.3.  I have
> noticed that on the 8.4.* versions, a lot of our code is either taking much
> longer to complete, or never completing.  I think I have isolated the
> problem to queries using in(), not in() or not exists().  I've put together
> a test case with one particular query that demonstrates the problem.
>
> select count(*) from traderhank.vendor_catalog = 147,352
>
> select count(*) from traderhank.xc_products = 8,610
>
> The sub query (select vc1.th_sku from traderhank.vendor_catalog vc1
> join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
> and vc2.th_sku != vc1.th_sku
> where vc1.cutoff_date is null and vc2.cutoff_date is null
> group by vc1.th_sku
> )  yields 54,390 rows
>
> The sub query (select vc_th_Sku from traderhank.xc_products where vc_th_sku
> is not null) yields 5,132 rows
>
> These 2 tables have been loaded from a pg_dump on all servers, vacuum
> analyze run after load.
>
> 1st case: pg 8.3.1 using left join finishes the query in about 3.5 seconds
>
> explain analyze
> select vc.* from traderhank.vendor_catalog vc
> left join
> (
> select vc1.th_sku from traderhank.vendor_catalog vc1
> join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
> and vc2.th_sku != vc1.th_sku
> where vc1.cutoff_date is null and vc2.cutoff_date is null
> group by vc1.th_sku
> ) vcj on vcj.th_sku = vc.th_sku
> left join traderhank.xc_products xc on xc.vc_th_sku = vc.th_sku
> where vcj.th_sku is null
> and xc.vc_th_sku is null
>
> "Merge Left Join  (cost=71001.53..72899.35 rows=36838 width=310) (actual
> time=9190.446..10703.509 rows=78426 loops=1)"
> "  Merge Cond: ((vc.th_sku)::text = (vc1.th_sku)::text)"
> "  Filter: (vc1.th_sku IS NULL)"
> "  ->  Merge Left Join  (cost=19362.72..20201.46 rows=73676 width=310)
> (actual time=917.947..1784.593 rows=141962 loops=1)"
> "        Merge Cond: ((vc.th_sku)::text = (xc.vc_th_sku)::text)"
> "        Filter: (xc.vc_th_sku IS NULL)"
> "        ->  Sort  (cost=17630.88..17999.26 rows=147352 width=310) (actual
> time=871.130..1114.453 rows=147352 loops=1)"
> "              Sort Key: vc.th_sku"
> "              Sort Method:  quicksort  Memory: 45285kB"
> "              ->  Seq Scan on vendor_catalog vc  (cost=0.00..4981.52
> rows=147352 width=310) (actual time=0.020..254.023 rows=147352 loops=1)"
> "        ->  Sort  (cost=1731.84..1753.37 rows=8610 width=8) (actual
> time=46.783..62.347 rows=9689 loops=1)"
> "              Sort Key: xc.vc_th_sku"
> "              Sort Method:  quicksort  Memory: 734kB"
> "              ->  Seq Scan on xc_products xc  (cost=0.00..1169.10 rows=8610
> width=8) (actual time=0.013..25.490 rows=8610 loops=1)"
> "  ->  Sort  (cost=51638.80..51814.57 rows=70309 width=32) (actual
> time=8272.483..8382.258 rows=66097 loops=1)"
> "        Sort Key: vc1.th_sku"
> "        Sort Method:  quicksort  Memory: 4086kB"

So here we get a hash agg in ~4M memory:

> "        ->  HashAggregate  (cost=44572.25..45275.34 rows=70309 width=8)
> (actual time=7978.928..8080.317 rows=54390 loops=1)"

And the row estimate is similar.

(much deleted)

> on any version from 8.3.8 on, this query has never returned, and explain
> analyze never returns, so I am only posting explain output

We get a Seq Scan with a huge cost, and no hash agg or quick sort.  Is
the work_mem the same or similar?  I'd crank it up for testing just to
see if it helps.  16Meg is pretty safe on a low traffic machine.

> "Seq Scan on vendor_catalog vc  (cost=140413.05..91527264.28 rows=36838
> width=309)"
> "  Filter: ((NOT (hashed SubPlan 2)) AND (NOT (SubPlan 1)))"
> "  SubPlan 2"
> "    ->  Seq Scan on xc_products  (cost=0.00..1716.99 rows=5132 width=8)"
> "          Filter: (vc_th_sku IS NOT NULL)"
> "  SubPlan 1"
> "    ->  Materialize  (cost=138683.23..139734.64 rows=75541 width=8)"
> "          ->  Group  (cost=134997.43..138311.69 rows=75541 width=8)"
> "                ->  Sort  (cost=134997.43..136654.56 rows=662853 width=8)"
> "                      Sort Key: vc1.th_sku"
> "                      ->  Merge Join  (cost=39600.73..52775.08 rows=662853
> width=8)"
> "                            Merge Cond: ((vc1.short_desc_75)::text =
> (vc2.short_desc_75)::text)"
> "                            Join Filter: ((vc2.th_sku)::text <>
> (vc1.th_sku)::text)"
> "                            ->  Sort  (cost=19800.37..20062.75 rows=104954
> width=27)"
> "                                  Sort Key: vc1.short_desc_75"
> "                                  ->  Seq Scan on vendor_catalog vc1
>  (cost=0.00..8534.52 rows=104954 width=27)"
> "                                        Filter: ((cutoff_date IS NULL) AND
> (th_sku IS NOT NULL))"
> "                            ->  Materialize  (cost=19800.37..21112.29
> rows=104954 width=27)"
> "                                  ->  Sort  (cost=19800.37..20062.75
> rows=104954 width=27)"
> "                                        Sort Key: vc2.short_desc_75"
> "                                        ->  Seq Scan on vendor_catalog vc2
>  (cost=0.00..8534.52 rows=104954 width=27)"
> "                                              Filter: (cutoff_date IS
> NULL)"
>
>
>
>
> I've also tried changing the code to not exists, but that query never comes
> back on any version I have available:
>
> explain --analyze
> select vc.* from traderhank.vendor_catalog vc
> where not exists
> (
> select 1 from traderhank.vendor_catalog vc1
> join traderhank.vendor_catalog vc2 on vc2.short_desc_75 = vc1.short_desc_75
> and vc2.th_sku != vc1.th_sku
> where vc1.cutoff_date is null and vc2.cutoff_date is null and vc1.th_sku =
> vc.th_sku
> group by vc1.th_sku
> )
> and not exists
> (select 1 from traderhank.xc_products where vc_th_sku is not null and
> vc_th_sku = vc.th_sku)
>
> "Nested Loop Anti Join  (cost=63650.74..93617.53 rows=1 width=309)"
> "  Join Filter: ((xc_products.vc_th_sku)::text = (vc.th_sku)::text)"
> "  ->  Hash Anti Join  (cost=63650.74..91836.39 rows=1 width=309)"
> "        Hash Cond: ((vc.th_sku)::text = (vc1.th_sku)::text)"
> "        ->  Seq Scan on vendor_catalog vc  (cost=0.00..8534.52 rows=147352
> width=309)"
> "        ->  Hash  (cost=52775.08..52775.08 rows=662853 width=8)"
> "              ->  Merge Join  (cost=39600.73..52775.08 rows=662853
> width=8)"
> "                    Merge Cond: ((vc1.short_desc_75)::text =
> (vc2.short_desc_75)::text)"
> "                    Join Filter: ((vc2.th_sku)::text <>
> (vc1.th_sku)::text)"
> "                    ->  Sort  (cost=19800.37..20062.75 rows=104954
> width=27)"
> "                          Sort Key: vc1.short_desc_75"
> "                          ->  Seq Scan on vendor_catalog vc1
>  (cost=0.00..8534.52 rows=104954 width=27)"
> "                                Filter: (cutoff_date IS NULL)"
> "                    ->  Materialize  (cost=19800.37..21112.29 rows=104954
> width=27)"
> "                          ->  Sort  (cost=19800.37..20062.75 rows=104954
> width=27)"
> "                                Sort Key: vc2.short_desc_75"
> "                                ->  Seq Scan on vendor_catalog vc2
>  (cost=0.00..8534.52 rows=104954 width=27)"
> "                                      Filter: (cutoff_date IS NULL)"
> "  ->  Seq Scan on xc_products  (cost=0.00..1716.99 rows=5132 width=8)"
> "        Filter: (xc_products.vc_th_sku IS NOT NULL)"
>
>
>
> So, my question is, do I need  to re-write all of my in() and not in ()
> queries to left joins, is this something that might get resolved in another
> release in the future?
>
> Thanks for any help.
>
> Roger Ging
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>



--
When fascism comes to America, it will be intolerance sold as diversity.

Re: performance change from 8.3.1 to later releases

От
Tom Lane
Дата:
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Tue, Apr 20, 2010 at 12:38 PM, Roger Ging <rging@musicreports.com> wrote:
>> I have access to servers running 8.3.1, 8.3.8, 8.4.2 and 8.4.3. �I have
>> noticed that on the 8.4.* versions, a lot of our code is either taking much
>> longer to complete, or never completing. �I think I have isolated the
>> problem to queries using in(), not in() or not exists(). �I've put together
>> a test case with one particular query that demonstrates the problem.

> We get a Seq Scan with a huge cost, and no hash agg or quick sort.  Is
> the work_mem the same or similar?

It looks to me like it's not.  The 8.4 plan is showing sorts spilling to
disk for amounts of data that the 8.3 plan is perfectly willing to hold
in memory.  I'm also wondering if the 8.4 server is on comparable
hardware, because it seems to be only about half as fast for the plain
seqscan steps, which surely ought to be no worse than before.

            regards, tom lane