Обсуждение: Slow execution time when querying view with WHERE clause

Поиск
Список
Период
Сортировка

Slow execution time when querying view with WHERE clause

От
Mike Mascari
Дата:
I have the following view:

create or replace view market.p_areas as
select a.*
from _areas a
where a.area in (
  select b.area
  from _bins b, _inventories i, _offers o, _pricemembers p
  where b.bin = i.bin and
  i.inventory = o.inventory and
  o.pricegroup = p.pricegroup and
  p.buyer in (
   select s.store
   from _stores s, _webusers w
   where w.webuser = getWebuser() and
   w.company = s.company
    union
   select s.store
   from _stores s, _companies c
   where s.company = c.company and
   c.companyid = 'DEFAULT'
  )
);

When I query the view without a where clause I get:

  explain analyze select * from p_areas;
                                                                                                  QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=1273.12..1276.31 rows=47 width=163) (actual time=438.739..439.574 rows=34 loops=1)
    Hash Cond: ("outer".area = "inner".area)
    ->  Seq Scan on _areas a  (cost=0.00..2.48 rows=48 width=163) (actual time=0.015..0.169 rows=48 loops=1)
    ->  Hash  (cost=1273.01..1273.01 rows=47 width=8) (actual time=438.532..438.532 rows=0 loops=1)
          ->  HashAggregate  (cost=1273.01..1273.01 rows=47 width=8) (actual time=438.286..438.395 rows=34 loops=1)
                ->  Hash Join  (cost=558.53..1266.68 rows=2532 width=8) (actual time=160.923..416.968 rows=5264
loops=1)
                      Hash Cond: ("outer".bin = "inner".bin)
                      ->  Hash Join  (cost=544.02..1207.86 rows=2531 width=8) (actual time=156.097..356.560 rows=5264
loops=1)
                            Hash Cond: ("outer".inventory = "inner".inventory)
                            ->  Seq Scan on _inventories i  (cost=0.00..265.96 rows=11396 width=16) (actual
time=0.010..44.047rows=11433 loops=1) 
                            ->  Hash  (cost=537.14..537.14 rows=2751 width=8) (actual time=155.891..155.891 rows=0
loops=1)
                                  ->  Hash Join  (cost=13.96..537.14 rows=2751 width=8) (actual time=11.967..136.598
rows=5264loops=1) 
                                        Hash Cond: ("outer".pricegroup = "inner".pricegroup)
                                        ->  Seq Scan on _offers o  (cost=0.00..379.24 rows=15524 width=16) (actual
time=0.008..50.335rows=15599 loops=1) 
                                        ->  Hash  (cost=13.94..13.94 rows=9 width=8) (actual time=11.861..11.861 rows=0
loops=1)
                                              ->  Hash IN Join  (cost=8.74..13.94 rows=9 width=8) (actual
time=10.801..11.801rows=12 loops=1) 
                                                    Hash Cond: ("outer".buyer = "inner".store)
                                                    ->  Seq Scan on _pricemembers p  (cost=0.00..4.07 rows=207
width=16)(actual time=0.011..0.548 rows=207 loops=1) 
                                                    ->  Hash  (cost=8.72..8.72 rows=8 width=8) (actual
time=10.687..10.687rows=0 loops=1) 
                                                          ->  Subquery Scan "IN_subquery"  (cost=8.60..8.72 rows=8
width=8)(actual time=10.645..10.654 rows=1 loops=1) 
                                                                ->  Unique  (cost=8.60..8.64 rows=8 width=8) (actual
time=10.631..10.636rows=1 loops=1) 
                                                                      ->  Sort  (cost=8.60..8.62 rows=8 width=8)
(actualtime=10.625..10.627 rows=1 loops=1) 
                                                                            Sort Key: store
                                                                            ->  Append  (cost=2.86..8.48 rows=8
width=8)(actual time=10.529..10.583 rows=1 loops=1) 
                                                                                  ->  Subquery Scan "*SELECT* 1"
(cost=2.86..5.15rows=5 width=8) (actual time=10.222..10.222 rows=0 loops=1) 
                                                                                        ->  Hash Join  (cost=2.86..5.10
rows=5width=8) (actual time=10.214..10.214 rows=0 loops=1) 
                                                                                              Hash Cond:
("outer".company= "inner".company) 
                                                                                              ->  Seq Scan on _stores s
(cost=0.00..2.13 rows=13 width=16) (actual time=0.019..0.059 rows=13 loops=1) 
                                                                                              ->  Hash
(cost=2.85..2.85rows=1 width=8) (actual time=10.031..10.031 rows=0 loops=1) 
                                                                                                    ->  Seq Scan on
_webusersw  (cost=0.00..2.85 rows=1 width=8) (actual time=10.023..10.023 rows=0 loops=1) 
                                                                                                          Filter:
(webuser= getwebuser()) 
                                                                                  ->  Subquery Scan "*SELECT* 2"
(cost=1.08..3.33rows=3 width=8) (actual time=0.298..0.349 rows=1 loops=1) 
                                                                                        ->  Hash Join  (cost=1.08..3.30
rows=3width=8) (actual time=0.287..0.335 rows=1 loops=1) 
                                                                                              Hash Cond:
("outer".company= "inner".company) 
                                                                                              ->  Seq Scan on _stores s
(cost=0.00..2.13 rows=13 width=16) (actual time=0.008..0.048 rows=13 loops=1) 
                                                                                              ->  Hash
(cost=1.07..1.07rows=1 width=8) (actual time=0.111..0.111 rows=0 loops=1) 
                                                                                                    ->  Seq Scan on
_companiesc  (cost=0.00..1.07 rows=1 width=8) (actual time=0.059..0.080 rows=1 loops=1) 
                                                                                                          Filter:
((companyid)::text= 'DEFAULT'::text) 
                      ->  Hash  (cost=13.01..13.01 rows=601 width=16) (actual time=4.735..4.735 rows=0 loops=1)
                            ->  Seq Scan on _bins b  (cost=0.00..13.01 rows=601 width=16) (actual time=0.054..2.846
rows=601loops=1) 
  Total runtime: 441.685 ms
(41 rows)

---

When I query the view with a simple filter, I get:

explain analyze select * from p_areas where deactive is null;
                                                                                   QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop IN Join  (cost=8.60..524.28 rows=1 width=163) (actual time=1023.291..20025.620 rows=34 loops=1)
    Join Filter: ("outer".area = "inner".area)
    ->  Seq Scan on _areas a  (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1)
          Filter: (deactive IS NULL)
    ->  Nested Loop  (cost=8.60..25530.60 rows=2532 width=8) (actual time=0.345..402.775 rows=3408 loops=48)
          ->  Nested Loop  (cost=8.60..16893.61 rows=2531 width=8) (actual time=0.304..264.929 rows=3408 loops=48)
                ->  Merge Join  (cost=8.60..2912.00 rows=2751 width=8) (actual time=0.258..120.841 rows=3408 loops=48)
                      Merge Cond: ("outer".pricegroup = "inner".pricegroup)
                      ->  Nested Loop IN Join  (cost=8.60..1837.73 rows=9 width=8) (actual time=0.216..4.612 rows=8
loops=48)
                            Join Filter: ("outer".buyer = "inner".store)
                            ->  Index Scan using i_pricemembers3 on _pricemembers p  (cost=0.00..10.96 rows=207
width=16)(actual time=0.022..1.045 rows=138 loops=48) 
                            ->  Subquery Scan "IN_subquery"  (cost=8.60..8.72 rows=8 width=8) (actual time=0.011..0.017
rows=1loops=6606) 
                                  ->  Unique  (cost=8.60..8.64 rows=8 width=8) (actual time=0.006..0.010 rows=1
loops=6606)
                                        ->  Sort  (cost=8.60..8.62 rows=8 width=8) (actual time=0.003..0.004 rows=1
loops=6606)
                                              Sort Key: store
                                              ->  Append  (cost=2.86..8.48 rows=8 width=8) (actual time=7.667..7.757
rows=1loops=1) 
                                                    ->  Subquery Scan "*SELECT* 1"  (cost=2.86..5.15 rows=5 width=8)
(actualtime=7.362..7.362 rows=0 loops=1) 
                                                          ->  Hash Join  (cost=2.86..5.10 rows=5 width=8) (actual
time=7.355..7.355rows=0 loops=1) 
                                                                Hash Cond: ("outer".company = "inner".company)
                                                                ->  Seq Scan on _stores s  (cost=0.00..2.13 rows=13
width=16)(actual time=0.013..0.054 rows=13 loops=1) 
                                                                ->  Hash  (cost=2.85..2.85 rows=1 width=8) (actual
time=7.163..7.163rows=0 loops=1) 
                                                                      ->  Seq Scan on _webusers w  (cost=0.00..2.85
rows=1width=8) (actual time=7.154..7.154 rows=0 loops=1) 
                                                                            Filter: (webuser = getwebuser())
                                                    ->  Subquery Scan "*SELECT* 2"  (cost=1.08..3.33 rows=3 width=8)
(actualtime=0.295..0.381 rows=1 loops=1) 
                                                          ->  Hash Join  (cost=1.08..3.30 rows=3 width=8) (actual
time=0.286..0.368rows=1 loops=1) 
                                                                Hash Cond: ("outer".company = "inner".company)
                                                                ->  Seq Scan on _stores s  (cost=0.00..2.13 rows=13
width=16)(actual time=0.008..0.080 rows=13 loops=1) 
                                                                ->  Hash  (cost=1.07..1.07 rows=1 width=8) (actual
time=0.116..0.116rows=0 loops=1) 
                                                                      ->  Seq Scan on _companies c  (cost=0.00..1.07
rows=1width=8) (actual time=0.062..0.083 rows=1 loops=1) 
                                                                            Filter: ((companyid)::text =
'DEFAULT'::text)
                      ->  Index Scan using i_offers4 on _offers o  (cost=0.00..1007.93 rows=15524 width=16) (actual
time=0.023..67.183rows=10049 loops=48) 
                ->  Index Scan using i_inventories1 on _inventories i  (cost=0.00..5.07 rows=1 width=16) (actual
time=0.025..0.029rows=1 loops=163561) 
                      Index Cond: (i.inventory = "outer".inventory)
          ->  Index Scan using i_bins1 on _bins b  (cost=0.00..3.40 rows=1 width=16) (actual time=0.021..0.026 rows=1
loops=163561)
                Index Cond: (b.bin = "outer".bin)
  Total runtime: 20027.414 ms
(36 rows)

---

That's a slow-down on execution time by a factor of 50, even
though the row count was the same: 34. In fact, it's MUCH
faster to do:

create temporary table foo as
select * from p_areas;

select * from foo
where deactive is null;

The database has been analyzed.

Any tips would be greatly appreciated.

Mike Mascari

P.S.: I turned off word-wrap in my mail client for this post.
Is that the right thing to do for analyze output?



Re: Slow execution time when querying view with WHERE clause

От
Mike Mascari
Дата:
Mike Mascari wrote:
> I have the following view:
>
> create or replace view market.p_areas as
> select a.*
> from _areas a
> where a.area in (
>  select b.area
>  from _bins b, _inventories i, _offers o, _pricemembers p
>  where b.bin = i.bin and
>  i.inventory = o.inventory and
>  o.pricegroup = p.pricegroup and
>  p.buyer in (
>   select s.store
>   from _stores s, _webusers w
>   where w.webuser = getWebuser() and
>   w.company = s.company
>    union
>   select s.store
>   from _stores s, _companies c
>   where s.company = c.company and
>   c.companyid = 'DEFAULT'
>  )
> );

...

I failed to report the version:

select version();

PostgreSQL 7.4.5 on i686-pc-linux-gnu, compiled by GCC
i686-pc-linux-gnu-gcc (GCC) 3.4.0 20040204 (prerelease)

Sorry.

Mike Mascari

Re: Slow execution time when querying view with WHERE clause

От
Tom Lane
Дата:
Mike Mascari <mascarm@mascari.com> writes:
> When I query the view with a simple filter, I get:

> explain analyze select * from p_areas where deactive is null;

The problem seems to be here:

>     ->  Seq Scan on _areas a  (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1)
>           Filter: (deactive IS NULL)

Why is it so completely off about the selectivity of the IS NULL clause?
Are you sure you ANALYZEd this table recently?

            regards, tom lane

Re: Slow execution time when querying view with WHERE clause

От
Mike Mascari
Дата:
Tom Lane wrote:
> Mike Mascari <mascarm@mascari.com> writes:
>
>>When I query the view with a simple filter, I get:
>
>
>>explain analyze select * from p_areas where deactive is null;
>
>
> The problem seems to be here:
>
>
>>    ->  Seq Scan on _areas a  (cost=0.00..2.48 rows=1 width=163) (actual time=0.037..0.804 rows=48 loops=1)
>>          Filter: (deactive IS NULL)
>
>
> Why is it so completely off about the selectivity of the IS NULL clause?
> Are you sure you ANALYZEd this table recently?


Yes. I just did:

[estore@lexus] vacuum full analyze;
VACUUM
[estore@lexus] explain analyze select * from p_areas where deactive is null;
                                                                                   QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop IN Join  (cost=8.62..512.47 rows=1 width=162) (actual time=1143.969..21811.417 rows=37 loops=1)
    Join Filter: ("outer".area = "inner".area)
    ->  Seq Scan on _areas a  (cost=0.00..2.49 rows=1 width=162) (actual time=0.037..1.673 rows=49 loops=1)
          Filter: (deactive IS NULL)
    ->  Nested Loop  (cost=8.62..25740.20 rows=2681 width=8) (actual time=1.172..429.501 rows=3566 loops=49)
          ->  Nested Loop  (cost=8.62..16674.93 rows=2680 width=8) (actual time=1.125..281.570 rows=3566 loops=49)
                ->  Merge Join  (cost=8.62..3012.72 rows=2778 width=8) (actual time=0.876..128.908 rows=3566 loops=49)
                      Merge Cond: ("outer".pricegroup = "inner".pricegroup)
                      ->  Nested Loop IN Join  (cost=8.62..1929.41 rows=9 width=8) (actual time=0.613..5.504 rows=9
loops=49)
                            Join Filter: ("outer".buyer = "inner".store)
                            ->  Index Scan using i_pricemembers3 on _pricemembers p  (cost=0.00..11.13 rows=217
width=16)(actual time=0.403..1.476 rows=142 loops=49) 
                            ->  Subquery Scan "IN_subquery"  (cost=8.62..8.74 rows=8 width=8) (actual time=0.013..0.019
rows=1loops=6950) 
                                  ->  Unique  (cost=8.62..8.66 rows=8 width=8) (actual time=0.007..0.010 rows=1
loops=6950)
                                        ->  Sort  (cost=8.62..8.64 rows=8 width=8) (actual time=0.003..0.004 rows=1
loops=6950)
                                              Sort Key: store
                                              ->  Append  (cost=2.87..8.50 rows=8 width=8) (actual time=8.394..8.446
rows=1loops=1) 
                                                    ->  Subquery Scan "*SELECT* 1"  (cost=2.87..5.17 rows=5 width=8)
(actualtime=8.112..8.112 rows=0 loops=1) 
                                                          ->  Hash Join  (cost=2.87..5.12 rows=5 width=8) (actual
time=8.106..8.106rows=0 loops=1) 
                                                                Hash Cond: ("outer".company = "inner".company)
                                                                ->  Seq Scan on _stores s  (cost=0.00..2.13 rows=13
width=16)(actual time=0.014..0.052 rows=13 loops=1) 
                                                                ->  Hash  (cost=2.87..2.87 rows=1 width=8) (actual
time=7.878..7.878rows=0 loops=1) 
                                                                      ->  Seq Scan on _webusers w  (cost=0.00..2.87
rows=1width=8) (actual time=7.868..7.868 rows=0 loops=1) 
                                                                            Filter: (webuser = getwebuser())
                                                    ->  Subquery Scan "*SELECT* 2"  (cost=1.08..3.33 rows=3 width=8)
(actualtime=0.273..0.322 rows=1 loops=1) 
                                                          ->  Hash Join  (cost=1.08..3.30 rows=3 width=8) (actual
time=0.263..0.308rows=1 loops=1) 
                                                                Hash Cond: ("outer".company = "inner".company)
                                                                ->  Seq Scan on _stores s  (cost=0.00..2.13 rows=13
width=16)(actual time=0.008..0.042 rows=13 loops=1) 
                                                                ->  Hash  (cost=1.07..1.07 rows=1 width=8) (actual
time=0.093..0.093rows=0 loops=1) 
                                                                      ->  Seq Scan on _companies c  (cost=0.00..1.07
rows=1width=8) (actual time=0.061..0.081 rows=1 loops=1) 
                                                                            Filter: ((companyid)::text =
'DEFAULT'::text)
                      ->  Index Scan using i_offers4 on _offers o  (cost=0.00..1014.76 rows=16298 width=16) (actual
time=0.244..72.742rows=10433 loops=49) 
                ->  Index Scan using i_inventories1 on _inventories i  (cost=0.00..4.91 rows=1 width=16) (actual
time=0.025..0.029rows=1 loops=174715) 
                      Index Cond: (i.inventory = "outer".inventory)
          ->  Index Scan using i_bins1 on _bins b  (cost=0.00..3.37 rows=1 width=16) (actual time=0.022..0.027 rows=1
loops=174715)
                Index Cond: (b.bin = "outer".bin)
  Total runtime: 21813.209 ms

_areas looks like:

[estore@lexus] \d _areas
                         Table "temporal._areas"
        Column       |           Type           |       Modifiers
--------------------+--------------------------+------------------------
  area               | bigint                   | not null
  store              | bigint                   | not null
  name               | character varying(32)    | not null
  description        | character varying(64)    | not null
  email              | character varying(48)    | not null
  phoneno            | character varying(16)    | not null
  requisition_device | bigint                   | not null
  inventory_device   | bigint                   | not null
  receive_device     | bigint                   | not null
  invoice_device     | bigint                   | not null
  activation_device  | bigint                   | not null
  active             | timestamp with time zone | not null default now()
  deactive           | timestamp with time zone |
Indexes:
     "i_areas1" unique, btree (area)
     "i_areas2" unique, btree (store, name) WHERE (deactive IS NULL)
     "i_areas3" btree (store, name)
Triggers:
     t_areas1 BEFORE INSERT OR DELETE OR UPDATE ON _areas FOR EACH ROW EXECUTE PROCEDURE ri_areas()


Note that if I disable nestedloop plans, I get:

[estore@lexus] explain analyze select * from p_areas where deactive is null;
                                                                                               QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Hash Join  (cost=1456.90..1457.15 rows=1 width=162) (actual time=423.273..424.156 rows=37 loops=1)
    Hash Cond: ("outer".area = "inner".area)
    ->  HashAggregate  (cost=1454.40..1454.40 rows=48 width=8) (actual time=422.192..422.334 rows=37 loops=1)
          ->  Hash Join  (cost=582.79..1447.70 rows=2681 width=8) (actual time=188.687..406.584 rows=5694 loops=1)
                Hash Cond: ("outer".bin = "inner".bin)
                ->  Hash Join  (cost=568.07..1386.07 rows=2680 width=8) (actual time=182.756..358.441 rows=5694
loops=1)
                      Hash Cond: ("outer".inventory = "inner".inventory)
                      ->  Seq Scan on _inventories i  (cost=0.00..280.04 rows=12004 width=16) (actual
time=0.013..38.221rows=12004 loops=1) 
                      ->  Hash  (cost=561.12..561.12 rows=2778 width=8) (actual time=182.543..182.543 rows=0 loops=1)
                            ->  Hash Join  (cost=14.13..561.12 rows=2778 width=8) (actual time=9.854..160.963 rows=5694
loops=1)
                                  Hash Cond: ("outer".pricegroup = "inner".pricegroup)
                                  ->  Seq Scan on _offers o  (cost=0.00..396.98 rows=16298 width=16) (actual
time=0.011..58.422rows=16298 loops=1) 
                                  ->  Hash  (cost=14.10..14.10 rows=9 width=8) (actual time=9.728..9.728 rows=0
loops=1)
                                        ->  Hash IN Join  (cost=8.76..14.10 rows=9 width=8) (actual time=8.616..9.657
rows=13loops=1) 
                                              Hash Cond: ("outer".buyer = "inner".store)
                                              ->  Seq Scan on _pricemembers p  (cost=0.00..4.17 rows=217 width=16)
(actualtime=0.011..0.565 rows=217 loops=1) 
                                              ->  Hash  (cost=8.74..8.74 rows=8 width=8) (actual time=8.465..8.465
rows=0loops=1) 
                                                    ->  Subquery Scan "IN_subquery"  (cost=8.62..8.74 rows=8 width=8)
(actualtime=8.446..8.455 rows=1 loops=1) 
                                                          ->  Unique  (cost=8.62..8.66 rows=8 width=8) (actual
time=8.430..8.435rows=1 loops=1) 
                                                                ->  Sort  (cost=8.62..8.64 rows=8 width=8) (actual
time=8.424..8.426rows=1 loops=1) 
                                                                      Sort Key: store
                                                                      ->  Append  (cost=2.87..8.50 rows=8 width=8)
(actualtime=8.004..8.058 rows=1 loops=1) 
                                                                            ->  Subquery Scan "*SELECT* 1"
(cost=2.87..5.17rows=5 width=8) (actual time=7.710..7.710 rows=0 loops=1) 
                                                                                  ->  Hash Join  (cost=2.87..5.12
rows=5width=8) (actual time=7.701..7.701 rows=0 loops=1) 
                                                                                        Hash Cond: ("outer".company =
"inner".company)
                                                                                        ->  Seq Scan on _stores s
(cost=0.00..2.13rows=13 width=16) (actual time=0.013..0.052 rows=13 loops=1) 
                                                                                        ->  Hash  (cost=2.87..2.87
rows=1width=8) (actual time=7.486..7.486 rows=0 loops=1) 
                                                                                              ->  Seq Scan on _webusers
w (cost=0.00..2.87 rows=1 width=8) (actual time=7.478..7.478 rows=0 loops=1) 
                                                                                                    Filter: (webuser =
getwebuser())
                                                                            ->  Subquery Scan "*SELECT* 2"
(cost=1.08..3.33rows=3 width=8) (actual time=0.284..0.336 rows=1 loops=1) 
                                                                                  ->  Hash Join  (cost=1.08..3.30
rows=3width=8) (actual time=0.274..0.321 rows=1 loops=1) 
                                                                                        Hash Cond: ("outer".company =
"inner".company)
                                                                                        ->  Seq Scan on _stores s
(cost=0.00..2.13rows=13 width=16) (actual time=0.008..0.046 rows=13 loops=1) 
                                                                                        ->  Hash  (cost=1.07..1.07
rows=1width=8) (actual time=0.096..0.096 rows=0 loops=1) 
                                                                                              ->  Seq Scan on
_companiesc  (cost=0.00..1.07 rows=1 width=8) (actual time=0.064..0.083 rows=1 loops=1) 
                                                                                                    Filter:
((companyid)::text= 'DEFAULT'::text) 
                ->  Hash  (cost=13.18..13.18 rows=618 width=16) (actual time=5.849..5.849 rows=0 loops=1)
                      ->  Seq Scan on _bins b  (cost=0.00..13.18 rows=618 width=16) (actual time=0.027..3.554 rows=618
loops=1)
    ->  Hash  (cost=2.49..2.49 rows=1 width=162) (actual time=0.960..0.960 rows=0 loops=1)
          ->  Seq Scan on _areas a  (cost=0.00..2.49 rows=1 width=162) (actual time=0.033..0.197 rows=49 loops=1)
                Filter: (deactive IS NULL)
  Total runtime: 427.390 ms


Thanks!

Mike Mascari




Re: Slow execution time when querying view with WHERE clause

От
Jaime Casanova
Дата:
 --- Mike Mascari <mascarm@mascari.com> escribió:
> Tom Lane wrote:
> > Mike Mascari <mascarm@mascari.com> writes:
> >
> >>When I query the view with a simple filter, I get:
> >
> >
> >>explain analyze select * from p_areas where
> deactive is null;
> >
> >
> > The problem seems to be here:
> >
> >
> >>    ->  Seq Scan on _areas a  (cost=0.00..2.48
> rows=1 width=163) (actual time=0.037..0.804 rows=48
> loops=1)
> >>          Filter: (deactive IS NULL)
> >
> >
> > Why is it so completely off about the selectivity
> of the IS NULL clause?

null values are not indexable, is that your question?
If it is your question then create a partial index
with where deactive is null.

regards,
Jaime Casanova

_________________________________________________________
Do You Yahoo!?
Información de Estados Unidos y América Latina, en Yahoo! Noticias.
Visítanos en http://noticias.espanol.yahoo.com

Re: Slow execution time when querying view with WHERE clause

От
Tom Lane
Дата:
Jaime Casanova <systemguards@yahoo.com> writes:
> Tom Lane wrote:
>> Why is it so completely off about the selectivity
>> of the IS NULL clause?

> null values are not indexable, is that your question?

Uh, no.  The problem is that the IS NULL condition matched all 48 rows
of the table, but the planner thought it would only match one row.  This
is definitely covered by the pg_stats statistics, and with only 48 live
rows there couldn't possibly have been any sampling error, so what the
heck went wrong there?

            regards, tom lane

Re: Slow execution time when querying view with WHERE clause

От
Tom Lane
Дата:
Mike Mascari <mascarm@mascari.com> writes:
> Tom Lane wrote:
>> Why is it so completely off about the selectivity of the IS NULL clause?

> I think this is a bug in ANALYZE not constructing statistics for columns
> whose data is entirely NULL:

Um ... doh ... analyze.c about line 1550:

    /* We can only compute valid stats if we found some non-null values. */
    if (nonnull_cnt > 0)
       ...

There's a bit of an epistemological issue here: if we didn't actually
find any nonnull values in our sample, is it legitimate to assume that
the column is entirely null?  On the other hand, if we find only "3" in
our sample we will happily assume the column contains only "3", so I
dunno why we are discriminating against null.  This seems like a case
that just hasn't come up before.

            regards, tom lane

Re: Slow execution time when querying view with WHERE clause

От
Mike Mascari
Дата:
Tom Lane wrote:
> Um ... doh ... analyze.c about line 1550:
>
>     /* We can only compute valid stats if we found some non-null values. */
>     if (nonnull_cnt > 0)
>        ...
>
> There's a bit of an epistemological issue here: if we didn't actually
> find any nonnull values in our sample, is it legitimate to assume that
> the column is entirely null?  On the other hand, if we find only "3" in
> our sample we will happily assume the column contains only "3", so I
> dunno why we are discriminating against null.  This seems like a case
> that just hasn't come up before.

Will this discriminatory policy toward null end for 8.0?

Mike Mascari