Обсуждение: bad selectivity estimates for CASE

Поиск
Список
Период
Сортировка

bad selectivity estimates for CASE

От
"Robert Haas"
Дата:
While looking at a complex query that is being poorly planned by
PostgreSQL 8.2.9, I discovered that any non-trivial CASE...WHEN
expression seems to produce a selectivity estimate of 0.005.  This
also happens on HEAD.

psql (8.4devel)
Type "help" for help.

head=# create table tenk (c) as select generate_series(1,10000);
SELECT
head=# alter table tenk alter column c set statistics 100;
ALTER TABLE
head=# analyze tenk;
ANALYZE
head=# explain select * from tenk where c in (1,2,3,4);
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on tenk  (cost=0.00..190.00 rows=4 width=4)
   Filter: (c = ANY ('{1,2,3,4}'::integer[]))
(2 rows)

head=# explain select * from tenk where case when c in (1,2,3,4) then 1 end = 1;
                                         QUERY PLAN

--------------------------------------------------------------------------------
------------
 Seq Scan on tenk  (cost=0.00..215.00 rows=50 width=4)
   Filter: (CASE WHEN (c = ANY ('{1,2,3,4}'::integer[])) THEN 1 ELSE NULL::integ
er END = 1)
(2 rows)

head=# explain select * from tenk where case when c in (1,2,3,4) then 2 end = 1;
                                         QUERY PLAN

--------------------------------------------------------------------------------
------------
 Seq Scan on tenk  (cost=0.00..215.00 rows=50 width=4)
   Filter: (CASE WHEN (c = ANY ('{1,2,3,4}'::integer[])) THEN 2 ELSE NULL::integ
er END = 1)
(2 rows)

head=# \q

The last example is particularly egregious, since it can never return
true, but the previous example is not much better, since in my actual
query the actual selectivity (against a CASE with multiple WHEN
branches) can be as high as ~0.8, so a value of 0.005 isn't close.  It
ends up causing a very expensive nested loop plan when something else
would be better.

Any suggestions would be appreciated.

...Robert

Re: bad selectivity estimates for CASE

От
Tom Lane
Дата:
"Robert Haas" <robertmhaas@gmail.com> writes:
> While looking at a complex query that is being poorly planned by
> PostgreSQL 8.2.9, I discovered that any non-trivial CASE...WHEN
> expression seems to produce a selectivity estimate of 0.005.

If you have an idea for a non-silly estimate, feel free to enlighten
us...

            regards, tom lane

Re: bad selectivity estimates for CASE

От
"Robert Haas"
Дата:
On Mon, Jan 5, 2009 at 11:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Robert Haas" <robertmhaas@gmail.com> writes:
>> While looking at a complex query that is being poorly planned by
>> PostgreSQL 8.2.9, I discovered that any non-trivial CASE...WHEN
>> expression seems to produce a selectivity estimate of 0.005.
>
> If you have an idea for a non-silly estimate, feel free to enlighten
> us...

Well, presumably CASE WHEN <expr1> THEN <constant1> WHEN <expr2> THEN
<constant2> WHEN <expr3> THEN <constant3> ... END = <constantn> could
be simplified to <exprn>.  But that's not going to happen in time to
do me any good on this query, if it ever happens (and might not be
sufficient anyway since the selectivity estimates of <expr1> may not
be very good either), so I was more looking for suggestions on coping
with the situation, since I'm sure that I'm not the first person to
have this type of problem.

...Robert

Re: bad selectivity estimates for CASE

От
Tom Lane
Дата:
"Robert Haas" <robertmhaas@gmail.com> writes:
> Well, presumably CASE WHEN <expr1> THEN <constant1> WHEN <expr2> THEN
> <constant2> WHEN <expr3> THEN <constant3> ... END = <constantn> could
> be simplified to <exprn>.

Not without breaking the order-of-evaluation guarantees.  Consider

    case when x=0 then 0 when 1/x = 42 then 1 end = 1

This expression should not suffer a divide-by-zero failure but your
proposal would allow it to do so.

            regards, tom lane