Re: Selectivity of "=" (Re: [HACKERS] Index not used on simple select)

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Selectivity of "=" (Re: [HACKERS] Index not used on simple select)
Дата
Msg-id 199907272258.SAA19324@candle.pha.pa.us
обсуждение исходный текст
Ответ на Selectivity of "=" (Re: [HACKERS] Index not used on simple select)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> > It looks like the highly skewed distribution of nsn values (what you
> > sent me had 997 '' entries, only 3 non-empty strings) is confusing the
> > selectivity estimation code somehow, such that the system thinks that
> > the query is going to match most of the rows.  Notice it is estimating
> > 995 returned rows for the nsn select!  Under these circumstances it will
> > prefer a sequential scan, since the more-expensive-per-tuple index scan
> > doesn't look like it will be able to avoid reading most of the table.
> > That logic is OK, it's the 0.995 selectivity estimate that's wrong...
> 
> It turns out that the selectivity estimate for an "=" comparison is just
> the attdisbursion statistic calculated by VACUUM ANALYZE, which can be
> roughly defined as the frequency of the most common value in the column.
> (I took statistics too long ago to recall the exact definition.)
> Anyway, given that the test data Ole sent me contains nearly all ''
> entries, I'd say that the 0.995 value is about right for disbursion.

Yes, you are correct, though it does look at potentially one or two
other unique values, depending on the distribution.  It basically
perfectly computes disbursion for unique columns, and columns that
contain only two unique values, and it figures in NULL.  In other cases,
the disbursion is imperfect, but pretty decent.

> My inclination is to hack up eqsel() to never return a selectivity
> estimate larger than, say, 0.5, even when the measured disbursion
> is more.  I am not sure that this is a good idea, however.  Comments?

I would discourage this.  I can imagine many cases there >0.5
selectivites would be valid, i.e. state = "PA".

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Ross J. Reedstrom"
Дата:
Сообщение: Re: [HACKERS] Checking if a system is ELF
Следующее
От: The Hermit Hacker
Дата:
Сообщение: # of Index' Tuples != Heap'