> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> * Prevent index lookups (or index entries using partial index) on most
> >>>> common values; instead use sequential scan
> >>
> >> This behavior already exists for the most common value, and would
> >> exist for any additional values that we had stats for. Don't see
> >> why you think a separate TODO item is needed.
>
> > You mean the optimizer already skips an index lookup for the most common
> > value, and instead does a sequential scan?
>
> No, it goes for the sequential scan if it estimates the cost of the
> indexscan as more than sequential. Indexscan cost depends on estimated
> number of retrieved rows --- which it can estimate from pg_statistic
> if the query is WHERE column = mostcommonvalue. So which plan you get
> depends on just how common the most common value is.
>
> Hard-wiring either choice of plan for the most common value would be
> inferior to what the code already does, AFAICS. But for values other
> than the-most-common, we don't have adequate stats in pg_statistic,
> and so you may or may not get a good estimated row count and hence
> a good choice of plan. That's what needs to be fixed.
OK, I remember now. If the most common value is used as a constant, it
uses the value from pg_statistic for most common, rather than use
the dispersion value. That is great.
What I am more concerned about is a join that uses the most common
value. We do an index scan in that case. I wonder of we could get
something into the executor that would switch to sequential scan when
the most common value is hit. Is that worth persuing?
-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610)
853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill,
Pennsylvania19026