Re: Force a merge join?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Force a merge join?
Дата
Msg-id 25096.1021766907@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Force a merge join?  (Doug Fields <dfields-pg-general@pexicom.com>)
Список pgsql-general
Doug Fields <dfields-pg-general@pexicom.com> writes:
> So that correlates pretty well with it's guesses as to the top 10:
> {38,192,369,330,332,501,229,493,319,424}

> But not so well as to their relative distributions:
> {0.389667,0.123667,0.0156667,0.013,0.00933333,0.00933333,0.009,0.00866667,0.00833333,0.00833333}

Very curious.  I'd have expected it to recognize 192 as the most common
value, given that actual distribution.  Is the analyze result repeatable?

> Or, if I could tell it to do a
> more detailed sampling during ANALYZE. I could also tell it to keep more
> than the top 10 in the statistics table (SET STATISTICS), but I'm not sure
> what it would buy me, other than forcing a larger sample

The sample size scales linearly with the SET STATISTICS target (more
specifically, with the largest target among the columns being analyzed).
I was just about to suggest that you try setting a larger target and see
if the stats get better.

> How much would I slow the ANALYZE statement, and more importantly, the
> query optimizer, if I told it to keep statistics on the top 200 instead of
> the default 10 values?

200 seems like overkill... 20 or 30 might well be enough.

FWIW, I went back and loaded the test case into 7.2.1, and I still get
the same estimates I showed from current sources.  So there's something
very fishy about your results.  Don't know where to look for the cause
of the discrepancy at the moment.

            regards, tom lane

PS: sorry about the blank post before...

В списке pgsql-general по дате отправления:

Предыдущее
От: Doug Fields
Дата:
Сообщение: Re: Force a merge join?
Следующее
От: "Joel Burton"
Дата:
Сообщение: Walking a view to find all source tables