Re: benchmarking the query planner

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: benchmarking the query planner
Дата
Msg-id 15327.1229047970@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: benchmarking the query planner  ("Robert Haas" <robertmhaas@gmail.com>)
Ответы Re: benchmarking the query planner  ("Robert Haas" <robertmhaas@gmail.com>)
Список pgsql-hackers
"Robert Haas" <robertmhaas@gmail.com> writes:
>> It might be best to stop when the frequency drops below some threshold,
>> rather than taking a fixed number of entries.

> OK, I'll bite.  How do we decide where to put the cutoff?  If we make
> it too high, it will have a negative effect on join selectivity
> estimates; if it's too low, it won't really address the problem we're
> trying to fix.  I randomly propose p = 0.001, which should limit
> eqjoinsel() to about a million equality tests in the worst case.  In
> the synthetic example we were just benchmarking, that causes the
> entire MCV array to be tossed out the window (which feels about
> right).

Yeah.  One idle thought I had was that maybe the cutoff needs to
consider both probabilities: if the high-frequency MCVs on one side
chance to match to not-so-high-frequency MCVs on the other, you
would like to know about that.  As long as we keep the lists in
frequency order, it seems easy to implement this: for each MCV
examined by the outer loop, you run the inner loop until the product of
the outer and current inner frequency drops below whatever your
threshold is.  This doesn't immediately suggest what the threshold
ought to be, but it seems like it ought to be possible to determine
that given a desired maximum error in the overall estimate.  I'm also
not very clear on what the "total frequency" computations (matchfreq2
and unmatchfreq2 in the current code) ought to look like if we are using
a variable subset of the inner list.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: KaiGai Kohei
Дата:
Сообщение: Re: Updates of SE-PostgreSQL 8.4devel patches (r1268)
Следующее
От: "Greg Stark"
Дата:
Сообщение: Re: benchmarking the query planner