Re: benchmarking the query planner

From	Robert Haas
Subject	Re: benchmarking the query planner
Date	12 December 2008, 02:05:05
Msg-id	603c8f070812111904j4b839c17sabde937e3d4cf08a@mail.gmail.com
In reply to	Re: benchmarking the query planner (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: benchmarking the query planner
List	pgsql-hackers

>> OK, I'll bite.  How do we decide where to put the cutoff?  If we make
>> it too high, it will have a negative effect on join selectivity
>> estimates; if it's too low, it won't really address the problem we're
>> trying to fix.  I randomly propose p = 0.001, which should limit
>> eqjoinsel() to about a million equality tests in the worst case.  In
>> the synthetic example we were just benchmarking, that causes the
>> entire MCV array to be tossed out the window (which feels about
>> right).
>
> Yeah.  One idle thought I had was that maybe the cutoff needs to
> consider both probabilities: if the high-frequency MCVs on one side
> chance to match to not-so-high-frequency MCVs on the other, you
> would like to know about that.  As long as we keep the lists in
> frequency order, it seems easy to implement this: for each MCV
> examined by the outer loop, you run the inner loop until the product of
> the outer and current inner frequency drops below whatever your
> threshold is.  This doesn't immediately suggest what the threshold

I had this idle thought too, but I didn't write it down because...

> ought to be, but it seems like it ought to be possible to determine
> that given a desired maximum error in the overall estimate.  I'm also
> not very clear on what the "total frequency" computations (matchfreq2
> and unmatchfreq2 in the current code) ought to look like if we are using
> a variable subset of the inner list.

...of this exact concern, which I think is an insurmountable problem.
If you don't consider some of the MCVs AT ALL, I think you can add
their frequencies back in to otherfreq{1,2} and go home, but if you
consider them for some elements of the other list but not all, I'm not
sure there's an appropriate way to proceed.
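
To make the "don't consider them AT ALL" case concrete, here is a rough sketch in Python rather than the planner's C. It is not the actual eqjoinsel() code: the function names `split_mcvs` and `match_with_cutoff`, the (value, frequency) list layout, and the simplified nested loop are all assumptions made for illustration. MCVs below the cutoff are dropped on each side independently and their frequencies are folded back into that side's otherfreq. Since frequencies sum to at most 1, no more than 1/cutoff entries per side can survive, which is where the "about a million equality tests" figure for p = 0.001 comes from (1000 x 1000).

```python
def split_mcvs(mcvs, cutoff):
    """Split an MCV list into kept entries and the total frequency dropped.

    mcvs is a list of (value, frequency) pairs in descending frequency
    order (an assumed layout, not PostgreSQL's internal one).
    """
    kept = [(v, f) for v, f in mcvs if f >= cutoff]
    dropped_freq = sum(f for _, f in mcvs if f < cutoff)
    return kept, dropped_freq


def match_with_cutoff(mcvs1, otherfreq1, mcvs2, otherfreq2, cutoff):
    """Match two MCV lists after discarding entries below the cutoff.

    Returns (matchprodfreq, otherfreq1, otherfreq2, comparisons), where
    matchprodfreq is the summed product of frequencies of matching MCVs
    and comparisons counts the equality tests actually performed.
    """
    kept1, dropped1 = split_mcvs(mcvs1, cutoff)
    kept2, dropped2 = split_mcvs(mcvs2, cutoff)

    # A dropped MCV is treated as if it had never been in the list, so
    # its frequency goes back into the "everything else" bucket.
    otherfreq1 += dropped1
    otherfreq2 += dropped2

    matchprodfreq = 0.0
    comparisons = 0
    for v1, f1 in kept1:
        for v2, f2 in kept2:
            comparisons += 1
            if v1 == v2:
                matchprodfreq += f1 * f2
                break  # values within one MCV list are distinct
    return matchprodfreq, otherfreq1, otherfreq2, comparisons


if __name__ == "__main__":
    side1 = [("a", 0.5), ("b", 0.3), ("c", 0.0005)]
    side2 = [("a", 0.4), ("c", 0.0004)]
    print(match_with_cutoff(side1, 0.1995, side2, 0.5996, cutoff=0.001))
```

Every kept MCV here is compared against the same kept subset of the other list, so the bookkeeping stays consistent. The variable-subset idea breaks exactly that property: a given inner MCV would be "in the list" for some outer elements and not for others, and the sketch has no obvious place to put its frequency.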

...Robert
