Re: Proposal: Partitioning Advisor for PostgreSQL

Поиск
Список
Период
Сортировка
От Julien Rouhaud
Тема Re: Proposal: Partitioning Advisor for PostgreSQL
Дата
Msg-id CAOBaU_bwG6hX9xOZ3n7xq43ztEwZR4bS78aFj12vfLDL7rPffw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Proposal: Partitioning Advisor for PostgreSQL  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
Hi,

On Tue, Jun 12, 2018 at 11:14 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Thu, May 24, 2018 at 4:16 PM, Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp>
> wrote:
>>
>> This is
>> why we are working on partitioning advisor.  We plan to release the first
>> version
>> of partitioning advisor for PostgreSQL 11, and then, improve it for
>> PostgreSQL 12.
>>
>
> Interesting.

Thanks!

>> - Estimating stats
>> It is complicated because hypothetical partition has no data.  Currently,
>> we compute
>> hypothetical partition's size using clauselist_selectivity() according to
>> their partition
>> bound and original table stats.  As a result, estimate is done with low
>> accuracy,
>> especially if there is WHERE clause.  We will improve during developing,
>> but for now,
>> we don't have good ideas.
>
>
> I haven't yet read the patch but curious to know.  Suppose we have table
> which is already loaded with some data.  Now, if I create  hypothetical
> partitions on that will we create any stat data (mcv, histogram) for
> hypothetical table? because, in this case we already have the data from the
> main table and we also have partition boundary for the hypothetical table.
> I am not sure you are already doing this or its an open item?


For now we're simply using the original table statistics, and
appending the partition bounds as qual on the hypothetical partition.
It'll give good result if the query doesn't have quals for the table,
or for simple cases where selectivity functions understand that
expressions such as

(id BETWEEN 1 AND 1000000) AND (id < 6)

will return only 5 rows, while they can't for expressions like

(id IN (x,y...)) AND (id < z)

In this second case, the estimates are for now therefore quite wrong.
I think that we'd have no other choice than to generate hypothetical
statistics according to the partition bounds, and only compute
selectivity based on the query quals.  It's definitely not simple to
do, but it should be doable with the hooks currently available.


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: Proposal: Partitioning Advisor for PostgreSQL
Следующее
От: Geoff Winkless
Дата:
Сообщение: late binding of shared libs for C functions