Re: Proposal: Partitioning Advisor for PostgreSQL

Поиск

Список

Период

Сортировка

От	Julien Rouhaud
Тема	Re: Proposal: Partitioning Advisor for PostgreSQL
Дата	12 июня 2018 г. 13:28:15
Msg-id	CAOBaU_bwG6hX9xOZ3n7xq43ztEwZR4bS78aFj12vfLDL7rPffw@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Proposal: Partitioning Advisor for PostgreSQL (Dilip Kumar <dilipbalaut@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

Hi,

On Tue, Jun 12, 2018 at 11:14 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Thu, May 24, 2018 at 4:16 PM, Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp>
> wrote:
>>
>> This is
>> why we are working on partitioning advisor.  We plan to release the first
>> version
>> of partitioning advisor for PostgreSQL 11, and then, improve it for
>> PostgreSQL 12.
>>
>
> Interesting.

Thanks!

>> - Estimating stats
>> It is complicated because hypothetical partition has no data.  Currently,
>> we compute
>> hypothetical partition's size using clauselist_selectivity() according to
>> their partition
>> bound and original table stats.  As a result, estimate is done with low
>> accuracy,
>> especially if there is WHERE clause.  We will improve during developing,
>> but for now,
>> we don't have good ideas.
>
>
> I haven't yet read the patch but curious to know.  Suppose we have table
> which is already loaded with some data.  Now, if I create  hypothetical
> partitions on that will we create any stat data (mcv, histogram) for
> hypothetical table? because, in this case we already have the data from the
> main table and we also have partition boundary for the hypothetical table.
> I am not sure you are already doing this or its an open item?

For now we're simply using the original table statistics, and
appending the partition bounds as qual on the hypothetical partition.
It'll give good result if the query doesn't have quals for the table,
or for simple cases where selectivity functions understand that
expressions such as

(id BETWEEN 1 AND 1000000) AND (id < 6)

will return only 5 rows, while they can't for expressions like

(id IN (x,y...)) AND (id < z)

In this second case, the estimates are for now therefore quite wrong.
I think that we'd have no other choice than to generate hypothetical
statistics according to the partition bounds, and only compute
selectivity based on the query quals.  It's definitely not simple to
do, but it should be doable with the hooks currently available.

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Proposal: Partitioning Advisor for PostgreSQL