Re: [HACKERS] Surjective functional indexes

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: [HACKERS] Surjective functional indexes
Дата
Msg-id bd2b12b9-ed2a-75d3-c501-128f78831a61@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] Surjective functional indexes  (Simon Riggs <simon@2ndquadrant.com>)
Ответы Re: [HACKERS] Surjective functional indexes  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers

On 14.09.2017 13:19, Simon Riggs wrote:
> On 14 September 2017 at 10:42, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 13.09.2017 14:00, Simon Riggs wrote:
>>> On 13 September 2017 at 11:30, Konstantin Knizhnik
>>> <k.knizhnik@postgrespro.ru> wrote:
>>>
>>>> The only reason of all this discussion about terms is that I need to
>>>> choose
>>>> name for correspondent index option.
>>>> Simon think that we do not need this option at all. In this case we
>>>> should
>>>> not worry about right term.
>>>>   From my point of view, "projection" is quite clear notion and not only
>>>> for
>>>> mathematics. It is also widely used in IT and especially in DBMSes.
>>> If we do have an option it won't be using fancy mathematical
>>> terminology at all, it would be described in terms of its function,
>>> e.g. recheck_on_update
>>>
>>> Yes, I'd rather not have an option at all, just some simple code with
>>> useful effect, like we have in many other places.
>>>
>> Attached please find new version of projection functional index optimization
>> patch.
>> I have implemented very simple autotune strategy: now I use table statistic
>> to compare total number of updates with number of hot updates.
>> If fraction of hot updates is relatively small, then there is no sense to
>> spend time performing extra evaluation of index expression and comparing its
>> old and new values.
>> Right now the formula is the following:
>>
>> #define MIN_UPDATES_THRESHOLD 10
>> #define HOT_RATIO_THRESHOLD   2
>>
>>          if (stat->tuples_updated > MIN_UPDATES_THRESHOLD
>>              && stat->tuples_updated >
>> stat->tuples_hot_updated*HOT_RATIO_THRESHOLD)
>>          {
>>              /* If percent of hot updates is small, then disable projection
>> index function
>>               * optimization to eliminate overhead of extra index expression
>> evaluations.
>>               */
>>              ii->ii_Projection = false;
>>          }
>>
>> This threshold values are pulled out of a hat: I am not sure if this
>> heuristic is right.
>> I will be please to get feedback if such approach to autotune is promising.
> Hmm, not really, but thanks for trying.
>
> This works by looking at overall stats, and only looks at the overall
> HOT %, so its too heavyweight and coarse.
>
> I suggested storing stat info on the relcache and was expecting you
> would look at how often the expression evaluates to new == old. If we
> evaluate new against old many times, then if the success rate is low
> we should stop attempting the comparison. (<10%?)
>
> Another idea:
> If we don't make a check when we should have done then we will get a
> non-HOT update, so we waste time extra time difference between a HOT
> and non-HOT update. If we check and fail we waste time take to perform
> check. So the question is how expensive the check is against how
> expensive a non-HOT update is. Could we simply say we don't bother to
> check functions that have a cost higher than 10000? So if the user
> doesn't want to perform the check they can just increase the cost of
> the function above the check threshold?
>
Attached pleased find one more patch which calculates hot update check 
hit rate more precisely: I have to extended PgStat_StatTabEntry with two 
new fields:
hot_update_hits and hot_update_misses.

Concerning your idea to check cost of index function: it certainly makes 
sense.
The only problems: I do not understand now how to calculate this cost.
It can be easily calculated by optimizer when it is building query 
execution plan.
But inside BuildIndexInfo I have just reference to Relation and have no 
idea how
I can propagate here information about index expression cost from optimizer.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Is it time to kill support for very old servers?
Следующее
От: Jesper Pedersen
Дата:
Сообщение: Re: [HACKERS] [POC] hash partitioning