Re: [HACKERS] Surjective functional indexes

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: [HACKERS] Surjective functional indexes
Дата
Msg-id 251f2f30-7e38-f417-5eaf-fd9f11d00932@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] Surjective functional indexes  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers

On 14.09.2017 18:53, Simon Riggs wrote:
>
>>> This works by looking at overall stats, and only looks at the overall
>>> HOT %, so its too heavyweight and coarse.
>>>
>>> I suggested storing stat info on the relcache and was expecting you
>>> would look at how often the expression evaluates to new == old. If we
>>> evaluate new against old many times, then if the success rate is low
>>> we should stop attempting the comparison. (<10%?)
>>>
>>> Another idea:
>>> If we don't make a check when we should have done then we will get a
>>> non-HOT update, so we waste time extra time difference between a HOT
>>> and non-HOT update. If we check and fail we waste time take to perform
>>> check. So the question is how expensive the check is against how
>>> expensive a non-HOT update is. Could we simply say we don't bother to
>>> check functions that have a cost higher than 10000? So if the user
>>> doesn't want to perform the check they can just increase the cost of
>>> the function above the check threshold?
>>>
>> Attached pleased find one more patch which calculates hot update check hit
>> rate more precisely: I have to extended PgStat_StatTabEntry with two new
>> fields:
>> hot_update_hits and hot_update_misses.
> It's not going to work, as already mentioned above. Those stats are at
> table level and very little to do with this particular index.
>
> But you've not commented on the design I mention that can work: index relcache.
Sorry, I do not completely agree with you.
Yes, certainly whether functional index is projective or not is property 
of the index, not of the table.
But the decision whether hot update is applicable or not is made for the 
whole table - for all indexes.
If a value of just one indexed expressions is changed then we can not 
use hot update and have to update all indexes.

Assume that we have table with "bookinfo" field of type JSONB.
And we create several functional indexes on this column: 
(bookinfo->'isbn'), (bookinfo->'title'), (bookinfo->'author'), 
(bookinfo->'rating').
Probability that indexed expression is changed is case of updating 
"bookinfo" field my be different for all this three indexes.
But there is completely no sense to check if 'isbn' is changed or not, 
if we already detect that most updates cause change of 'rating' 
attribute and
so comparing old and new values of (bookinfo->'rating') is just waste of 
time. In this case we should not also compare (bookinfo->'isbn') and
other indexed expressions because for already rejected possibility of 
hot update.

So despite to the fact that this information depends on particular 
index, it affects behavior of the whole table and it is reasonable (and 
simpler) to collect it in table's statistic.

>> Concerning your idea to check cost of index function: it certainly makes
>> sense.
>> The only problems: I do not understand now how to calculate this cost.
>> It can be easily calculated by optimizer when it is building query execution
>> plan.
>> But inside BuildIndexInfo I have just reference to Relation and have no idea
>> how
>> I can propagate here information about index expression cost from optimizer.
> We could copy at create index, if we took that route. Or we can look
> up the cost for the index expression and cache it.
>
>
> Anyway, this is just jumping around because we still have a parameter
> and the idea was to remove the parameter entirely by autotuning, which
> I think is both useful and possible, just as HOT itself is autotuned.
>

Hot update in almost all cases is preferable to normal update, causing 
update of indexes.
There are can be some scenarios when hot updates reduce speed of some 
queries,
but it is very difficult to predict such cases user level.

But usually nature of index is well known by DBA or programmer. In 
almost all cases it is clear for person creating functional index 
whether it will perform projection or not
and whether comparing old/new expression value makes sense or is just 
waste of time. We can guess it from autotune, but such decision may be 
wrong (just because of application
business logic). Postgres indexes already have a lot of options. And I 
think that "projection" option (or whatever we name it) is also needed.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] [PATCH] Improve geometric types