Re: [HACKERS] Surjective functional indexes

Поиск
Список
Период
Сортировка
От Konstantin Knizhnik
Тема Re: [HACKERS] Surjective functional indexes
Дата
Msg-id 59CD5D94.3080501@postgrespro.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] Surjective functional indexes  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On 09/28/2017 10:10 PM, Robert Haas wrote:
> On Wed, Sep 13, 2017 at 7:00 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If we do have an option it won't be using fancy mathematical
>> terminology at all, it would be described in terms of its function,
>> e.g. recheck_on_update
> +1.

I have nothing against renaming "projection" option to "recheck_on_update" or whatever else is suggested.
Just let me know the best version. From my point of view "recheck_on_update" is too verbose and still not
self-explained(to understand the meaning of this option it is necessary to uunderstand how heap_update works).
"projection"/"non-injective"/...are 
 
more declarative notions, explaining the characteristic of the index, while "recheck_on_update"  is more procedural
notion,explaining behavior of heap_update.
 

>
>> Yes, I'd rather not have an option at all, just some simple code with
>> useful effect, like we have in many other places.
> I think the question we need to be able to answer is: What is the
> probability that an update that would otherwise be non-HOT can be made
> into a HOT update by performing a recheck to see whether the value has
> changed?  It doesn't seem easy to figure that out from any of the
> statistics we have available today or could easily get, because it
> depends not only on the behavior of the expression which appears in
> the index definition but also on the application behavior.  For
> example, consider a JSON blob representing a bank account.
> b->'balance' is likely to change most of the time, but
> b->'account_holder_name' only rarely.  That's going to be hard for an
> automated system to determine.
>
> We should clearly check as many of the other criteria for a HOT update
> as possible before performing a recheck of this type, so that it only
> gets performed when it might help.  For example, if column a is
> indexed and b->'foo' is indexed, there's no point in checking whether
> b->'foo' has changed if we know that a has changed.  I don't know
> whether it would be feasible to postpone deciding whether to do a
> recheck until after we've figured out whether the page seems to
> contain enough free space to allow a HOT update.
>
> Turning non-HOT updates into HOT updates is really good, so it seems
> likely that the rechecks will often be worthwhile.  If we avoid a HOT
> update in 25% of cases, that's probably easily worth the CPU overhead
> of a recheck assuming the function isn't something ridiculously
> expensive to compute; the extra CPU cost will be repaid by reduced
> bloat.  However, if we avoid a HOT update only one time in a million,
> it's probably not worth the cost of recomputing the expression the
> other 999,999 times.  I wonder where the crossover point is -- it
> seems like something that could be figured out by benchmarking.
>
> While I agree that it would be nice to have this be a completely
> automatic determination, I am not sure that will be practical.  I
> oppose overloading some other marker (like function_cost>10000) for
> this; that's too magical.
>
I almost agree with you.
Just few remarks: indexes are rarely created for frequently changed attributes, like b->'balance'.
So in case of proper database schema design it is possible to expect that most of updates are hot updates: do not
actuallyaffect any index.
 
But certainly different attributes may have different probability of been updated.
Unfortunately we do not know before check which attribute of JSON field (or any other fields used in indexed
expression)is changed.
 

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Oleg Bartunov
Дата:
Сообщение: Re: [HACKERS] Surjective functional indexes
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple