Re: Expanding HOT updates for expression and partial indexes

From Greg Burd
Subject Re: Expanding HOT updates for expression and partial indexes
Date
Msg-id C44FBBC0-6DA3-41D4-A389-35C9313157A8@greg.burd.me
In reply to Re: Expanding HOT updates for expression and partial indexes  (Greg Burd <greg@burd.me>)
List pgsql-hackers
I've updated the patch set a tad and I've got some benchmark results
(and questions).

PATCHES
===========================================================

* 0001 - Prepare heapam_tuple_update() and simple_heap_update() for divergence

Unchanged.

* 0002 - Track changed indexed columns in the executor during UPDATEs

Minor bug/oversight fix related to partial index attributes.

Also, I mistakenly said that v25 removed the $subject (the ability for
expression indexes to go HOT).  That's not true: they can go HOT with
this patch provided that the expression, evaluated against the before
and after attribute values, yields results that are equal according to
datumIsEqual().  When that is the case, as can happen when an update
touches fields within a JSONB column while the indexes are on other
fields, the update can be HOT provided the heap finds room on the page
to store the new tuple.
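
To make the expression-index rule concrete, here is a minimal standalone
sketch, not code from the patch.  Every name in it (SketchDatum,
SketchRow, sketch_index_expr, and so on) is hypothetical, and the
byte-wise comparison only imitates the spirit of datumIsEqual(), which
compares the stored bytes rather than calling a type-specific equality
operator.  The row stands in for a JSONB document with two fields, of
which only "name" is indexed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a by-reference Datum: raw bytes plus a length. */
typedef struct SketchDatum
{
    const void *data;
    size_t      len;
} SketchDatum;

/* Byte-wise equality, imitating datumIsEqual(): no type-aware operator. */
static bool
sketch_datum_is_equal(SketchDatum a, SketchDatum b)
{
    if (a.len != b.len)
        return false;
    return a.len == 0 || memcmp(a.data, b.data, a.len) == 0;
}

/* Hypothetical row standing in for a JSONB document with two fields. */
typedef struct SketchRow
{
    const char *name;    /* the indexed field, think (doc->>'name') */
    int         visits;  /* a field that no index covers */
} SketchRow;

/* The indexed expression: extracts only the "name" field. */
static SketchDatum
sketch_index_expr(const SketchRow *row)
{
    assert(row != NULL && row->name != NULL);
    SketchDatum d = { row->name, strlen(row->name) };
    return d;
}

/*
 * True when the expression index does not block HOT: the expression
 * yields byte-identical results for the old and new row versions.
 * Whether the update really goes HOT still depends on free page space.
 */
static bool
sketch_expr_index_allows_hot(const SketchRow *old_row, const SketchRow *new_row)
{
    return sketch_datum_is_equal(sketch_index_expr(old_row),
                                 sketch_index_expr(new_row));
}
```

In this sketch, bumping "visits" while leaving "name" alone keeps the
index from blocking HOT, whereas changing "name" does not.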

* 0003 - Replace index_unchanged_by_update() with ri_ChangedIndexedCols

Unchanged.

* 0004 - Identify if partial indexes are impacted by an update

This is the new piece; it existed in the v24 patch set and now it is
back.  It evaluates the partial index predicate against the before and
after tuples, and when both fall outside the predicate the heap may be
able to use the HOT path.  Until now, any update to an attribute
referenced by an index was (and on master still is) HOT blocking, even
if the row was outside the predicate.
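
The predicate check can be sketched as below.  This is an illustration
under stated assumptions rather than the patch's implementation: the
table, the index definition, and all sketch_* names are made up, and
only the "both outside the predicate" branch comes from the description
above.  The other two branches are my own guess at the natural
completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical row of a hypothetical "orders" table. */
typedef struct SketchOrder
{
    int  id;
    bool shipped;
    int  priority;
} SketchOrder;

/*
 * Predicate of a hypothetical partial index:
 *   CREATE INDEX ON orders (priority) WHERE NOT shipped;
 */
static bool
sketch_predicate(const SketchOrder *row)
{
    return !row->shipped;
}

/* True when the update requires touching the partial index. */
static bool
sketch_partial_index_impacted(const SketchOrder *old_row,
                              const SketchOrder *new_row)
{
    bool old_in = sketch_predicate(old_row);
    bool new_in = sketch_predicate(new_row);

    /* Both versions outside the predicate: the index holds no entry
     * for this row and needs none, so it does not block HOT. */
    if (!old_in && !new_in)
        return false;

    /* Assumption: a row entering or leaving the index always matters. */
    if (old_in != new_in)
        return true;

    /* Assumption: both inside, so only a change to the key matters. */
    return old_row->priority != new_row->priority;
}
```

With this definition, changing "priority" on an already shipped order
leaves the index unaffected, which is exactly the case that used to be
HOT blocking.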


SUMMARY
===========================================================

I've just started to scratch the surface of performance testing for
this.  Attached is a very simple comparison of master vs. patch for a
basic update load that should always go HOT in either case.  It shows
about 1% variance between the two (built with -O0, run on my laptop),
so that's essentially no difference, despite the added overhead of the
new function and the fact that it seems to be called more frequently
due to (guessing here) more opportunity for TM_Updated to be the return
from heapam_tuple_update.  Your thoughts are welcome here, as are
best/worst case ideas for tests to run.

Next up I plan to layer the controversial type-specific piece into this
patch set, if nothing else as a record of what's left over.  Then
I'll try to better isolate good/bad performance implications of this
patch set.

Ideally, this patch set and the one (under development) for catalog
tuples could combine to completely restructure the heap update process
and open the door to more HOT updates and faster catalog updates.  But,
I still have to demonstrate that.  For JSONB-heavy applications this
should be a net win; for the rest it should be a minor or zero
regression.  For other custom implementations of indexes over
specialized types (as is the case for the newly open-sourced DocumentDB
work) this opens the door for HOT updates when possible.  All of that is
the hope; it's time to measure hope against reality. :)

This patch set does start to move the executor away from a heap-specific
view of the world where updates are all/none/summarizing.  This
potentially eases the integration of WARM or PHOT-like solutions where
we only update those indexes that are materially impacted by an update. 
As should be clear by now, that's my ultimate goal.
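
As a rough illustration of "only update those indexes that are
materially impacted", the sketch below picks indexes by overlap with the
set of changed columns.  It is a toy model, not the executor code: a
plain 64-bit mask stands in for the Bitmapset behind
ri_ChangedIndexedCols, and SketchIndex, SKETCH_COL and
sketch_indexes_to_update are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy column set: bit n set means column n is a member. */
typedef uint64_t SketchColSet;

#define SKETCH_COL(n) ((SketchColSet) 1 << (n))

/* Hypothetical index descriptor: a name and the columns it references. */
typedef struct SketchIndex
{
    const char  *name;
    SketchColSet cols;
} SketchIndex;

/*
 * Returns a mask over the index array: bit i is set when index i
 * references at least one changed column and so needs a new entry.
 * A result of zero means no index is affected by the update.
 */
static uint32_t
sketch_indexes_to_update(const SketchIndex *indexes, size_t nindexes,
                         SketchColSet changed)
{
    uint32_t result = 0;

    assert(nindexes <= 32);     /* the result mask has only 32 bits */
    for (size_t i = 0; i < nindexes; i++)
    {
        if ((indexes[i].cols & changed) != 0)
            result |= (uint32_t) 1 << i;
    }
    return result;
}
```

An all-or-none scheme collapses this mask to a single yes/no answer; a
WARM or PHOT-like scheme would instead act on each bit separately.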

best.

-greg
