On Tue, Dec 3, 2019 at 12:13 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The new criterion/heuristic for unique indexes is very simple: If a
> unique index has an existing item that is a duplicate of the incoming
> item at the point that we might have to split the page, then apply
> deduplication. Otherwise (when the incoming item has no duplicates),
> don't apply deduplication at all -- just accept that we'll have to
> split the page. We already cache the bounds of our initial binary
> search in the insert state, so we can reuse that information within
> _bt_findinsertloc() when considering deduplication in unique indexes.
Attached is v26, which adds this new criterion/heuristic for unique
indexes. We now seem to consistently get good results with unique
indexes.
Other changes:
* A commit message is now included for the main patch/commit.
* The btree_deduplication GUC is now a boolean, since it is no longer
up to the user to indicate when deduplication is appropriate in unique
indexes (the new heuristic does that instead). The GUC now only
affects non-unique indexes.
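For example, in postgresql.conf (using the GUC name as it stands in this
version of the patch):

```
# Enables deduplication in non-unique B-tree indexes; unique indexes
# now apply the heuristic internally regardless of this setting.
btree_deduplication = on
```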
* Simplified the user docs. They now only mention deduplication of
unique indexes in passing, in line with the general idea that
deduplication in unique indexes is an internal optimization.
* Fixed a bug that made backwards scans that touch posting lists fail
to set LP_DEAD bits when that was possible (i.e., the kill_prior_tuple
optimization wasn't always applied to posting lists, for no good
reason). Also documented the assumptions made by the new code in
_bt_readpage()/_bt_killitems() -- if that had been clearer in the first
place, the LP_DEAD/kill_prior_tuple bug might never have happened.
* Fixed some memory leaks in nbtree VACUUM.
Still waiting for some review of the first patch, to get it out of the
way. Anastasia?
--
Peter Geoghegan