Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

Поиск
Список
Период
Сортировка
On Wed, Jul 26, 2023 at 5:29 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> Considering that it caches/reuses the page across SAOP operations, can
> (or does) this also improve performance for index scans on the outer
> side of a join if the order of join columns matches the order of the
> index?

It doesn't really cache leaf pages at all. What it does is advance the
array keys locally, while the original buffer lock is still held on
that same page.

> That is, I believe this caches (leaf) pages across scan keys, but can
> (or does) it also reuse these already-cached leaf pages across
> restarts of the index scan/across multiple index lookups in the same
> plan node, so that retrieval of nearby index values does not need to
> do an index traversal?

I'm not sure what you mean. There is no reason why you need to do more
than one single descent of an index to scan many leaf pages using many
distinct sets of array keys. Obviously, this depends on being able to
observe that we really don't need to redescend the index to advance
the array keys, again and again. Note in particularly that this
usually works across leaf pages.

> I'm not sure I understand. MDAM seems to work on an index level to
> return full ranges of values, while "skip scan" seems to try to allow
> systems to signal to the index to skip to some other index condition
> based on arbitrary cutoffs. This would usually be those of which the
> information is not stored in the index, such as "SELECT user_id FROM
> orders GROUP BY user_id HAVING COUNT(*) > 10", where the scan would go
> though the user_id index and skip to the next user_id value when it
> gets enough rows of a matching result (where "enough" is determined
> above the index AM's plan node, or otherwise is impossible to
> determine with only the scan key info in the index AM). I'm not sure
> how this could work without specifically adding skip scan-related
> index AM functionality, and I don't see how it fits in with this
> MDAM/SAOP system.

I think of that as being quite a different thing.

Basically, the patch that added that feature had to revise the index
AM API, in order to support a mode of operation where scans return
groupings rather than tuples. Whereas this patch requires none of
that. It makes affected index scans as similar as possible to
conventional index scans.

> > [...]
> >
> > Thoughts?
>
> MDAM seems to require exponential storage for "scan key operations"
> for conditions on N columns (to be precise, the product of the number
> of distinct conditions on each column); e.g. an index on mytable
> (a,b,c,d,e,f,g,h) with conditions "a IN (1, 2) AND b IN (1, 2) AND ...
> AND h IN (1, 2)" would require 2^8 entries.

Note that I haven't actually changed anything about the way that the
state machine generates new sets of single value predicates -- it's
still just cycling through each distinct set of array keys in the
patch.

What you describe is a problem in theory, but I doubt that it's a
problem in practice. You don't actually have to materialize the
predicates up-front, or at all. Plus you can skip over them using the
next index tuple. So skipping works both ways.

--
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Zhang Mingli
Дата:
Сообщение: Re: [PATCH] Check more invariants during syscache initialization
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: cataloguing NOT NULL constraints