Re: Index Skip Scan

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Index Skip Scan
Дата
Msg-id CAApHDvoETokg2pJ89PgK0ExryRS=_0p6tYvrQnL1GkEEJTyXBg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Index Skip Scan  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Sat, 7 Mar 2020 at 03:49, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Wed, Mar 04, 2020 at 11:32:00AM +1300, David Rowley wrote:
> >The reason it must be done this way is that when the RelOptInfo that
> >we're performing the DISTINCT on is a joinrel, then we're not going to
> >see any IndexPaths in the RelOptInfo's pathlist. We'll have some sort
> >of Join path instead.  I understand you're not yet supporting doing
> >this optimisation when joins are involved, but it should be coded in
> >such a way that it'll work when we do.  (It's probably also a separate
> >question as to whether we should have this only work when there are no
> >joins. I don't think I personally object to it for stage 1, but
> >perhaps someone else might think differently.)
> >
>
> I don't follow. Can you elaborate more?
>
> AFAICS skip-scan is essentially a capability of an (index) AM. I don't
> see how we could ever do that for joinrels? We can do that at the scan
> level, below a join, but that's what this patch already supports, I
> think. When you say "supporting this optimisation" with joins, do you
> mean doing skip-scan for join inputs, or on top of the join?

The skip index scan Path would still be created at the base rel level,
but the join path on the join relation would have one of the sub-paths
of the join as an index skip scan.

An example query that could make use of this is:

SELECT * FROM some_table WHERE a IN(SELECT
indexed_col_with_few_distinct_values FROM big_table);

In this case, we might want to create a Skip Scan path on big_table
using the index on the "indexed_col_with_few_distinct_values", then
Hash Join to "some_table". That class of query is likely stage 2 or 3
of this work, but we need to lay foundations that'll support it.

As for not having IndexScan paths in joinrels. Yes, of course, but
that's exactly why create_distinct_paths() cannot work the way the
patch currently codes it. The patch does:

+ /*
+ * XXX: In case of index scan quals evaluation happens
+ * after ExecScanFetch, which means skip results could be
+ * fitered out. Consider the following query:
+ *
+ * select distinct (a, b) a, b, c from t where  c < 100;
+ *
+ * Skip scan returns one tuple for one distinct set of (a,
+ * b) with arbitrary one of c, so if the choosed c does
+ * not match the qual and there is any c that matches the
+ * qual, we miss that tuple.
+ */
+ if (path->pathtype == T_IndexScan &&

which will never work for join relations since they'll only have paths
for Loop/Merge/Hash type joins.  The key here is to determine which
skip scan paths we should create when we're building the normal index
paths then see if we can make use of those when planning joins.
Subsequently, we'll then see if we can make use of the resulting join
paths during create_distinct_paths(). Doing it this way will allow us
to use skip scans in queries such as:

SELECT DISTINCT t1.z FROM t1 INNER JOIN t2 ON t1.a = t2.unique_col;

We'll first create the skip scan paths on t1, then when creating the
join paths we'll create additional join paths which use the skipscan
path. Because t1.unique_col will at most have 1 join partner for each
t2 row, then the join path will have the same unique_keys as the
skipscan path.  That'll allow us to use the join path which has the
skip scan on whichever side of the join the t1 relation ends up.  All
create_distinct_paths() should be doing is looking for paths that are
already implicitly unique on the distinct clause and consider using
those in a cost-based way. It shouldn't be making such paths itself.

> >For storing these new paths with UniqueKeys, I'm not sure exactly if
> >we can just add_path() such paths into the RelOptInfo's pathlist.
> >What we don't want to do is accidentally make use of paths which
> >eliminate duplicate values when we don't want that behaviour.  If we
> >did store these paths in RelOptInfo->pathlist then we'd need to go and
> >modify a bunch of places to ignore such paths. set_cheapest() would
> >have to do something special for them too, which makes me think
> >pathlist is the incorrect place.  Parallel query added
> >partial_pathlist, so perhaps we need unique_pathlist to make this
> >work.
> >
>
> Hmmm, good point. Do we actually produce incorrect plans with the
> current patch, using skip-scan path when we should not?

I don't think so. The patch is only creating skip scan paths on the
base rel when we discover it's valid to do so.  That's not the way it
should work though.  How the patch currently works would be similar to
initially only creating a SeqScan path for a query such as: SELECT *
FROM tab ORDER BY a;, but then, during  create_ordered_paths() go and
create some IndexPath to scan the btree index on tab.a because we
suddenly realise that it'll be good to use that for the ORDER BY.
The planner does not work that way.  We always create all the paths
that we think will be useful during set_base_rel_pathlists(). We then
make use of only existing paths in the upper planner.  See what
build_index_paths() in particular:

/* see if we can generate ordering operators for query_pathkeys */
match_pathkeys_to_index(index, root->query_pathkeys,
&orderbyclauses,
&orderbyclausecols);

We'll need something similar to that but for the query_uniquekeys and
ensure we build the skip scan paths when we think they'll be useful
and do so during the call to set_base_rel_pathlists().  Later in stage
2 or 3, we can go build skip scan paths when there are semi/anti joins
that could make use of them. Making that work will just be some
plumbing work in build_index_paths() and making use of those paths
during add_paths_to_joinrel().

David



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: Some problems of recovery conflict wait events
Следующее
От: Isaac Morland
Дата:
Сообщение: Re: range_agg