Remove WindowClause PARTITION BY items belonging to redundant pathkeys

Поиск
Список
Период
Сортировка
От David Rowley
Тема Remove WindowClause PARTITION BY items belonging to redundant pathkeys
Дата
Msg-id CAApHDvo2ji+hdxrxfXtRtsfSVw3to2o1nCO20qimw0dUGK8hcQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Remove WindowClause PARTITION BY items belonging to redundant pathkeys
Список pgsql-hackers
Recently Markus Winand pointed out to me that the PG15 changes made in
[1] to teach the query planner about monotonic window functions
improved the situation for PostgreSQL on his feature/optimization
timeline for PostgreSQL.  These can be seen in [2].

Unfortunately, if you look at the timeline in [2], we're not quite on
green just yet per Markus's "Not with partition by clause (see below)"
caveat.  This is because nodeWindowAgg.c's use_pass_through code must
be enabled when the WindowClause has a PARTITION BY clause.

The reason for this is that we can't just stop spitting out rows from
the WindowAgg when one partition is done as we still need to deal with
rows from any subsequent partitions and we can only get to those by
continuing to read rows until we find rows belonging to the next
partition.

There is however a missed optimisation here when there is a PARTITION
BY clause, but also some qual exists for the column(s) mentioned in
the partition by clause that makes it so only one partition can exist.
A simple example of that is in the following:

EXPLAIN
SELECT *
FROM
  (SELECT
      relkind,
      pg_relation_size(oid) size,
      rank() OVER (PARTITION BY relkind ORDER BY pg_relation_size(oid) DESC
      ) rank
    FROM pg_class)
WHERE relkind = 'r' AND rank <= 10;

(the subquery may be better imagined as a view)

Here, because of the relkind='r' qual being pushed down into the
subquery, effectively that renders the PARTITION BY relkind clause
redundant.

What the attached patch does is process each WindowClause and removes
any items from the PARTITION BY clause that are columns or expressions
relating to redundant PathKeys.

Effectively, this allows the nodeWindowAgg.c code which stops
processing WindowAgg rows when the run condition is met to work as the
PARTITION BY clause is completely removed in the case of the above
query.  Removing the redundant PARTITION BY items also has the added
benefit of not having to needlessly check if the next row belongs to
the same partition as the last row. For the above, that check is a
waste of time as all rows have relkind = 'r'

I passed the patch along to Markus and he kindly confirmed that we're
now green for this particular optimisation.

I'll add this patch to the July commitfest.

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9d9c02ccd
[2] https://use-the-index-luke.com/sql/partial-results/window-functions

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: Let's make PostgreSQL multi-threaded
Следующее
От: Andres Freund
Дата:
Сообщение: Re: win32ver data in meson-built postgres.exe