Re: Expand applicability of aggregate's sortop optimization

Поиск
Список
Период
Сортировка
От Matthias van de Meent
Тема Re: Expand applicability of aggregate's sortop optimization
Дата
Msg-id CAEze2WhZkbupd35vB8cUJsyNhED-Rd9isvRGjskZ+MO14fo8xA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Expand applicability of aggregate's sortop optimization  (Andrei Lepikhov <lepihov@gmail.com>)
Ответы Re: Expand applicability of aggregate's sortop optimization
Список pgsql-hackers
On Wed, 17 Jul 2024 at 16:09, Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 17/7/2024 16:33, Matthias van de Meent wrote:
> > On Wed, 17 Jul 2024 at 05:29, Andrei Lepikhov <lepihov@gmail.com> wrote:
> >> Thanks for the job! I guess you could be more brave and push down also
> >> FILTER statement.
> >
> > While probably not impossible, I wasn't planning on changing this code
> > with new optimizations; just expanding the applicability of the
> > current optimizations.
> Got it>> As I see, the code:
> >> aggsortop = fetch_agg_sort_op(aggref->aggfnoid);
> >> if (!OidIsValid(aggsortop))
> >>     return false;                /* not a MIN/MAX aggregate */
> >>
> >> used twice and can be evaluated earlier to avoid duplicated code.
> >
> > The code is structured like this to make sure we only start accessing
> > catalogs once we know that all other reasons to bail out from this
> > optimization indicate we can apply the opimization. You'll notice that
> > I've tried to put the cheapest checks that only use caller-supplied
> > information first, and catalog accesses only after that.
> >
> > If the fetch_agg_sort_op clause would be deduplicated, it would either
> > increase code complexity to handle both aggref->aggorder paths, or it
> > would increase the cost of planning MAX(a ORDER BY b) because of the
> > newly added catalog access.
> IMO it looks like a micro optimisation. But I agree, it is more about
> code style - let the committer decide what is better.>> Also, I'm unsure
> about the necessity of looking through the btree
> >> classes. Maybe just to check the commutator to the sortop, like in the
> >> diff attached? Or could you provide an example to support your approach?
> >
> > I think it could work, but I'd be hesitant to rely on that, as
> > commutator registration is optional (useful, but never required for
> > btree operator classes' operators). Looking at the btree operator
> > class, which is the definition of sortability in PostgreSQL, seems
> > more suitable and correct.
> Hm, I dubious about that. Can you provide an example which my variant
> will not pass but your does that correctly?

Here is one:

"""
CREATE OPERATOR @@> (
  function=int4gt, leftarg=int4, rightarg=int4
); CREATE OPERATOR @@>= (
  function=int4ge, leftarg=int4, rightarg=int4
); CREATE OPERATOR @@= (
  function=int4eq, leftarg=int4, rightarg=int4
); CREATE OPERATOR @@<= (
  function=int4le, leftarg=int4, rightarg=int4
); CREATE OPERATOR @@< (
  function=int4lt, leftarg=int4, rightarg=int4
);

CREATE OPERATOR CLASS my_int_ops
  FOR TYPE int
  USING btree AS
  OPERATOR 1 @<@,
  OPERATOR 2 @<=@,
  OPERATOR 3 @=@,
  OPERATOR 4 @>=@,
  OPERATOR 5 @>@,
  FUNCTION 1 btint4cmp;

CREATE AGGREGATE my_int_max (
  BASETYPE = int4,
  SFUNC = int4larger,
  STYPE = int4,
  SORTOP = @>@
);

CREATE TABLE my_table AS
SELECT id::int4 FROM generate_series(1, 10000) id;

CREATE INDEX ON my_table (id my_int_ops);

SELECT my_int_max(id ORDER BY id USING @<@ ) from my_table;
"""

Because the @<@ and @>@ operators are not registered as commutative,
it couldn't apply the optimization in your patch, while the btree
operator check does allow it to apply the optimization.

Aside: Arguably, checking for commutator operators would not be
incorrect when looking at it from "applied operators" point of view,
but if that commutative operator isn't registered as opposite ordering
of the same btree opclass, then we'd probably break some assumptions
of some aggregate's sortop - it could be registered with another
opclass, and that could cause us to select a different btree opclass
(thus: ordering) than is indicated to be required by the aggregate;
the thing we're trying to protect against here.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joe Conway
Дата:
Сообщение: Re: CI, macports, darwin version problems
Следующее
От: Nazir Bilal Yavuz
Дата:
Сообщение: Re: CI, macports, darwin version problems