Re: Memory-Bounded Hash Aggregation

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Memory-Bounded Hash Aggregation
Дата
Msg-id 1b5c79a14f85e20899b100381c1e8d41e3148c5b.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: Memory-Bounded Hash Aggregation  (Adam Lee <ali@pivotal.io>)
Ответы Re: Memory-Bounded Hash Aggregation  (Adam Lee <ali@pivotal.io>)
Список pgsql-hackers
On Wed, 2019-12-04 at 19:50 -0800, Adam Lee wrote:
> On Wed, Dec 04, 2019 at 06:55:43PM -0800, Jeff Davis wrote:
> > 
> > Thanks very much for a great review! I've attached a new patch.
> 
> Hi,
> 
> About the `TODO: project needed attributes only` in your patch, when
> would the input tuple contain columns not needed? It seems like
> anything
> you can project has to be in the group or aggregates.

If you have a table like:

   CREATE TABLE foo(i int, j int, x int, y int, z int);

And do:

   SELECT i, SUM(j) FROM foo GROUP BY i;

At least from a logical standpoint, you might expect that we project
only the attributes we need from foo before feeding them into the
HashAgg. But that's not quite how postgres works. Instead, it leaves
the tuples intact (which, in this case, means they have 5 attributes)
until after aggregation and lazily fetches whatever attributes are
referenced. Tuples are spilled from the input, at which time they still
have 5 attributes; so naively copying them is wasteful.

I'm not sure how often this laziness is really a win in practice,
especially after the expression evaluation has changed so much in
recent releases. So it might be better to just project all the
attributes eagerly, and then none of this would be a problem. If we
still wanted to be lazy about attribute fetching, that should still be
possible even if we did a kind of "logical" projection of the tuple so
that the useless attributes would not be relevant. Regardless, that's
outside the scope of the patch I'm currently working on.

What I'd like to do is copy just the attributes needed into a new
virtual slot, leave the unneeded ones NULL, and then write it out to
the tuplestore as a MinimalTuple. I just need to be sure to get the
right attributes.

Regards,
    Jeff Davis





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: [HACKERS] Block level parallel vacuum
Следующее
От: Jeff Davis
Дата:
Сообщение: Re: Memory-Bounded Hash Aggregation