Re: Pull up aggregate subquery

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Pull up aggregate subquery
Дата
Msg-id BANLkTimURp+3_LZuaOyP0GSMOeZ7qNNdbw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Pull up aggregate subquery  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Tue, May 24, 2011 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, I see.  I have a general gripe with nested loop plans: we already
>> consider too many of them.  IIRC, when I last fooled around with this,
>> the number of nested loop paths that we generate far exceeded the
>> number of merge or hash join paths, and most of those paths suck and
>> are a complete waste of time.
>
> Hm, really?  My experience is that it's the mergejoin paths that breed
> like rabbits, because there are so many potential sort orders.

*scratches head*

Well, I'm pretty sure that's how it looked when I was testing it.  I
wonder how this could be different for the two of us.  Or maybe one of
us is confused.  Admittedly, I haven't looked at it in a while.

>>> But I think this is all fairly unrelated to the case that Hitoshi is on
>>> about.  As you said earlier, it seems like we'd have to derive both
>>> parameterized and unparameterized plans for the subquery, which seems
>>> mighty expensive.
>
>> That was my first thought, too, but then I wondered if I was getting
>> cheap.
>
> Yeah, it's certainly possible that we're worrying too much.  Usually
> I only get concerned about added planner logic if it will impact the
> planning time for simple queries.  "Simple" tends to be in the eye of
> the beholder, but something with a complicated aggregate subquery is
> probably not simple by anyone's definition.
>
> In this case the sticky point is that there could be multiple possible
> sets of clauses available to be pushed down, depending on what you
> assume is the outer relation for the eventual upper-level nestloop.
> So worst case, you could have not just one parameterized plan to
> generate in addition to the regular kind, but 2^N of them ...

Hmm.  Well, 2^N is more than 2.  But I bet most of them are boring.
Judging by his followup email, Hitoshi Harada seems to think we can
just look at the case where we can parameterize on all of the grouping
columns.  The only other case that seems like it might be interesting
is parameterizing on any single one of the grouping columns.  I can't
get excited about pushing down arbitrary subsets.  Of course, even
O(N) in the number of grouping columns might be too much, but then we
could fall back to just all or nothing.  I think the "all" case by
itself would probably extract 90%+ of the benefit, especially since
"all" will often mean "the only one there is".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: 9.2 schedule
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: "errno" not set in case of "libm" functions (HPUX)