Re: Huge Data sets, simple queries

From: Tom Lane
Subject: Re: Huge Data sets, simple queries
Date:
Msg-id: 18415.1138469820@sss.pgh.pa.us
In reply to: Re: Huge Data sets, simple queries  ("Jeffrey W. Baker")
Responses: Re: Huge Data sets, simple queries  (Tom Lane)
List: pgsql-performance

"Jeffrey W. Baker" writes:
> On Sat, 2006-01-28 at 10:55 -0500, Tom Lane wrote:
>> Assuming that "month" means what it sounds like, the above would result
>> in running twelve parallel sort/uniq operations, one for each month
>> grouping, to eliminate duplicates before counting.  You've got sortmem
>> set high enough to blow out RAM in that scenario ...

> Hrmm, why is it that with a similar query I get a far simpler plan than
> you describe, and relatively snappy runtime?

You can't see the sort operations in the plan, because they're invoked
implicitly by the GroupAggregate node.  But they're there.

Also, a plan involving GroupAggregate is going to run the "distinct"
sorts sequentially, because it's dealing with only one grouping value at
a time.  In the original case, the planner probably realizes there are
only 12 groups and therefore prefers a HashAggregate, which will try
to run all the sorts in parallel.  Your "group by date" isn't a good
approximation of the original conditions because there will be a lot
more groups.
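
As an illustration of the difference described above, here is a toy Python model of the two strategies. It is only a sketch of the reasoning in this message, not PostgreSQL's executor code: the function names, the (group, value) row layout, and the use of "buffered values" as a stand-in for sort memory are all assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def count_distinct_group_at_a_time(sorted_rows):
    """Toy model of the GroupAggregate-style behaviour described above.

    sorted_rows must already be ordered by group key. Only one group's
    values are buffered and sorted at any moment.
    Returns (distinct count per group, peak number of buffered values).
    """
    counts = {}
    peak = 0
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        values = sorted(v for _, v in group)  # the implicit per-group sort
        peak = max(peak, len(values))
        # "uniq" step: count positions where the sorted value changes
        counts[key] = sum(
            1 for i, v in enumerate(values) if i == 0 or v != values[i - 1]
        )
    return counts, peak


def count_distinct_all_groups_at_once(rows):
    """Toy model of the hash-style behaviour described above.

    Input order does not matter, so every group's buffer stays open
    until the input is exhausted.
    Returns (distinct count per group, peak number of buffered values).
    """
    buffers = {}
    for key, value in rows:
        buffers.setdefault(key, []).append(value)
    peak = sum(len(b) for b in buffers.values())  # all buffers live together
    counts = {k: len(set(b)) for k, b in buffers.items()}
    return counts, peak
```

Both functions return the same counts; what differs is the peak. The group-at-a-time version never holds more than the largest single group, while the all-at-once version holds every group's values simultaneously.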

(We might need to tweak the planner to discourage selecting
HashAggregate in the presence of DISTINCT aggregates --- I don't
remember whether it accounts for the sortmem usage in deciding
whether the hash will fit in memory or not ...)
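
The memory concern can be put as rough arithmetic. The sketch below is a hypothetical back-of-the-envelope helper, not the planner's actual costing: the function name, its parameters, and the sample figures are assumptions, and it simply supposes that each concurrently running sort may grow to the full sortmem allowance.

```python
def worst_case_sort_memory_kb(n_groups, sort_mem_kb, concurrent):
    """Hypothetical upper bound on memory used by per-group DISTINCT sorts.

    n_groups    -- number of grouping values (e.g. 12 for months)
    sort_mem_kb -- per-sort memory allowance in kB (assumed fully used)
    concurrent  -- True if all groups' sorts are open at the same time,
                   False if groups are processed one after another
    """
    active_sorts = n_groups if concurrent else 1
    return active_sorts * sort_mem_kb


# Sample figures are made up for illustration only.
one_at_a_time = worst_case_sort_memory_kb(12, 1_000_000, concurrent=False)
all_at_once = worst_case_sort_memory_kb(12, 1_000_000, concurrent=True)
```

Under these assumed numbers the concurrent case needs twelve times the allowance of the sequential case, which is the kind of multiplication a fits-in-memory check would have to include.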

            regards, tom lane

