Re: swap storm created by 8.2.3

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: swap storm created by 8.2.3
Дата
Msg-id 46573327.3070803@archonet.com
обсуждение исходный текст
Ответ на Re: swap storm created by 8.2.3  (Joseph Shraibman <jks@selectacast.net>)
Ответы Re: swap storm created by 8.2.3  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Joseph Shraibman wrote:
>
>
> Richard Huxton wrote:
>> Joseph Shraibman wrote:
>>>>> I ran a query that was "SELECT field, count(*) INTO TEMP temptable"
>>>>> and it grew to be 10gig (as reported by top)
>>>>
>>>> What was the real query?
>>>
>>> First I selected 90634 rows (3 ints) into the first temp table, then
>>> I did "select intfield1, count(intfield2) FROM realtable rt WHERE
>>> rt.id = temptable.id and other conditions on rt here GROUP BY
>>> intfield1".  The size of the second temp table should have been no
>>> more than 60000 rows.
>>
> <SNIP>
>>
>> If your memory settings in postgresql.conf are reasonable (and they
>> look fine), this shouldn't happen. Let's see if an EXPLAIN sheds any
>> light.
>>
> => explain SELECT ml.uid, count(ml.jid) AS cnt INTO TEMP tempml FROM  ml
> WHERE ml.jid = tempjr1180108653561.id  AND ml.status IN(2,5,20) GROUP BY
> ml.uid;

> NOTICE:  adding missing FROM-clause entry for table "tempjr1180108653561"

I'm guessing this is just a typo from your test and you'd normally
mention the temp-table.

> LINE 2: ...INTO TEMP tempml FROM ml WHERE ml.jid = tempjr1180...
>                                                              ^
>                                         QUERY PLAN
> ------------------------------------------------------------------------------------------
>
>  HashAggregate  (cost=11960837.72..11967601.06 rows=541067 width=8)
>    ->  Hash Join  (cost=9675074.94..11849780.55 rows=22211434 width=8)

Here you seem to have 22 million rows estimated for your join.

>          Hash Cond: (tempjr1180108653561.id = ml.jid)
>          ->  Seq Scan on tempjr1180108653561  (cost=0.00..31.40
> rows=2140 width=4)

Is the 2140 rows here a good estimate?

>          ->  Hash  (cost=6511767.18..6511767.18 rows=181979021 width=8)
>                ->  Seq Scan on ml  (cost=0.00..6511767.18 rows=181979021
> width=8)

OK, so the 22 million matches is because "ml" has 181 million rows. Is
that right too?

>                      Filter: (status = ANY ('{2,5,20}'::integer[]))

Overall it's estimating about 9 times the number of rows you were
expecting (541000 vs 60000). Not enough to account for your extreme
memory usage.

Let's see if that hash-join is really the culprit. Can you run EXPLAIN
and then EXPLAIN ANALYSE on the query, but first issue:

SET enable_hashjoin=off;

If that make little difference, try the same with enable_hashagg.

--
   Richard Huxton
   Archonet Ltd

В списке pgsql-general по дате отправления:

Предыдущее
От: Ron Johnson
Дата:
Сообщение: Re: 2 instance of postgres service running against same db files?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: swap storm created by 8.2.3