Re: Update with Subquery Performance

Поиск

Список

Период

Сортировка

От	Scott Marlowe
Тема	Re: Update with Subquery Performance
Дата	11 февраля 2008 г. 20:29:48
Msg-id	dcc563d10802111329y12739774ld128a67fe2123742@mail.gmail.com обсуждение исходный текст
Ответ на	Update with Subquery Performance ("Linux Guru" <linux.binary@gmail.com>)
Ответы	Re: Update with Subquery Performance ("Linux Guru" <linux.binary@gmail.com>)
Список	pgsql-performance

Дерево обсуждения

On Feb 11, 2008 5:06 AM, Linux Guru <linux.binary@gmail.com> wrote:
> We have a large datawarehouse stored in postgres and temp tables are created
> based on user query. The process of temp table creation involves selecting
> data from main fact table, this includes several select and update
> statements and one of the following update statement is having performance
> issues.
>
> The newly temp table created for this scenario contains 22712 rows. Here is
> the query
>
> alter table dummy add column gp numeric(40,15);
> update dummy set gp=(select (
> case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end  )
> from dummy as temp
>  where temp.product=dummy.product)

Is this supposed to be updating every single row with one value?
Cause I'm guessing it's running that sub select over and over instead
of one time.  I'm guessing that with more work_mem the planner might
use a more efficient plan.  Try adding

analyze;
set work_mem = 128000;
 between the alter and update and see if that helps.

Also, as Tom said, post explain analyze output of the statement.


>
> Now this query basically updates a table using values within itself in the
> subquery but it takes tooooo much time i.e. approx 5 mins. The whole temp
> table creation process is stucked in this query (there are 4 additional such
> updates with same problem). Index creation is useless here since its only a
> one time process.
>
> Here is the strip down version (the part making performance issue) of above
> query i.e. only select statement
> -------------------------------
> select (case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else
> 0 end  )   from dummy as temp, dummy as temp2
>  where temp.product=temp2.product group by temp.product
>
> "HashAggregate  (cost=1652480.98..1652481.96 rows=39 width=39)"
> "  ->  Hash Join  (cost=1636.07..939023.13 rows=71345785 width=39)"
>  "        Hash Cond: (("temp".product)::text = (temp2.product)::text)"
> "        ->  Seq Scan on dummy "temp"  (cost=0.00..1311.03 rows=26003
> width=39)"
> "        ->  Hash  (cost=1311.03..1311.03 rows=26003 width=21)"
>  "              ->  Seq Scan on dummy temp2  (cost=0.00..1311.03 rows=26003
> width=21)"
> -------------------------------
>
>
> Whats the solution of this problem, or any alternate way to write this
> query?
>
>
>

В списке pgsql-performance по дате отправления:

Предыдущее

От: salman
Дата: 11 февраля 2008 г., 19:03:56
Сообщение: Question about CLUSTER

Следующее

От: "Scott Marlowe"
Дата: 11 февраля 2008 г., 20:33:42
Сообщение: Re: Question about CLUSTER

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Update with Subquery Performance

Предыдущее

Следующее