Re: Optimising a query

Поиск
Список
Период
Сортировка
От Richard Huxton
Тема Re: Optimising a query
Дата
Msg-id 4768DAC0.4010304@archonet.com
обсуждение исходный текст
Ответ на Re: Optimising a query  (Paul Lambert <paul.lambert@reynolds.com.au>)
Ответы Re: Optimising a query
Re: Optimising a query
Список pgsql-performance
Paul Lambert wrote:
> Paul Lambert wrote:
>> <snip>
>
>
> This part of the query alone takes a significant part of the time:
>
> SELECT DISTINCT ON (finbalance.dealer_id, finbalance.year_id,
> finbalance.subledger_id, finbalance.account_id)
>             finbalance.year_id AS year,
>             finbalance.dealer_id AS dealer_id,
>             lpad(finbalance.subledger_id::text,4,'0') AS subledger,
>             lpad(finbalance.account_id::text,4,'0') AS account
>         FROM finbalance
>
> Runs with a query plan of :
>
> "Unique  (cost=30197.98..32782.33 rows=20675 width=16) (actual
> time=5949.695..7197.475 rows=17227 loops=1)"
> "  ->  Sort  (cost=30197.98..30714.85 rows=206748 width=16) (actual
> time=5949.691..7018.931 rows=206748 loops=1)"
> "        Sort Key: dealer_id, year_id, subledger_id, account_id"
> "        Sort Method:  external merge  Disk: 8880kB"
> "        ->  Seq Scan on finbalance  (cost=0.00..8409.70 rows=206748
> width=16) (actual time=0.042..617.949 rows=206748 loops=1)"
> "Total runtime: 7210.966 ms"
>
>
> So basically selecting from the finbalance table (approx. 206,000
> records) takes 10 seconds, even longer without the distinct clause in
> there - the distinct collapses the result-set down to around 17,000 rows.

Well, if you need to summarise all the rows then that plan is as good as
any.

If you run this query very frequently, you'll probably want to look into
keeping a summary table updated via triggers.

Before that though, try issuing a "SET work_mem = '9MB'" before running
your query. If that doesn't change the plan step up gradually. You
should be able to get the sort stage to happen in RAM rather than on
disk (see "Sort Method" above). Don't go overboard though, your big
query will probably use multiples of that value.

> Taking out the two lpad's in there knocks off about 1500ms, so I can
> come up with something else for them - but I'd like to get the query as
> a whole down to under a second.

Stick the lpads in a query that wraps your DISTINCT query.

> dealer_id, year_id, subledger_id and account_id are all part of the
> primary key on the finbalance table, so I don't think I can index them
> down any further.

A CLUSTER <pkey-index> ON <table> might help, but it will degrade as you
update the finbalance table.

--
   Richard Huxton
   Archonet Ltd

В списке pgsql-performance по дате отправления:

Предыдущее
От: Paul Lambert
Дата:
Сообщение: Re: Optimising a query
Следующее
От: Gregory Stark
Дата:
Сообщение: Re: Optimising a query