Re: hash aggregation

Поиск
Список
Период
Сортировка
От Korisk
Тема Re: hash aggregation
Дата
Msg-id 323831349928808@web5g.yandex.ru
обсуждение исходный текст
Ответ на Re: hash aggregation  (Sergey Konoplev <gray.ru@gmail.com>)
Ответы Re: hash aggregation  (Craig Ringer <ringerc@ringerc.id.au>)
Re: hash aggregation  (Sergey Konoplev <gray.ru@gmail.com>)
Список pgsql-performance
Thanx for the advice, but increment table is not acceptable because it should be a plenty of them.
Nevertheless in the investigations was achieved some progress (7.4 sec vs 19.6 sec).
But using IOS scan you can see that there is an abnormal cost calculations it make me suspicious  of little bugs.

Thanks for your answer.


hashes=# \d hashcheck;
                              Table "public.hashcheck"
 Column |       Type        |                       Modifiers
--------+-------------------+--------------------------------------------------------
 id     | integer           | not null default nextval('hashcheck_id_seq'::regclass)
 name   | character varying |
 value  | character varying |
Indexes:
    "hashcheck_name_idx" btree (name)

hashes=# vacuum hashcheck;
VACUUM
hashes=# set random_page_cost=0.1;
SET
hashes=# set seq_page_cost=0.1;
SET

hashes=# explain analyse verbose select name, count(name) as cnt from  hashcheck group by name order by name desc;
                                                                 QUERY PLAN
                  

---------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=407366.72..407367.22 rows=200 width=32) (actual time=10712.505..10712.765 rows=4001 loops=1)
   Output: name, (count(name))
   Sort Key: hashcheck.name
   Sort Method: quicksort  Memory: 315kB
   ->  HashAggregate  (cost=407357.08..407359.08 rows=200 width=32) (actual time=10702.285..10703.054 rows=4001
loops=1)
         Output: name, count(name)
         ->  Seq Scan on public.hashcheck  (cost=0.00..277423.12 rows=25986792 width=32) (actual time=0.054..2877.100
rows=25990002loops=1) 
               Output: id, name, value
 Total runtime: 10712.989 ms
(9 rows)

hashes=#  set enable_seqscan = off;
SET
hashes=# explain analyse verbose select name, count(name) as cnt from  hashcheck group by name order by name desc;
                                                                                          QUERY PLAN
                  

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=10000000000.00..10000528610.88 rows=200 width=32) (actual time=0.116..7452.005 rows=4001
loops=1)
   Output: name, count(name)
   ->  Index Only Scan Backward using hashcheck_name_idx on public.hashcheck  (cost=10000000000.00..10000398674.92
rows=25986792width=32) 
 (actual time=0.104..3785.767 rows=25990002 loops=1)
         Output: name
         Heap Fetches: 0
 Total runtime: 7452.509 ms
(6 rows)

Благодаря шаманствам на:
http://www.sql.ru/forum/actualthread.aspx?tid=974484

11.10.2012, 01:30, "Sergey Konoplev" <gray.ru@gmail.com>:
> On Wed, Oct 10, 2012 at 9:09 AM, Korisk <Korisk@yandex.ru> wrote:
>
>>  Hello! Is it possible to speed up the plan?
>>   Sort  (cost=573977.88..573978.38 rows=200 width=32) (actual time=10351.280..10351.551 rows=4000 loops=1)
>>     Output: name, (count(name))
>>     Sort Key: hashcheck.name
>>     Sort Method: quicksort  Memory: 315kB
>>     ->  HashAggregate  (cost=573968.24..573970.24 rows=200 width=32) (actual time=10340.507..10341.288 rows=4000
loops=1)
>>           Output: name, count(name)
>>           ->  Seq Scan on public.hashcheck  (cost=0.00..447669.16 rows=25259816 width=32) (actual
time=0.019..2798.058rows=25259817 loops=1) 
>>                 Output: id, name, value
>>   Total runtime: 10351.989 ms
>
> AFAIU there are no query optimization solution for this.
>
> It may be worth to create a table hashcheck_stat (name, cnt) and
> increment/decrement the cnt values with triggers if you need to get
> counts fast.
>
> --
> Sergey Konoplev
>
> a database and software architect
> http://www.linkedin.com/in/grayhemp
>
> Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984


В списке pgsql-performance по дате отправления:

Предыдущее
От: Claudio Freire
Дата:
Сообщение: Re: shared_buffers/effective_cache_size on 96GB server
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: hash aggregation