Re: Performance problems with DISTINCT ON

Поиск
Список
Период
Сортировка
От Sgarbossa Domenico
Тема Re: Performance problems with DISTINCT ON
Дата
Msg-id 0D9FA0704ED847C089451F8B79D1EE4C@laneniac.it
обсуждение исходный текст
Ответ на Performance problems with DISTINCT ON  ("Sgarbossa Domenico" <domenico.sgarbossa@eniac.it>)
Ответы Re: Performance problems with DISTINCT ON  (hubert depesz lubaczewski <depesz@depesz.com>)
Re: Performance problems with DISTINCT ON  (Scott Carey <scott@richrelevance.com>)
Список pgsql-performance
Subject: Re: [PERFORM] Performance problems with DISTINCT ON


> Sgarbossa Domenico wrote:
>> I need to retrieve the most recent prices per products from a price list
>> table:
>
>> select distinct on (articolo) articolo,data_ent,prezzo from
>> listini_anagrafici order by articolo, data_ent desc
>>
>> but it seems that this query runs slowly... about 5/6 seconds.
>> the table contains more or less 500K records, PostgreSQL version is
>> 8.1.11 and the server has 4gb of RAM entirely dedicate to the db.
>
>> 'Unique  (cost=73893.89..76551.25 rows=88312 width=24) (actual
>> time=4022.578..5076.206 rows=193820 loops=1)'
>> '  ->  Sort  (cost=73893.89..75222.57 rows=531472 width=24) (actual
>> time=4022.574..4505.538 rows=531472 loops=1)'
>> '        Sort Key: articolo, data_ent'
>> '        ->  Seq Scan on listini_anagrafici  (cost=0.00..16603.72
>> rows=531472 width=24) (actual time=0.009..671.797 rows=531472 loops=1)'
>> 'Total runtime: 5217.452 ms'
>
> You've got 531472 rows in the table and the query is going to output
> 193820 of them. Scanning the whole table is almost certainly the way to
> go.
>
> If the table doesn't change much, you could try running a CLUSTER on the
> index you've created. That will lock the table while it re-orders the
> physical layout of the rows based on your index though, so it's no good
> if the table is updated much.
>
> Failing that, you could try issuing "set work_mem = ..." before the
> query with increasing sizes for work_mem. That might make the sort
> faster too.
>

Thank you for the answer,
I've tried as you suggest but the only things that seems make some
differences is the work_mem parameter
This helps to reduce the amount of time about for the half (3 seconds) but
unfortunately this ain't enough.
If  there are a lot of concurrent request I think it could made the data
swap to the disk.
Should I try a different approach to solve this issue?





В списке pgsql-performance по дате отправления:

Предыдущее
От: Reydan Cankur
Дата:
Сообщение: Using OProfile
Следующее
От: hubert depesz lubaczewski
Дата:
Сообщение: Re: Performance problems with DISTINCT ON