Re: efficient data reduction (and deduping)

Поиск
Список
Период
Сортировка
От Alessandro Gagliardi
Тема Re: efficient data reduction (and deduping)
Дата
Msg-id CAAB3BBKS2ncPZROej1OtE_k-DAEsRpJ2JP8w7zhoVJ8+i3EfDA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: efficient data reduction (and deduping)  (Peter van Hardenberg <pvh@pvh.ca>)
Список pgsql-performance
I was thinking of adding an index, but thought it would be pointless since I would only be using the index once before dropping the table (after its loaded into hourly_activity). I assumed it would take longer to create the index and then use it than to just seq scan once or twice. Am I wrong in that assumption?

On Thu, Mar 1, 2012 at 10:40 AM, Peter van Hardenberg <pvh@pvh.ca> wrote:
On Thu, Mar 1, 2012 at 10:27 AM, Alessandro Gagliardi
<alessandro@path.com> wrote:
> Now, I want to reduce that data to get the last activity that was performed
> by each user in any given hour. It should fit into a table like this:
>

How about:

1) Create an expression based index on date_trunc('hour', hr_timestamp)
2) Create a view on that showing the last value
3) If you want to throw away the data use CREATE TABLE AS on the
results of the view.

You may also want to investigate window functions.

-p

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt." -- Kurt Vonnegut

В списке pgsql-performance по дате отправления:

Предыдущее
От: Kääriäinen Anssi
Дата:
Сообщение: Re: Large insert and delete batches
Следующее
От: Alessandro Gagliardi
Дата:
Сообщение: Re: efficient data reduction (and deduping)