Re: Strategies/Best Practises Handling Large Tables

From: Chitra Creta
Subject: Re: Strategies/Best Practises Handling Large Tables
Date:
Msg-id: CABkVLeM8sYBuFOm6s94Pr1S8uwdTCkkJ+CqBG3s+ZfHkk2drLw@mail.gmail.com
In response to: Re: Strategies/Best Practises Handling Large Tables  (Chris Travers <chris.travers@gmail.com>)
Responses: Re: Strategies/Best Practises Handling Large Tables  (Igor Romanchenko <igor.a.romanchenko@gmail.com>)
List: pgsql-general
Thanks for your example, Chris. I will look into it as a long-term solution.

Partitioning tables as a strategy worked very well indeed. This will be my short/medium term solution. 
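
In case it helps anyone searching the archives, here is a rough sketch of the inheritance-plus-trigger style of partitioning I have in mind (the measurements table, logdate column, and date ranges are made up for illustration, not my actual schema):

-- Parent table; children hold the actual rows.
CREATE TABLE measurements (
    id      bigint NOT NULL,
    logdate date   NOT NULL,
    payload text
);

CREATE TABLE measurements_2012_10 (
    CHECK (logdate >= DATE '2012-10-01' AND logdate < DATE '2012-11-01')
) INHERITS (measurements);

CREATE TABLE measurements_2012_11 (
    CHECK (logdate >= DATE '2012-11-01' AND logdate < DATE '2012-12-01')
) INHERITS (measurements);

-- Route inserts on the parent to the appropriate child.
CREATE OR REPLACE FUNCTION measurements_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= DATE '2012-11-01' AND NEW.logdate < DATE '2012-12-01' THEN
        INSERT INTO measurements_2012_11 VALUES (NEW.*);
    ELSE
        INSERT INTO measurements_2012_10 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- the row has already been stored in a child table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurements_insert_trg
    BEFORE INSERT ON measurements
    FOR EACH ROW EXECUTE PROCEDURE measurements_insert();

With constraint_exclusion enabled, queries that filter on logdate skip the irrelevant children, and an old month can be dropped with a single DROP TABLE.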

Another strategy that I would like to evaluate as a short/medium term solution is archiving old records in a table before purging them.

I am aware that Oracle has a tool that allows records to be exported to a file or an archive table before they are purged. Oracle also provides a tool to re-import these records.

Does PostgreSQL have similar tools to export to a file and re-import? 

If PostgreSQL does not have a tool to do this, does anyone have any ideas on what file format (e.g. a text file whose header row holds the column names and whose remaining rows hold the records) would be ideal for easy re-importing into a PostgreSQL table?
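
One option I am considering is plain CSV via COPY; a rough sketch of what I have in mind (the orders table, created_at column, cutoff date, and file path are placeholders):

-- Export the rows to be archived to CSV with a header row.
COPY (SELECT * FROM orders WHERE created_at < DATE '2012-01-01')
  TO '/var/archive/orders_pre_2012.csv'
  WITH (FORMAT csv, HEADER true);
-- (psql's \copy variant does the same from the client side,
--  without needing server file access.)

-- Purge the archived rows once the export has been verified.
DELETE FROM orders WHERE created_at < DATE '2012-01-01';

-- Re-import later, into the original table or an archive table
-- with the same column layout.
COPY orders FROM '/var/archive/orders_pre_2012.csv'
  WITH (FORMAT csv, HEADER true);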

Thank you for your ideas.


On Mon, Oct 22, 2012 at 12:14 AM, Chris Travers <chris.travers@gmail.com> wrote:


On Sun, Oct 21, 2012 at 5:46 AM, Chitra Creta <chitracreta@gmail.com> wrote:
Thank you all.

Ryan, would you mind sharing your one-time function to move it?

Merlin, what are your suggestions to improve query performance?

Shaun, thank you. I will look into facts and dimensions should all else fail.

Chris, would you mind giving me an example of what you mean by your log, aggregate and snapshot approach? Also, with indexing, I believe composite and partial indexes are better than plain single-column indexes, am I correct? Do you have any recommendations as to which index type (e.g. btree, hash) is better for which situations?

Sure.  Suppose I have an accounting system.

I may record the amounts of the transactions in journal_entry and journal_line tables.   These are write-once, read-many.  However, over time you will end up having to digest millions of records (given sufficient volume) to find out the balance of a checking account, and this is not really ideal.

So to deal with this, I might, for example, add a table called account_checkpoint which might have the following fields:

account_id
end_date
debits
credits
balance

And then, at the closing of the books, I can snapshot the accumulated debits, credits, and balance to date.  If I need any of these numbers I can just grab the appropriate number from account_checkpoint and roll forward from end_date.  If I have too much volume I can close at a monthly level, or whatever interval suits the volume.
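
In SQL terms, roughly (the journal_line column names, the closing date, and the sign convention below are just for illustration):

CREATE TABLE account_checkpoint (
    account_id integer NOT NULL,
    end_date   date    NOT NULL,
    debits     numeric NOT NULL,
    credits    numeric NOT NULL,
    balance    numeric NOT NULL,
    PRIMARY KEY (account_id, end_date)
);

-- At closing, snapshot everything up to and including the closing date.
-- Balance here is debits minus credits; the sign convention depends on
-- the account type.
INSERT INTO account_checkpoint (account_id, end_date, debits, credits, balance)
SELECT account_id,
       DATE '2012-09-30',
       sum(debit),
       sum(credit),
       sum(debit) - sum(credit)
  FROM journal_line
 WHERE entry_date <= DATE '2012-09-30'
 GROUP BY account_id;

-- Current balance = latest checkpoint plus activity since its end_date.
SELECT c.balance + coalesce(sum(l.debit) - sum(l.credit), 0) AS balance
  FROM account_checkpoint c
  LEFT JOIN journal_line l
    ON l.account_id = c.account_id AND l.entry_date > c.end_date
 WHERE c.account_id = 42
   AND c.end_date = (SELECT max(end_date) FROM account_checkpoint
                      WHERE account_id = 42)
 GROUP BY c.balance;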

The checkpoint table contains sufficient information for me to start a report at any point and end it at any other point without having to scan the intervening checkpointed periods.  Additionally, if I want to purge old data, I can do so without losing current balance information.

So, in essence, this approach gives you a way to purge old rows without losing their aggregated information, and a way to skip portions of the table during aggregation that you could not otherwise skip.  The big caveat is that you cannot insert any records effective before the most recent checkpoint (and if this is in doubt, you need to enforce it with a trigger).
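
That trigger might look roughly like this (same illustrative column names as in the sketch above):

-- Reject journal lines dated on or before the latest checkpoint
-- for their account, since those periods are already closed.
CREATE FUNCTION enforce_checkpoint() RETURNS trigger AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM account_checkpoint
                WHERE account_id = NEW.account_id
                  AND end_date >= NEW.entry_date) THEN
        RAISE EXCEPTION 'cannot insert a journal line dated before the last checkpoint';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER journal_line_checkpoint
    BEFORE INSERT ON journal_line
    FOR EACH ROW EXECUTE PROCEDURE enforce_checkpoint();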

Best Wishes,
Chris Travers
