Re: Best way to handle multi-billion row read-only table?

Поиск

Список

Период

Сортировка

От	Greg Smith
Тема	Re: Best way to handle multi-billion row read-only table?
Дата	10 февраля 2010 г. 05:51:28
Msg-id	4B725760.8070905@2ndquadrant.com обсуждение исходный текст
Ответ на	Best way to handle multi-billion row read-only table? (Asher <asher@piceur.co.uk>)
Ответы	Re: Best way to handle multi-billion row read-only table? (Scott Marlowe <scott.marlowe@gmail.com>)
Список	pgsql-general

Дерево обсуждения

Asher wrote:
> Once loaded into the database the data will never be deleted or
> modified and will typically be accessed over a particular date range
> for a particular channel (e.g. "sample_time >= X AND sample_time <= Y
> AND channel=Z"). A typical query won't return more than a few million
> rows and speed is not desperately important (as long as the time is
> measured in minutes rather than hours).
>
> Are there any recommended ways to organise this? Should I partition my
> big table into multiple smaller ones which will always fit in memory
> (this would result in several hundreds or thousands of sub-tables)?
> Are there any ways to keep the index size to a minimum? At the moment
> I have a few weeks of data, about 180GB, loaded into a single table
> and indexed on sample_time and channel and the index takes up 180GB too.

One approach to consider is partitioning by sample_time and not even
including the channel number in the index.  You've got tiny records;
there's going to be hundreds of channels of data on each data page
pulled in, right?  Why not minimize physical I/O by reducing the index
size, just read that whole section of time in to memory (they should be
pretty closely clustered and therefore mostly sequential I/O), and then
push the filtering by channel onto the CPU instead.  If you've got
billions of rows, you're going to end up disk bound anyway; minimizing
physical I/O and random seeking around at the expense of CPU time could
be a big win.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com

В списке pgsql-general по дате отправления:

Предыдущее

От: AI Rumman
Дата: 10 февраля 2010 г., 05:38:16
Сообщение: Re: more than 2GB data string save

Следующее

От: Pavel Stehule
Дата: 10 февраля 2010 г., 05:52:13
Сообщение: Re: more than 2GB data string save

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Best way to handle multi-billion row read-only table?

Предыдущее

Следующее