Re: Best way to handle multi-billion row read-only table?

From:	John R Pierce
Subject:	Re: Best way to handle multi-billion row read-only table?
Date:
Msg-id:	4B71DCC9.8000402@hogranch.com
In reply to:	Re: Best way to handle multi-billion row read-only table?  (Asher Hoskins <asher@piceur.co.uk>)
Responses:	Re: Best way to handle multi-billion row read-only table?  (Asher <asher@piceur.co.uk>)
List:	pgsql-general
Asher Hoskins wrote:
> If I partition so that each partition holds data for a single channel
> (and set a CHECK constraint for this) then I can presumably remove the
> channel from the index since constraint exclusion will mean that only
> partitions holding the channel I'm interested in will be searched in a
> query. Given that within a partition all of my sample_times will be
> different, do you know if there's a more efficient way to index these?

how do you plan on accessing this monster data set?  do you expect to be
looking up single values, or small sets of values at specific times?

seems to me like this is the sort of data that's more often processed in
the aggregate, like running a Fourier analysis over sliding windows, and
that sort of processing may well be more efficiently done with
fixed-block binary files rather than a relational database, as there are
no real relationships in this data.

for instance, a directory for each sensor, with a directory for each
week, and a file for each hour, containing the hour's worth of samples in
fixed binary blocks after a file header identifying it.  you can randomly
access the sample block for a specific time by fseek()ing to
(sampletime - starttimeofblock) * blocksize + headersize, or whatever.
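as a minimal sketch of that arithmetic in C (the header length, block
size, and one-block-per-second rate here are illustrative assumptions,
not anything specified above):

    #include <stdio.h>

    #define HEADERSIZE 16L  /* assumed length of the file header, in bytes */
    #define BLOCKSIZE   8L  /* assumed bytes per sample block, one per second */

    /* read the sample block recorded at unix time sampletime from an hour
       file whose first block was recorded at starttimeofblock */
    int read_block(FILE *f, long sampletime, long starttimeofblock, void *buf)
    {
        long offset = (sampletime - starttimeofblock) * BLOCKSIZE + HEADERSIZE;

        if (fseek(f, offset, SEEK_SET) != 0)
            return -1;
        return fread(buf, (size_t)BLOCKSIZE, 1, f) == 1 ? 0 : -1;
    }

reading a window of n consecutive samples is then a single fseek() plus
one fread() of n * BLOCKSIZE bytes, which is hard to beat with per-row
index lookups.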

