Re: Best way to handle multi-billion row read-only table?

From:	John R Pierce
Subject:	Re: Best way to handle multi-billion row read-only table?
Date:
Msg-id:	4B71DCC9.8000402@hogranch.com
In reply to:	Re: Best way to handle multi-billion row read-only table?  (Asher Hoskins <asher@piceur.co.uk>)
Responses:	Re: Best way to handle multi-billion row read-only table?  (Asher <asher@piceur.co.uk>)
List:	pgsql-general
Asher Hoskins wrote:
> If I partition so that each partition holds data for a single channel
> (and set a CHECK constraint for this) then I can presumably remove the
> channel from the index since constraint exclusion will mean that only
> partitions holding the channel I'm interested in will be searched in a
> query. Given that within a partition all of my sample_times will be
> different, do you know if there's a more efficient way to index these?

how do you plan on accessing this monster data set?  do you expect to be
looking up single values, or small sets of values at specific times?

seems to me like this is the sort of data that's more often processed in
the aggregate, like running a Fourier analysis over sliding windows, and
that sort of processing may well be more efficiently done with
fixed-block binary files rather than a relational database, as there are
no real relationships in this data.

for instance, a directory for each sensor, with a directory for each
week, and a file for each hour, containing the hour's worth of samples in
fixed binary blocks after a file header identifying it.  you can randomly
access the sample block for a specific time by fseek()ing to
(sampletime - starttimeofblock) * blocksize + headersize, or whatever.
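as a minimal sketch of that arithmetic in C (the header length, block
size, and one-block-per-second rate here are illustrative assumptions,
not anything specified above):

    #include <stdio.h>

    #define HEADERSIZE 16L  /* assumed length of the file header, in bytes */
    #define BLOCKSIZE   8L  /* assumed bytes per sample block, one per second */

    /* read the sample block recorded at unix time sampletime from an hour
       file whose first block was recorded at starttimeofblock */
    int read_block(FILE *f, long sampletime, long starttimeofblock, void *buf)
    {
        long offset = (sampletime - starttimeofblock) * BLOCKSIZE + HEADERSIZE;

        if (fseek(f, offset, SEEK_SET) != 0)
            return -1;
        return fread(buf, (size_t)BLOCKSIZE, 1, f) == 1 ? 0 : -1;
    }

reading a window of n consecutive samples is then a single fseek() plus
one fread() of n * BLOCKSIZE bytes, which is hard to beat with per-row
index lookups.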

