Re: Best way to handle multi-billion row read-only table?

Поиск
Список
Период
Сортировка
От Asher Hoskins
Тема Re: Best way to handle multi-billion row read-only table?
Дата
Msg-id 4B71D686.8040003@piceur.co.uk
обсуждение исходный текст
Ответ на Re: Best way to handle multi-billion row read-only table?  (Justin Graf <justin@magwerks.com>)
Ответы Re: Best way to handle multi-billion row read-only table?  (John R Pierce <pierce@hogranch.com>)
Re: Best way to handle multi-billion row read-only table?  (Justin Graf <justin@magwerks.com>)
Список pgsql-general
Justin Graf wrote:
> Well first is that 200hz  meaning 200 samples per channel per second.
> That is very fast sampling for pressure sensor,  I would be surprised if
> the meters  are actually giving real results at that rate.  I would look
> at reducing that down to what the meter is actual capable of sending
> What kind of AD card is being used as this effects what makes sense to
> record.

Yes, we really are measuring at 200 samples per second. We're trying to
capture high resolution images of pressure transients as they move along
water distribution pipelines (the backbones of the water network,
typically 4'-6' in diameter, carrying 500-1000 litres/second) to
understand how they travel and what stress they put upon the pipe. We're
using custom data loggers at the moment based around Intel iMote2 Linux
systems with a high-speed QuickFilter ADC (and the sensors we're using
can cope at 200Hz).

> I would look into table partitioning
> http://www.postgresql.org/docs/current/static/ddl-partitioning.html
> http://wiki.postgresql.org/wiki/Table_partitioning

Thanks for that, it looks like partitioning is the way to go. I'm
assuming that I should try and keep my total_relation_sizes less than
the memory size of the machine?

> A one big index for such a small record will not be a big win because
> the index are going to be the same size as table.
> Look into limiting the number of records each index covers.
> http://www.postgresql.org/docs/8.4/static/sql-createindex.html

If I partition so that each partition holds data for a single channel
(and set a CHECK constraint for this) then I can presumably remove the
channel from the index since constraint exclusion will mean that only
partitions holding the channel I'm interested in will be searched in a
query. Given that within a partition all of my sample_time's will be
different do you know if there's a more efficient way to index these?


Many thanks,
Asher


В списке pgsql-general по дате отправления:

Предыдущее
От: Scott Marlowe
Дата:
Сообщение: Re: vacuumdb ERROR: out of memory
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: Best way to handle multi-billion row read-only table?