Design of live and archive database

Поиск

Список

Период

Сортировка

От	Genc, Ömer
Тема	Design of live and archive database
Дата	9 июля 2015 г. 11:52:59
Msg-id	3eff99123d744043a47cd590f1a80d1c@e2k13-ms1.iais.fraunhofer.de обсуждение исходный текст
Список	pgsql-cluster-hackers

Дерево обсуждения

Hey there,

i am pretty new to postgres, so please bear with me.

Currently I am working on a project which serializes data into a postgres database.

This is a real-time streaming data visualization system with a high frequency of incoming data, which need to be stored in the database.

You can think about continuously 1000 inserts per second. This data needs to be stored for ever, since it should be possible

to restore the system to any state is has ever been in at any time.

However during runtime only the data for the past hour is frequently requested,

which is not very much compared to the overall data size.

As you can image, the database becomes very large in size over time and queries are getting slower and slower. I already

implemented partitioning using time-based approach for corresponding tables. This approach worked and were satisfactory at a first glance.

However since this is an always running system, there is very much data (in form of old partition tables) laying around cluttering the database.

Now I decided to take a different approach. I am trying to separate between two database clusters, one live cluster and one archive cluster.

The live cluster will only maintain the last hour of data, the archive cluster will keep all the data of all time.

Is there anyone who ever implemented a similar database architecture? Can anyone give me an advice, if this is a good approach?

And if so, what would be the best way to implement this?

The approach Im going to implement is:

Currently I have one application server which handles all queries against the database. Now I thought about implementing a multiplexer, which instead of inserting into on database also inserts into the second archive database. While triggers on the live system would remove data which is older than on hour, the archive database would partition data on a monthly approach.

I simply don’t want to miss an obvious solution to this, so any advice would be greatly appreciated.

Cheers,

Oemer

В списке pgsql-cluster-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Design of live and archive database