Re: Strategy for materialisation and centralisation of data

Поиск

Список

Период

Сортировка

От	Michael Stephenson
Тема	Re: Strategy for materialisation and centralisation of data
Дата	22 мая 2020 г. 00:06:48
Msg-id	CAHJZW0=zDHUhrKGAPYE-ekRT9FRmHfsASSnpHev+cC-5K9=bbQ@mail.gmail.com обсуждение исходный текст
Ответ на	Strategy for materialisation and centralisation of data (Rory Campbell-Lange <rory@campbell-lange.net>)
Ответы	Re: Strategy for materialisation and centralisation of data
Список	pgsql-general

Дерево обсуждения

You might find Materialize interesting:

https://materialize.io/

https://youtu.be/zWSdkGq1XWk

On Thu, May 21, 2020 at 10:36 AM Rory Campbell-Lange <rory@campbell-lange.net> wrote:

We have quite a few databases of type a and many of type b in a cluster.
Both a and b types are fairly complex and are different solutions to a
similar problem domain. All the databases are very read-centric, and all
database interaction is currently through plpgsql with no materialised
data.

Some organisations have several type a and many type b databases, and
need to query these in a homogeneous manner. We presently do this with
many middleware requests or pl/proxy. An a or b type database belongs to
0 or 1 organisations.

Making a and b generally the same would be a very big project.
Consequently I'm discussing materialising a subset of data in a common
format between the two database types and shipping that data to
organisation databases. This would have the benefit of providing a
common data interface and speeding up queries for all database types.
Users would have faster queries, and it would be a big time saver for
our development team, who presently have to deal with three quite
different data APIs.

Presently I've been thinking of using triggers or materialized views in
each database to materialise data into a "matview" schema which is then
shipped via logical replication to an organisation database when
required. New columns in the matview schema tables would ensure replica
identity uniqueness and allow the data to be safely stored in common
tables in the organisation database.

A few issues I foresee with this approach include:

* requiring two to three times current storage for materialisation
(the cluster is currently ~250GB)

* having to have many logical replication slots
(we sometimes suffer from pl/proxy connection storms)

Commentary gratefully received,
Rory

В списке pgsql-general по дате отправления:

Предыдущее

От: Zahir Lalani
Дата: 21 мая 2020 г., 23:58:32
Сообщение: RE: Should I use JSON?

Следующее

От: Rory Campbell-Lange
Дата: 22 мая 2020 г., 00:19:21
Сообщение: Re: Strategy for materialisation and centralisation of data

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Strategy for materialisation and centralisation of data

Предыдущее

Следующее