Multi master disjoint cluster

Поиск
Список
Период
Сортировка
От Vijaykumar Jain
Тема Multi master disjoint cluster
Дата
Msg-id CAM+6J95-vXzF6v55cSXaKMVBrCdAc3RVqKEb1Aau+6KZfc8Hqw@mail.gmail.com
обсуждение исходный текст
Ответы Re: Multi master disjoint cluster  (Vijaykumar Jain <vijaykumarjain.github@gmail.com>)
Список pgsql-general
Hi all,

tl;dr
I have a simple question,
given a choice if I can write the same data to two databases in parallel, should I opt for primary / replica setup or multi writer/master setup. This setup has the ability to make use of kafka consumer groups (like two replication slots each having their own lsn offsets)  to write to both db node pairs in parallel via the application layer.

The churn of data is really high, there is a lot of wals generated, around 500gb/hr.

If I go with primary/replica, (lr not feasible)

I need to ensure both are on the same major version. Upgrades are tricky (we don't have qa) so we just have option to schema dump for upgrade compatibility. Data, we trust postgresql for that :). (I wish we had zfs everywhere but no )

Any excl table blocking operations, (although with later versions there are very less blocking operations) can impact queries on replica as well (excluding delay settings).

Corruption can cause downtime (we have tons of them and raids to protect them) so if replica is having issues, we can zero the pages on the replica and do some operations if we isolate the problem pages, else resync the replica from primary. But if primary is having some issues, we copy data from replica to disk and copy in to primary after truncating etc. Some downtime but not a lot. (I am not expert at data recovery) and mostly rely on amcheck, dd, and raid checks.

We don't use pitr (too many wals × 58) or delayed replication as we can't afford more servers.

ddl deploys are guaranteed by replication. So no need to try 2pc like stuff at app layer. (Although apps use deploy tools to ensure eventually the ddls are consistent and idempotent)

Basically primary/replica relieves the app to think what is there on the primary is also on the replica eventually, so there can be source of truth.

But with multi writers, any app mishandling like bug in catching exception etc can result in diversion and no more mirrored setup.
We need to have checks/reconciliation to ensure both write nodes in pair have almost similar data at the end of the day so we can trust this setup independent of any app mistakes.

But if app layer gets robust, we have almost no downtime in reads and writes, we can have both nodes on different versions, (w/o logical replication) can query both nodes real time, no real replication lag issues , conflicts etc, can upgrade like blue green, canary test some changes on one if needed etc.

Am I making sense at all? Or I am sounding confused, and I don't know the difference between primary/replica vs multi writer. This is not bdr like thing, they don't really need each other unless we are into some recovery.

My point is, we have 58 such primary/replica shards (each 10tb+) (consistent hashing at app layer, no fdw)  and there is no scope of downtime for reads, so any issue like post upgrade performance degradation (if any) gives me chills. and we have no qa to test real data.

There are too many dimensions to shard on and aggregations need to run across the shards (Yes there is no scope of data isolation).

В списке pgsql-general по дате отправления:

Предыдущее
От: Laurenz Albe
Дата:
Сообщение: Re: How to know how much CPU, RAM is used by existing 1 database
Следующее
От: Vijaykumar Jain
Дата:
Сообщение: Re: Multi master disjoint cluster