[HACKERS] new high availability feature for the system with both asynchronous and synchronous replication

From Higuchi, Daisuke
Subject [HACKERS] new high availability feature for the system with both asynchronous and synchronous replication
Msg-id 1803D792815FC24D871C00D17AE95905ACEF6A@g01jpexmbkw24
Replies [HACKERS] Re: new high availability feature for the system with both asynchronous and synchronous replication  ("Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com>)
List pgsql-hackers
Hi all,

I propose a new feature for high availability. 

This feature is effective for the following configuration:
1. The primary and the synchronous standby are in the same center, called the
   main center.
2. An asynchronous standby is in another center, called the backup center.
   (The backup center is located far away from the main center. If the
   replication mode were synchronous, performance would deteriorate, so this
   replication must be asynchronous.)
3. Asynchronous replication is performed within the backup center too.
4. When the primary in the main center stops abnormally, the standby in the
   main center is promoted, and the standby in the backup center connects to
   the new primary.

This configuration is also shown in the figure below. 
               [Main center]
|--------------------------------------------|
| |----------|  synchronous     |----------| |
| |          |    replication   |          | |
| | primary  | <--------------> | standby1 | |
| |----------|                  |----------| |
|----||--------------------------------------|
     ||
     || asynchronous
     ||   replication
     ||
     ||
               [Backup center]
|----||--------------------------------------|
| |----------|  asynchronous    |----------| |
| |          |    replication   |          | |
| | standby2 | <--------------> | standby3 | |
| |----------|                  |----------| |
|--------------------------------------------|

When the load in the main center becomes high, WAL may reach the standby in
the backup center but, for various reasons, not yet reach the synchronous
standby in the main center. In other words, the standby in the backup center
may advance beyond the synchronous standby in the main center.
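
To make "advance beyond" concrete, here is a minimal sketch that compares WAL
positions (LSNs). It relies on the textual LSN format, two hexadecimal numbers
separated by a slash. The helper names and the sample positions are
hypothetical, and the check is simplified: it only asks whether one standby
holds WAL past the position of the standby that gets promoted.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN in 'XXXXXXXX/YYYYYYYY' hex form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def has_advanced_beyond(standby_lsn: str, promoted_lsn: str) -> bool:
    """True if the standby holds WAL past the promoted standby's position."""
    return parse_lsn(standby_lsn) > parse_lsn(promoted_lsn)


# Hypothetical positions at the moment the primary stops.
standby1_lsn = "0/3000060"  # synchronous standby in the main center
standby2_lsn = "0/3000148"  # asynchronous standby in the backup center

# standby2 is ahead of standby1, so it cannot simply follow standby1
# after standby1 is promoted.
print(has_advanced_beyond(standby2_lsn, standby1_lsn))
```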

When the primary stops abnormally and the standby in the main center is
promoted, the two standbys in the backup center must be recovered with
pg_rewind. However, the new primary has to be stopped for pg_rewind. If
pg_basebackup is used instead, recovering the backup center takes some time.
Neither option provides high availability.

[Proposal Concept]
With this feature, the standby in the backup center only has to switch its
connection destination and restart. It is therefore not necessary to stop the
new primary. There is no need to recover with pg_rewind or pg_basebackup,
because the standby in the backup center will not advance beyond the standby
in the main center.

In my idea, this feature is enabled when a new GUC parameter is set.
When a synchronous standby and an asynchronous standby are both connected to
the primary, the walsender checks whether WAL has been sent to the synchronous
standby before sending it to the asynchronous standby. Only after the
walsender confirms that the WAL has been sent to the synchronous standby does
it send that WAL to the asynchronous standby.
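
The ordering rule above could be sketched roughly as follows. This is only an
illustration of the idea, not walsender code: the Standby class, the function
names, and the integer WAL positions are all hypothetical. Two points are
assumptions of the sketch rather than part of the proposal: with several
synchronous standbys it uses the slowest one as the limit, and with no
synchronous standby connected it applies no limit.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    name: str
    synchronous: bool
    sent_lsn: int = 0  # highest WAL position already sent to this standby


def async_send_limit(standbys, primary_lsn):
    """Highest WAL position that may be sent to asynchronous standbys."""
    sync_sent = [s.sent_lsn for s in standbys if s.synchronous]
    if not sync_sent:
        # Assumption: without a synchronous standby there is nothing to wait for.
        return primary_lsn
    # Assumption: the slowest synchronous standby sets the limit.
    return min(primary_lsn, min(sync_sent))


def advance_async(standbys, primary_lsn):
    """Send WAL to asynchronous standbys, but never past the limit."""
    limit = async_send_limit(standbys, primary_lsn)
    for s in standbys:
        if not s.synchronous and s.sent_lsn < limit:
            s.sent_lsn = limit


standby1 = Standby("standby1", synchronous=True, sent_lsn=100)
standby2 = Standby("standby2", synchronous=False, sent_lsn=80)

# The primary has generated WAL up to 150, but standby1 has only been sent
# WAL up to 100, so standby2 is held back at 100.
advance_async([standby1, standby2], primary_lsn=150)
print(standby2.sent_lsn)
```

With this rule the asynchronous standby can lag behind the synchronous
standby but never gets ahead of it, which is the property the proposal
relies on.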

I would appreciate any comments on this feature.

Regards, 
Daisuke Higuchi 



