Thread: [PATCH] Add cascade synchronous replication



From: Grigoriy Novikov
Date:
Hello hackers,

Introduction
Using a large number of synchronous standbys creates excessive load on the primary node. Cascading synchronous replication can be used to solve this problem by moving part of that load onto intermediate standbys.

Overview of Changes
This patch adds cascading synchronous replication mechanics to PostgreSQL. With it, standby servers take the synchronous-replication configuration parameters into account: a standby reads the LSN positions of its downstream walsenders from the walsender data structures, computes the synchronous write, flush, and apply positions among them using the synchronous replication algorithm, and then takes the minimum of each of those values and the standby's own corresponding positions. To avoid synchronization problems and unnecessary overhead, these calculations are performed by the walreceiver process, and the resulting positions are transmitted in the standby reply message instead of the server's own positions. This happens only if SyncRepRequested is satisfied and at least one synchronous standby is listed in synchronous_standby_names.
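For illustration, a minimal cascading chain might be configured as follows (host and application names are hypothetical; the parameters themselves are standard PostgreSQL settings):

```
# postgresql.conf on the primary
synchronous_standby_names = 'standby1'

# postgresql.conf on the intermediate standby "standby1"
primary_conninfo = 'host=primary application_name=standby1'
synchronous_standby_names = 'standby2'

# postgresql.conf on the leaf standby "standby2"
primary_conninfo = 'host=standby1 application_name=standby2'
```

With this patch, standby1's walreceiver folds standby2's confirmed positions into the replies it sends to the primary.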
If the synchronous LSN values cannot be computed (for example, because not enough synchronous standbys are connected), the server sends DefaultSendingLSN instead. This value lies between InvalidXLogRecPtr and FirstNormalUnloggedLSN. Sending InvalidXLogRecPtr is not allowed, because pg_stat_replication would then display such a standby as asynchronous even though it is not. The value 2 was chosen for DefaultSendingLSN because 1 is already used by one of the access methods as a fake LSN.
When the primary receives a DefaultSendingLSN position from a synchronous standby, it treats it as a regular LSN. This allows transaction execution to continue if the configuration permits it; otherwise, transaction execution stops until the failure in the cluster is resolved.

Overview of Individual Patch Parts
The first part adds the SyncRepGetSendingSyncRecPtr function, modeled on SyncRepGetSyncRecPtr, which is responsible for calculating the LSN positions to be sent. The two functions shared a large block of common code, which was factored out into SyncRepGetSyncRecPtrBySyncRepMethod. In addition, as an optimization, a walsender process serving a synchronous standby can call the WalRcvForceReply function.
The second part of the patch reorganizes the code in syncrep.c into sections. This is necessary to preserve the semantics of the sections used in that file, since some functions can now be used by the walreceiver process, while others can be used by both walreceiver and walsender.
The third part adds a special notation in pg_stat_replication for standbys sending DefaultSendingLSN: if such a standby is synchronous, it is marked with a "?" symbol. In the author's opinion, this notation can simplify diagnosing problems in the cluster, but it is not intended as a serious failure-detection mechanism.
The fourth part of the patch fixes recovery tests 9 and 12. These tests created circular dependencies between servers, which was not a problem as long as standbys ignored synchronous replication parameters, but with this patch the tests broke. Tests for the new mechanics were also added to test 7, which covers synchronous replication.

Possible Topologies
The patch allows both asynchronous and synchronous standbys to connect to a synchronous standby. However, positions sent by asynchronous standbys are not taken into account, since the synchronous replication algorithm is used. For the same reason, connecting a synchronous standby to an asynchronous one is technically possible but pointless.

Additional Information
The patch contains no platform-dependent elements, compiles with the -Wall flag, and passes the test suite. Performance optimization is a separate task and, in the author's opinion, deserves a separate patch. Nevertheless, local testing in Docker containers showed only negligible performance degradation with cascading synchronous chains.
This patch is intended primarily for discussion. It was developed for the master branch, commit hash: b227b0bb4e032e19b3679bedac820eba3ac0d1cf.
Best wishes,
Grigoriy Novikov
Attachments