RE: Time delayed LR (WAS Re: logical replication restrictions)
| От | Hayato Kuroda (Fujitsu) | 
|---|---|
| Тема | RE: Time delayed LR (WAS Re: logical replication restrictions) | 
| Дата | |
| Msg-id | TYAPR01MB586649419227CFDC148CEF75F554A@TYAPR01MB5866.jpnprd01.prod.outlook.com обсуждение исходный текст | 
| Ответ на | RE: Time delayed LR (WAS Re: logical replication restrictions) ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) | 
| Ответы | RE: Time delayed LR (WAS Re: logical replication restrictions) | 
| Список | pgsql-hackers | 
Dear hackers, At PGcon and other places we have discussed the time-delayed logical replication, but now we have understood that there are no easy ways. Followings are our analysis. # Abstract To implement the time-dealyed logical replication for more proper approach, the worker must serialize all the received messages into permanent files. But PostgreSQL does not have good infrastructures for the purpose so huge engineering is needed. ## Review: problem of without-file approach In the without-file approach, the apply worker process sleeps while delaying the application. This approach is chosen in earlier versions like [1], but it contains problems which was shared by Sawada-san [2]. They lead the PANIC error due to the disk full. A) WALs cannot be recycled on publisher because they are not flushed on subscriber. B) Moreover, vacuuming cannot remove dead tuples on publisher. ## Alternative approach: serializing messages to files To prevent any potential issues, the worker should serialize all incoming messages to a permanent file, like what the physical walreceiver does. Here, messages are first written into files at the beginning of transactions and then flushed at the end. This approach could slove problem a), b), but it still has many considerations and difficulties. ### How to separate messages into files? There are two possibilities for dividing messages into files, but neither of them is ideal. 1. Create a file per received transaction. In this case files will be removed after the delay-period is exceeded and it is applied. This is the simplest approach, but the number of files is bloat. 2. Use one large file or segmented file (like WAL). This can reduce the number of files, but we must consider further things: A) Purge – We must purge the applied transaction, but we do not have a good way to remove one transaction from the large file. B) 2PC – It is more likely to occur that the segment which contains the actual transaction differs from the segment where COMMIT PREPARED. Hence the worker must check all the segments to find the actual messages from them. C) Streamed in-progress transactions - chunks of transactions are separated into several segments. Hence the worker must check all the segments to find chunks messages from them, same as above. ### Handle the case when the file exceeds the limitation Regardless of the option chosen from the ones mentioned above, there is a possibility that the file size could exceed the file system's limit. This can occur as the publisher can send transactions of any length. PostgreSQL provides a mechanism for working with such large files - BufFile data structure, but it could not be used as-is for several reasons: A) It only supports the buffered-I/O. A read or write of the low-level File occurs only when the buffer is filled or emptied. So, we cannot control when it is persisted. B) It can be used only for temporary purpose. Internally the BufFile creates some physical files into $PGDATA/base/pgsql_tmp directories, and files in the subdirectory will be removed when postmaster restarts. C) It does not have mechanisms for restoring information after the restart. BufFile contains virtual positions such as file index and offset, but these fields are stored in a memory structure, so the BufFile will forget the ordering of files and its initial/final position after restarts. D) It cannot remove a part of virtual file. Even if a large file is separated into multiple physical files and all transactions in a physical file are already applied, BufFile cannot remove only one part. [1]: https://www.postgresql.org/message-id/f026292b-c9ee-472e-beaa-d32c5c3a2ced%40www.fastmail.com [2]: https://www.postgresql.org/message-id/CAD21AoAeG2+RsUYD9+mEwr8-rrt8R1bqpe56T2D=euO-Qs-GAg@mail.gmail.com Acknowledgement: Amit, Peter, Sawada-san Thank you for discussing with me off-list. Best Regards, Hayato Kuroda FUJITSU LIMITED
В списке pgsql-hackers по дате отправления: