Snapshot synchronization, again...
From | Joachim Wieland |
---|---|
Subject | Snapshot synchronization, again... |
Date | |
Msg-id | AANLkTi=XPm4FGRjzdQtWnHxJ8Su_tE+x59Yv_bYwDuaR@mail.gmail.com |
Responses |
Re: Snapshot synchronization, again... (Alvaro Herrera <alvherre@commandprompt.com>)
Re: Snapshot synchronization, again... (Florian Pflug <fgp@phlo.org>)
Re: Snapshot synchronization, again... (Joachim Wieland <joe@mcknight.de>) |
List | pgsql-hackers |
The snapshot synchronization discussion from the parallel pg_dump patch somehow ended without a clear way forward. Let me sum up what has been brought up and propose a short- and long-term solution.

Summary:

Passing snapshot sync information can be done either:

a) by returning complete snapshot information from the backend to the client, so that the client can pass it along to a different backend, or
b) by returning only a unique identifier to the client and storing the actual snapshot data somewhere on the server side.

Advantage of a: No memory is used in the backend and no memory needs to be cleaned up. It is also theoretically possible to forward that data to a hot standby server and, for example, run a dump partially on the master server and partially on the hot standby server, or spread it among several hot standby servers.

Disadvantage of a: The snapshot must be validated to make sure that its information is still current, and it might be difficult to cover all cases in this validation. A client can access not only exactly the published snapshot, but just about any snapshot that fits and passes the validation checks (this is more a disadvantage than an advantage, because it allows a client to see a database state that never existed in reality).

Advantage of b: No validation is necessary. As soon as the transaction that publishes the snapshot loses that snapshot, it also revokes the snapshot information (either by removing a temp file or by deleting it from shared memory).

Disadvantage of b: It doesn't allow a snapshot to be installed on a different server. It requires an open serializable transaction to hold the snapshot.

What I am proposing now is the following:

We return snapshot information as a chunk of data to the client. At the same time, however, we set a checksum in shared memory to protect against modification of the snapshot. A publishing backend can revoke its snapshot by deleting the checksum, and a backend that is asked to install a snapshot can verify that the snapshot is correct and current by calculating the checksum and comparing it with the one in shared memory.

This only costs us a few bytes for the checksum * max_connections in shared memory, and apart from resetting the checksum it does not have cleanup and verification issues. Note that we are also free to change the internal format of the chunk of data we return whenever we like, so we are free to enhance this feature in the future, transparently to the client.

Thoughts?

Joachim
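
To make the proposed publish / revoke / verify cycle concrete, here is a minimal sketch in Python. It is not PostgreSQL code: the class `SharedChecksumTable`, the helper names, the chunk layout (backend id, xmin, xmax, list of in-progress xids) and the choice of BLAKE2 as the checksum are all assumptions made for illustration. A plain Python list stands in for the shared-memory array with one slot per connection.

```python
import hashlib
import hmac
import struct

MAX_CONNECTIONS = 4   # assumption: stands in for max_connections
DIGEST_SIZE = 16      # assumption: "a few bytes" per slot


def serialize_snapshot(backend_id, xmin, xmax, xip):
    """Pack a hypothetical snapshot into an opaque chunk for the client."""
    header = struct.pack("<IIII", backend_id, xmin, xmax, len(xip))
    return header + struct.pack(f"<{len(xip)}I", *xip)


def checksum(chunk):
    """Checksum over the whole chunk (BLAKE2b chosen only for the sketch)."""
    return hashlib.blake2b(chunk, digest_size=DIGEST_SIZE).digest()


class SharedChecksumTable:
    """Simulates the shared-memory area: one checksum slot per backend."""

    def __init__(self, slots=MAX_CONNECTIONS):
        self._slots = [None] * slots

    def publish(self, backend_id, xmin, xmax, xip):
        """Store the checksum server-side and hand the chunk to the client."""
        chunk = serialize_snapshot(backend_id, xmin, xmax, xip)
        self._slots[backend_id] = checksum(chunk)
        return chunk

    def revoke(self, backend_id):
        """Called when the publishing transaction loses its snapshot."""
        self._slots[backend_id] = None

    def verify(self, chunk):
        """True only if the chunk is unmodified and still published."""
        if len(chunk) < 4:
            return False
        (backend_id,) = struct.unpack_from("<I", chunk)
        if backend_id >= len(self._slots):
            return False
        stored = self._slots[backend_id]
        if stored is None:  # never published, or already revoked
            return False
        return hmac.compare_digest(stored, checksum(chunk))
```

In this sketch a backend asked to install a snapshot would call `verify(chunk)` first and refuse the chunk if it returns false. A modified chunk fails because its checksum no longer matches the stored one, and a chunk whose publisher has called `revoke` fails because the slot is empty.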