Re: Inconsistent DB data in Streaming Replication

From: Samrat Revagade
Subject: Re: Inconsistent DB data in Streaming Replication
Date:
Message-ID: CAF8Q-GyF=vrm+WLHhCLtLtg0skb_LkZwFEWjhRvcG=iybFyzwg@mail.gmail.com
In reply to: Re: Inconsistent DB data in Streaming Replication  (Hannu Krosing <hannu@2ndQuadrant.com>)
Responses: Re: Inconsistent DB data in Streaming Replication  (Samrat Revagade <revagade.samrat@gmail.com>)
List: pgsql-hackers
>> it's one of the reasons why a fresh base backup is required when starting old master as new standby?
>> If yes, I agree with you. I've often heard the complaints about a backup when restarting new standby.
>> That's really big problem.

I think Fujii Masao is on the same page.

> In case of syncrep the master just waits for confirmation from standby before returning to client on
> commit.

> Not just commit, you must stop any *writing* of the wal records, effectively killing any parallelism.
> Main issue is that it will make *all* backends dependent on each sync commit, essentially serialising all
> backends commits, with the serialisation *including* the latency of roundtrip to client. With current
> sync streaming the other backends can continue to write wal, with proposed approach you cannot
> write any records after the one waiting an ACK from standby.

Let me rephrase the proposal more accurately.

Consider the following scenario:

(1) A client sends the "COMMIT" command to the master server.
(2) The master writes the commit WAL record to disk.
(3) The master writes the data page related to this transaction, i.e. via checkpoint or bgwriter.
(4) The master sends WAL records continuously to the standby, up to the commit WAL record.
(5) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
(6) The master returns a success indication to the client after it receives the ACK.

If failover happens between (3) and (4), the WAL and DB data on the old master are ahead of those on the new master. After failover, the new master continues running new transactions independently of the old master, so the WAL records and DB data become inconsistent between the two servers. To resolve these inconsistencies, a backup of the new master has to be taken and restored onto the new standby.

But taking a backup is not feasible for a large database (several TB) over a slow WAN.

So, to avoid this type of inconsistency without taking a fresh backup, we are thinking of doing the following:

>> I think that you can introduce GUC specifying whether this extra check is required to avoid a backup
>> when failback.

Approach:

Introduce a new GUC option specifying whether to prevent PostgreSQL from writing DB data before the corresponding WAL records have been replicated to the standby. That is, if this GUC option is enabled, PostgreSQL waits for the corresponding WAL records to be not only written to the local disk but also replicated to the standby before writing DB data.

So the process becomes as follows:

(1) A client sends the "COMMIT" command to the master server.
(2) The master writes the commit WAL record to disk.
(3) The master sends WAL records continuously to the standby, up to the commit WAL record.
(4) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
(5) *The master then forces a write of the data page related to this transaction.*
(6) The master returns a success indication to the client after it receives the ACK.

While the master is waiting to force a write (point 5) of this data page, streaming replication continues. Also, other data page writes do not depend on this particular page write, so the writing of data pages is not serialized.

Regards,
Samrat
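To make the proposed ordering concrete, here is a minimal sketch of the extra check as a toy Python model. It is not PostgreSQL code: `MasterState`, `can_write_page` and the `wait_for_standby` flag (standing in for the proposed GUC) are hypothetical names, and LSNs are modelled as plain integers.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Toy model of the WAL positions a master tracks (all names hypothetical)."""
    local_flush_lsn: int = 0         # WAL already written to the master's own disk
    standby_ack_lsn: int = 0         # WAL the standby has acknowledged writing
    wait_for_standby: bool = False   # stand-in for the proposed GUC option


def can_write_page(state: MasterState, page_lsn: int) -> bool:
    """Return True if a data page last modified at page_lsn may be written out."""
    # Existing write-ahead rule: the page's WAL must be on local disk first.
    if page_lsn > state.local_flush_lsn:
        return False
    # Proposed extra rule: the page's WAL must also have reached the standby.
    if state.wait_for_standby and page_lsn > state.standby_ack_lsn:
        return False
    return True


if __name__ == "__main__":
    state = MasterState(local_flush_lsn=100, standby_ack_lsn=80)
    for flag in (False, True):
        state.wait_for_standby = flag
        # Page at LSN 90 is ahead of the standby; page at LSN 70 is not.
        print(flag, can_write_page(state, 90), can_write_page(state, 70))
```

In this model only the page whose WAL has not yet been acknowledged is held back when the flag is on. A page at an older LSN can still be written, which is the sense in which data page writes are not serialized.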
