Re: Sync Rep: First Thoughts on Code

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Sync Rep: First Thoughts on Code
Дата
Msg-id 1228244886.14591.45.camel@dell.linuxdev.us.dell.com
обсуждение исходный текст
Ответ на Re: Sync Rep: First Thoughts on Code  (Simon Riggs <simon@2ndQuadrant.com>)
Ответы Re: Sync Rep: First Thoughts on Code  (Josh Berkus <josh@agliodbs.com>)
Re: Sync Rep: First Thoughts on Code  ("Fujii Masao" <masao.fujii@gmail.com>)
Список pgsql-hackers
On Tue, 2008-12-02 at 13:09 +0000, Simon Riggs wrote:
> > Is it dangerous to abort the transaction with replication continued when
> > the timeout occurs? I think that the WAL consistency between two servers
> > might be broken. Because the WAL writing and sending are done concurrently,
> > and the backend might already write the WAL to disk on the primary when
> > waiting for walsender.
> 
> The issue I see is that we might want to keep wal_sender_delay small so
> that transaction times are not increased. But we also want
> wal_sender_delay high so that replication never breaks. It seems better
> to have the action on wal_sender_delay configurable if we have an
> unsteady network (like the internet). Marcus made some comments on line
> dropping that seem relevant here; we should listen to his experience.
> 
> Hmmm, dangerous? Well assuming we're linking commits with replication
> sends then it sounds it. We might end up committing to disk and then
> deciding to abort instead. But remember we don't remove the xid from
> procarray or mark the result in clog until the flush is over, so it is
> possible. But I think we should discuss this in more detail when the
> main patch is committed.
> 

What is the "it" in "it is possible"? It seems like there's still a
problem window in there.

Even if that could be made safe, in the event of a real network failure,
you'd just wait the full timeout every transaction, because it still
thinks it's replicating.

If the timeout is exceeded, it seems more reasonable to abandon the
slave until you could re-sync it and continue processing as normal. As
you pointed out, that's not necessarily an expensive operation because
you can use something like rsync. The process of re-syncing might be
made easier (or perhaps less costly), of course.

If we want to still allow processing to happen after a timeout, it seems
reasonable to have a configurable option to allow/disallow non-read-only
transactions when out of sync. 

Regards,Jeff Davis



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: Sync Rep: First Thoughts on Code
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: pg_stop_backup wait bug fix