Re: Keepalive for max_standby_delay

From: Tom Lane
Subject: Re: Keepalive for max_standby_delay
Date:
Message-ID: 10389.1275506064@sss.pgh.pa.us
In reply to: Re: Keepalive for max_standby_delay  (Greg Stark <gsstark@mit.edu>)
Responses: Re: Keepalive for max_standby_delay  (Greg Stark <gsstark@mit.edu>)
List: pgsql-hackers
Greg Stark <gsstark@mit.edu> writes:
> On Wed, Jun 2, 2010 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I believe that the motivation for treating archived timestamps as live
>> is, essentially, to force rapid catchup if a slave falls behind so far
>> that it's reading from archive instead of SR.

> Huh, I think this is the first mention of this that I've seen. I
> always assumed the motivation was just that you wanted to control how
> much data loss could occur on failover and how long recovery would
> take. I think separating the two delays is an interesting idea but I
> don't see how it counts as a simplification.

Well, it isn't a simplification: it's bringing it up to the minimum
complication level where it'll actually work sanely.  The current
implementation doesn't work sanely because it confuses stale timestamps
read from WAL with real live time.

> This also still allows a slave to become arbitrarily far behind the
> master.

Indeed, but nothing we do can prevent that, if the slave is just plain
slower than the master.  You have to assume that the slave is capable of
keeping up in the absence of query-caused delays, or you're hosed.

The real reason this is at issue is the fact that the max_standby_delay
kill mechanism applies to certain buffer-level locking operations.
On the master we just wait, and it's not a problem, because in practice
the conflicting queries almost always release these locks pretty quickly.
On the slave, though, instant kill as a result of a buffer-level lock
conflict would result in a very serious degradation in standby query
reliability (while also doing practically nothing for the speed of WAL
application, most of the time).  This morning on the phone Bruce and I
were seriously discussing the idea of ripping the max_standby_delay
mechanism out of the buffer-level locking paths, and just letting them work
like they do on the master, i.e., wait forever.  If we did that then
simplifying max_standby_delay to a boolean would be reasonable again
(because it really would only come into play for DDL on the master).
The sticking point is that once in a blue moon you do have a conflicting
query sitting on a buffer lock for a long time, or even more likely a
series of queries keeping the WAL replay process from obtaining a buffer
cleanup lock.

So it seems that we have to have max_standby_delay-like logic for those
locks, and also that zero grace period before kill isn't a very practical
setting.  However, there isn't a lot of point in obsessing over exactly
how long the grace period ought to be, as long as it's more than a few
milliseconds.  It *isn't* going to have any real effect on whether the
slave can stay caught up.  You could make a fairly decent case for just
measuring the grace period from when the replay process starts to wait,
as I think I proposed a while back.  The value of measuring delay from a
receipt time is that if you do happen to have a bunch of delays within
a short interval you'll get more willing to kill queries --- but I
really believe that that is a corner case and will have nothing to do
with ordinary performance.
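
To make the two measurement choices above concrete, here is a minimal sketch in Python. It is an illustration only, not PostgreSQL code: `ReplayWait`, `deadline_from_wait_start`, `deadline_from_receipt`, and `remaining_grace` are hypothetical names, and the times are plain seconds chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ReplayWait:
    """Hypothetical record of one replay stall on the standby (times in seconds)."""
    receipt_time: float  # when the WAL currently being replayed was received
    wait_start: float    # when replay began waiting on the conflicting lock


def deadline_from_wait_start(wait: ReplayWait, grace: float) -> float:
    # Every stall gets the full grace period, regardless of earlier stalls.
    return wait.wait_start + grace


def deadline_from_receipt(wait: ReplayWait, grace: float) -> float:
    # The budget is shared by all stalls that hit WAL received at the same time.
    return wait.receipt_time + grace


def remaining_grace(deadline: float, now: float) -> float:
    # Time a conflicting query is still allowed to run; 0 means cancel now.
    return max(0.0, deadline - now)


# WAL received at t=100; replay has already lost 25 s to earlier conflicts
# and starts waiting again at t=125, with a 30 s grace period.
stall = ReplayWait(receipt_time=100.0, wait_start=125.0)
from_wait = remaining_grace(deadline_from_wait_start(stall, 30.0), now=125.0)
from_receipt = remaining_grace(deadline_from_receipt(stall, 30.0), now=125.0)
```

Under these assumed numbers the wait-start rule still grants the full 30 seconds, while the receipt-time rule leaves only 5 seconds. That is the "more willing to kill queries" effect after a burst of delays; for a single isolated stall the two rules give the same answer.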

> I propose an alternate way out of the problem of syncing two clocks.
> Instead of comparing timestamps compare time intervals. So as it reads
> xlog records it only ever compares the master timestamps with previous
> master timestamps to determine how much time has elapsed on the
> master. It compares that time elapsed with the time elapsed on the
> slave to determine if it's falling behind.
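
The interval comparison proposed in the quoted text could look roughly like the following Python sketch. It is an illustration under assumptions, not code from any patch: `IntervalLagTracker` and its fields are hypothetical, and timestamps are plain seconds.

```python
class IntervalLagTracker:
    """Hypothetical helper: estimates standby lag from elapsed intervals only."""

    def __init__(self, first_master_ts: float, first_local_ts: float) -> None:
        # Baseline pair: a master timestamp read from WAL and the local time
        # at which the standby replayed that record.
        self.base_master = first_master_ts
        self.base_local = first_local_ts

    def lag(self, master_ts: float, local_ts: float) -> float:
        # Master timestamps are only compared with master timestamps, and
        # local times with local times, so a constant clock offset cancels.
        master_elapsed = master_ts - self.base_master
        local_elapsed = local_ts - self.base_local
        return local_elapsed - master_elapsed


# The master clock reads 4000 s ahead of the standby clock in this example.
tracker = IntervalLagTracker(first_master_ts=5000.0, first_local_ts=1000.0)
on_pace = tracker.lag(master_ts=5010.0, local_ts=1010.0)
behind = tracker.lag(master_ts=5010.0, local_ts=1025.0)
```

Note that the result is only lag relative to the baseline pair: any delay that already existed when the baseline was taken is invisible to this calculation, so the choice of baseline matters.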

I think this would just add complexity and uncertainty, to deal with
something that won't be much of a problem in practice.
        regards, tom lane

