Re: [HACKERS] make async slave to wait for lsn to be replayed

From: i.kartyshov@postgrespro.ru
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Msg-id: 0a26d4c3-b100-3274-cfda-f85940cfa974@postgrespro.ru
In reply to: Re: [HACKERS] make async slave to wait for lsn to be replayed  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
Hello, thank you for your comments on the main idea and the code.

On 13.03.2017 05:20, Thomas Munro wrote:
1)
> First, I'll restate my view of the syntax-vs-function question:  I
> think an fmgr function is the wrong place to do this, because it
> doesn't work for our 2 higher isolation levels as mentioned.  Most
> people probably use READ COMMITTED most of the time so the extension
> version you've proposed is certainly useful for many people and I like
> it, but I will vote against inclusion in core of any feature that
> doesn't consider all three of our isolation levels, especially if
> there is no way to extend support to other levels later.  I don't want
> PostgreSQL to keep adding features that eventually force everyone to
> use READ COMMITTED because they want to use feature X, Y or Z.

Waiting for an LSN is expected to be used on a read-only hot standby 
replica, and only READ COMMITTED and REPEATABLE READ are allowed on a 
hot standby.

> Maybe someone can think of a clever way for an extension to insert a
> wait for a user-supplied LSN *before* acquiring a snapshot so it can
> work for the higher levels, or maybe the hooks should go into core
> PostgreSQL so that the extension can exist as an external project not
> requiring a patched PostgreSQL installation, or maybe this should be
> done with new core syntax that extends transaction commands.  Do other
> people have views on this?

I think it is a good idea, but it is not yet clear to me how to do it 
properly.

2)
> +wl_lsn_updated_hook(void)
> +{
> +    uint i;
> +    /*
> +     * After update lastReplayedEndRecPtr set Latches in SHMEM array
> +     */
> +    if (counter_waitlsn % count_waitlsn == 0
> +        || 
> TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))
> +    {
> 
> Doesn't this mean that if you are waiting for LSN 1234, and the
> primary sends that LSN and then doesn't send anything for another
> hour, a standby waiting in pg_waitlsn is quite likely to skip that
> update (either because of count_waitlsn or interval_waitlsn), and then
> finish up waiting for a very long time?
> 
> I'm not sure if this is a good idea, but it's an idea:  You could keep
> your update skipping logic, but make sure you always catch the
> important case where recovery hits the end of WAL, by invoking your
> callback from WaitForWALToBecomeAvailable.  It could have a boolean
> parameter that means 'don't skip this one!'.  In other words, it's OK
> to skip updates, but not if you know there is no more WAL available to
> apply (since you have no idea how long it will be for more to arrive).
> 
> Calling GetCurrentTimestamp() at high frequency (after every WAL
> record is replayed) may be a bad idea.  It's a slow system call on
> some operating systems.  Can we use an existing timestamp for free,
> like recoveryLastXTime?  Remembering that the motivation for using
> this feature is to wait for *whole transactions* to be replayed and
> become visible, there is no sensible reason to wait for random WAL
> positions inside a transaction, so if you used that then you would
> effectively skip all non-COMMIT records and also skip some COMMIT
> records that are coming down the pipe too fast.

Yes, I applied this idea and prepared a new patch.

3)
> +static void
> +wl_own_latch(void)
> +{
> +    SpinLockAcquire(&state->l_arr[MyBackendId].slock);
> +    OwnLatch(&state->l_arr[MyBackendId].latch);
> +    is_latch_owned = true;
> +
> +    if (state->backend_maxid < MyBackendId)
> +        state->backend_maxid = MyBackendId;
> +
> +    state->l_arr[MyBackendId].pid = MyProcPid;
> +    SpinLockRelease(&state->l_arr[MyBackendId].slock);
> +}
> 
> What is the point of using extra latches for this?  Why not just use
> the standard latch?  Syncrep and many other things do that.  I'm not
> actually sure if there is ever a reason to create more latches in
> regular backends.  SIGUSR1 will be delivered and set the main latch
> anyway.
> 
> There are cases of specialised latches in the system, like the wal
> receiver latch, and I'm reliably informed that that solves problems
> like delivering a wakeup message without having to know which backend
> is currently the wal receiver (a problem we might consider solving
> today with a condition variable?)  I don't think anything like that
> applies here.

In my case I create a shared latch for each backend. I'll think about 
how to use the standard latches in an efficient way. As for the 
specialized latches: on a standby they are already busy with WAL-replay 
work.

4)
> +        for (i = 0; i <= state->backend_maxid; i++)
> +        {
> +            SpinLockAcquire(&state->l_arr[i].slock);
> +            if (state->l_arr[i].pid != 0)
> +                SetLatch(&state->l_arr[i].latch);
> +            SpinLockRelease(&state->l_arr[i].slock);
> +        }
> 
> Once we get through the update-skipping logic above, we hit this loop
> which acquires  spinlocks for every backend one after another and sets
> the latches of every backend, no matter whether they are waiting for
> the LSN that has been applied.  Assuming we go with this
> scan-the-whole-array approach, why not include the LSN waited for in
> the array slots, so that we can avoid disturbing processes that are
> waiting for a later LSN?

Done.

> Could you talk a bit about the trade-off between this approach and a
> queue based approach?  I would like to understand whether this really
> is the best way to do it.
> One way to use a queue would be
> ConditionVariableBroadcast(&state->lsn_moved), and then waiters would
> use a condition variable wait loop.  That would make your code much
> simpler (you wouldn't even need your array of spinlocks + latches) but
> would still have the problem of processes waking up even though
> they're actually waiting for a later LSN.  Another choice would be to
> create a custom wait list which actually holds the LSNs waited for in
> sorted order, so that we wake up exactly the right processes, cheaply,
> or in arbitrary order which makes insertion cheaper but search more
> expensive.

I'll think about how to implement waiting for an LSN with a queue in 
the next patch.

Thank you for your review.

-- 
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



