Re: [HACKERS] make async slave to wait for lsn to be replayed
From: i.kartyshov@postgrespro.ru
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Date:
Msg-id: 0a26d4c3-b100-3274-cfda-f85940cfa974@postgrespro.ru
In reply to: Re: [HACKERS] make async slave to wait for lsn to be replayed (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
Hello, thank you for your comments on the main idea and the code.

On 13.03.2017 05:20, Thomas Munro wrote:

1)
> First, I'll restate my view of the syntax-vs-function question: I
> think an fmgr function is the wrong place to do this, because it
> doesn't work for our 2 higher isolation levels as mentioned. Most
> people probably use READ COMMITTED most of the time so the extension
> version you've proposed is certainly useful for many people and I like
> it, but I will vote against inclusion in core of any feature that
> doesn't consider all three of our isolation levels, especially if
> there is no way to extend support to other levels later. I don't want
> PostgreSQL to keep adding features that eventually force everyone to
> use READ COMMITTED because they want to use feature X, Y or Z.

Waiting for an LSN is expected to be used on a read-only hot standby,
and only READ COMMITTED and REPEATABLE READ are allowed there.

> Maybe someone can think of a clever way for an extension to insert a
> wait for a user-supplied LSN *before* acquiring a snapshot so it can
> work for the higher levels, or maybe the hooks should go into core
> PostgreSQL so that the extension can exist as an external project not
> requiring a patched PostgreSQL installation, or maybe this should be
> done with new core syntax that extends transaction commands. Do other
> people have views on this?

I think it is a good idea, but it is not yet clear to me how to do it
properly.

2)
> +wl_lsn_updated_hook(void)
> +{
> +    uint i;
> +    /*
> +     * After update lastReplayedEndRecPtr set Latches in SHMEM array
> +     */
> +    if (counter_waitlsn % count_waitlsn == 0
> +        || TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))
> +    {
>
> Doesn't this mean that if you are waiting for LSN 1234, and the
> primary sends that LSN and then doesn't send anything for another
> hour, a standby waiting in pg_waitlsn is quite likely to skip that
> update (either because of count_waitlsn or interval_waitlsn), and then
> finish up waiting for a very long time?
>
> I'm not sure if this is a good idea, but it's an idea: You could keep
> your update skipping logic, but make sure you always catch the
> important case where recovery hits the end of WAL, by invoking your
> callback from WaitForWALToBecomeAvailable. It could have a boolean
> parameter that means 'don't skip this one!'. In other words, it's OK
> to skip updates, but not if you know there is no more WAL available to
> apply (since you have no idea how long it will be for more to arrive).
>
> Calling GetCurrentTimestamp() at high frequency (after every WAL
> record is replayed) may be a bad idea. It's a slow system call on
> some operating systems. Can we use an existing timestamp for free,
> like recoveryLastXTime? Remembering that the motivation for using
> this feature is to wait for *whole transactions* to be replayed and
> become visible, there is no sensible reason to wait for random WAL
> positions inside a transaction, so if you used that then you would
> effectively skip all non-COMMIT records and also skip some COMMIT
> records that are coming down the pipe too fast.

Yes, I applied this idea and prepared a new patch.
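The idea, roughly (a simplified sketch, not the exact patch code: the
force flag is assumed to be passed in from WaitForWALToBecomeAvailable
when recovery runs out of WAL, wl_wakeup_waiters() is an illustrative
helper, and GetXLogReplayRecPtr() is the existing accessor for the last
replayed LSN; counter_waitlsn and count_waitlsn are the variables from
the excerpt above):

    static void
    wl_lsn_updated_hook(bool force)
    {
        /*
         * Keep the update-skipping logic to bound the per-record cost,
         * but never skip a forced call: recovery has reached the end of
         * the available WAL and the next update may be a long time away.
         */
        if (!force && ++counter_waitlsn % count_waitlsn != 0)
            return;

        /* Wake the backends whose target LSN has now been replayed. */
        wl_wakeup_waiters(GetXLogReplayRecPtr(NULL));
    }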
3)
> +static void
> +wl_own_latch(void)
> +{
> +    SpinLockAcquire(&state->l_arr[MyBackendId].slock);
> +    OwnLatch(&state->l_arr[MyBackendId].latch);
> +    is_latch_owned = true;
> +
> +    if (state->backend_maxid < MyBackendId)
> +        state->backend_maxid = MyBackendId;
> +
> +    state->l_arr[MyBackendId].pid = MyProcPid;
> +    SpinLockRelease(&state->l_arr[MyBackendId].slock);
> +}
>
> What is the point of using extra latches for this? Why not just use
> the standard latch? Syncrep and many other things do that. I'm not
> actually sure if there is ever a reason to create more latches in
> regular backends. SIGUSR1 will be delivered and set the main latch
> anyway.
>
> There are cases of specialised latches in the system, like the wal
> receiver latch, and I'm reliably informed that that solves problems
> like delivering a wakeup message without having to know which backend
> is currently the wal receiver (a problem we might consider solving
> today with a condition variable?) I don't think anything like that
> applies here.

In my case I create a separate shared latch for each backend; I'll
think about how to use the standard latches in an efficient way. As for
the specialised latches on the standby, they are already busy with WAL
replay.

4)
> +    for (i = 0; i <= state->backend_maxid; i++)
> +    {
> +        SpinLockAcquire(&state->l_arr[i].slock);
> +        if (state->l_arr[i].pid != 0)
> +            SetLatch(&state->l_arr[i].latch);
> +        SpinLockRelease(&state->l_arr[i].slock);
> +    }
>
> Once we get through the update-skipping logic above, we hit this loop
> which acquires spinlocks for every backend one after another and sets
> the latches of every backend, no matter whether they are waiting for
> the LSN that has been applied. Assuming we go with this
> scan-the-whole-array approach, why not include the LSN waited for in
> the array slots, so that we can avoid disturbing processes that are
> waiting for a later LSN?

Done.

> Could you talk a bit about the trade-off between this approach and a
> queue based approach? I would like to understand whether this really
> is the best way to do it.
> One way to use a queue would be
> ConditionVariableBroadcast(&state->lsn_moved), and then waiters would
> use a condition variable wait loop. That would make your code much
> simpler (you wouldn't even need your array of spinlocks + latches) but
> would still have the problem of processes waking up even though
> they're actually waiting for a later LSN. Another choice would be to
> create a custom wait list which actually holds the LSNs waited for in
> sorted order, so that we wake up exactly the right processes, cheaply,
> or in arbitrary order which makes insertion cheaper but search more
> expensive.

I'll think about how to implement waiting for an LSN with a queue in
the next patch; a rough sketch of the waiting side is below.
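For the condition variable variant, the waiting side could look roughly
like this (a minimal sketch, assuming a ConditionVariable lsn_moved in
the extension's shared state and an illustrative function name;
ConditionVariablePrepareToSleep(), ConditionVariableSleep() and
ConditionVariableCancelSleep() are the existing primitives from
storage/condition_variable.h):

    #include "access/xlog.h"                /* GetXLogReplayRecPtr() */
    #include "pgstat.h"                     /* PG_WAIT_EXTENSION */
    #include "storage/condition_variable.h"

    static void
    wl_wait_for_lsn(XLogRecPtr target)
    {
        ConditionVariablePrepareToSleep(&state->lsn_moved);

        /*
         * Recheck our own target after every broadcast; the startup
         * process would call ConditionVariableBroadcast(&state->lsn_moved)
         * after advancing lastReplayedEndRecPtr.
         */
        while (GetXLogReplayRecPtr(NULL) < target)
            ConditionVariableSleep(&state->lsn_moved, PG_WAIT_EXTENSION);

        ConditionVariableCancelSleep();
    }

As you say, every waiter still wakes up on every broadcast and has to
recheck its own target, so this is simpler than the latch array but not
cheaper than a sorted wait list.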
Thank you for your review.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company