Re: Hot standby, conflict resolution
From | Simon Riggs |
---|---|
Subject | Re: Hot standby, conflict resolution |
Date | |
Msg-id | 1232982129.2327.1698.camel@ebony.2ndQuadrant |
In response to | Re: Hot standby, conflict resolution (Simon Riggs <simon@2ndQuadrant.com>) |
Responses | Re: Hot standby, conflict resolution |
List | pgsql-hackers |
On Sun, 2009-01-25 at 16:19 +0000, Simon Riggs wrote:
> On Fri, 2009-01-23 at 21:30 +0200, Heikki Linnakangas wrote:
>
> > Ok, then I think we have a little race condition. The startup process
> > doesn't get any reply indicating that the target backend has processed
> > the SIGINT and set the cached conflict LSN. The target backend might
> > move ahead using the old LSN for a little while, even though the startup
> > process has already gone ahead and replayed a vacuum record.
> >
> > Another tiny issue is that it looks like a new conflict LSN always
> > overwrites the old one. But you should always use the oldest conflicted
> > LSN in the checks, not the newest.
>
> That makes it easier, because it is either not set, or it is set and
> does not need to be reset as new conflict LSNs appear.
>
> I can see a simple scheme emerging, which I will detail tomorrow
> morning.

Rather than signalling, we could use a hasconflict boolean for each proc in a shared data structure. It can be read without spinlock, but should only be written while holding spinlock. Each time we read a block we check if hasconflict is set. If it is, we grab spinlock, recheck if it is set, if so read the conflict details, clear the flag and drop the spinlock.

The aim of this type of conflict resolution was to reduce the footprint of users that would be affected and defer it as much as possible. We've spent time getting the latestCompletedXid, but we know deriving that value is very difficult in the btree case at least. So what I would like to do is pass the relid of a conflict across as well and use that to reduce the footprint, now that we are performing the test inside the buffer manager.

We would keep a relid cache with a very small number of relids, perhaps just one, maybe as many as 4 or 8, so that we can fit relids and associated LSNs in a single cache line.
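For illustration only, the hasconflict check described above (read the flag unlocked, then recheck and consume it under the spinlock) might look roughly like the sketch below. This is not PostgreSQL code: `ProcConflictSlot`, `slot_post_conflict` and `slot_check_conflict` are hypothetical names, a C11 `atomic_flag` stands in for a real spinlock, and the LSN is assumed to be a plain 64-bit integer. The writer leaves an already-set conflict alone, so the oldest pending LSN is the one that is kept.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ConflictLSN;       /* assumption: LSN as a 64-bit integer */

/* Hypothetical per-backend slot in a shared structure. */
typedef struct ProcConflictSlot {
    atomic_flag lock;               /* toy spinlock, stand-in for the real one */
    atomic_bool hasconflict;        /* read unlocked, written only under lock */
    ConflictLSN conflict_lsn;       /* valid only while hasconflict is true */
} ProcConflictSlot;

static void slot_init(ProcConflictSlot *s)
{
    atomic_flag_clear(&s->lock);
    atomic_init(&s->hasconflict, false);
    s->conflict_lsn = 0;
}

static void slot_lock(ProcConflictSlot *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                           /* spin until the lock is free */
}

static void slot_unlock(ProcConflictSlot *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Startup-process side: record a conflict unless one is already pending. */
static void slot_post_conflict(ProcConflictSlot *s, ConflictLSN lsn)
{
    slot_lock(s);
    if (!atomic_load_explicit(&s->hasconflict, memory_order_relaxed)) {
        s->conflict_lsn = lsn;
        atomic_store_explicit(&s->hasconflict, true, memory_order_release);
    }
    /* else: keep the older LSN that is already pending */
    slot_unlock(s);
}

/* Backend side, called on each block read.  Returns true and fills *lsn
 * if a conflict was pending; the flag is cleared in that case. */
static bool slot_check_conflict(ProcConflictSlot *s, ConflictLSN *lsn)
{
    bool found = false;

    /* Cheap unlocked test: the common case is "no conflict". */
    if (!atomic_load_explicit(&s->hasconflict, memory_order_relaxed))
        return false;

    slot_lock(s);
    if (atomic_load_explicit(&s->hasconflict, memory_order_relaxed)) {
        *lsn = s->conflict_lsn;     /* read the conflict details */
        atomic_store_explicit(&s->hasconflict, false, memory_order_relaxed);
        found = true;
    }
    slot_unlock(s);
    return found;
}
```

In this sketch the backend would be expected to remember the first LSN it consumes, since a later call may return a newer one once the flag has been cleared.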
We can match the relid using a simple for loop, which we know is well optimised when there is no dependency between the elements of the loop and the loop has a compile-time fixed number of iterations.

I would be inclined to make this a separate shared memory area rather than try to weld that onto PGPROC. We could index that using backendid. If the relid cache overflows, we just apply a general LSN value.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
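As a rough sketch of the relid cache proposed in the message, assuming four entries, 32-bit relids and 64-bit LSNs (so the whole structure is 56 bytes and fits in a typical 64-byte cache line): all names here (`RelidConflictCache`, `cache_add`, `cache_lookup`) are hypothetical, and treating relid 0 and LSN 0 as "unset" is an assumption made for the example, not something the message specifies.

```c
#include <stdint.h>

#define RELID_CACHE_SIZE 4          /* compile-time fixed loop bound */
#define INVALID_RELID    0u         /* assumption: 0 marks an empty entry */
#define NO_LSN           0u         /* assumption: 0 means "no conflict" */

typedef uint32_t RelId;
typedef uint64_t CacheLSN;

/* One cache per backend; an array of these could be indexed by backend id. */
typedef struct RelidConflictCache {
    CacheLSN lsn[RELID_CACHE_SIZE];     /* 32 bytes */
    RelId    relid[RELID_CACHE_SIZE];   /* 16 bytes */
    CacheLSN general_lsn;               /* 8 bytes, set once the cache overflows */
} RelidConflictCache;

/* Record a conflict for relid.  Entries fill in order, and an existing
 * entry is never overwritten, so each relid keeps its oldest LSN. */
static void cache_add(RelidConflictCache *c, RelId relid, CacheLSN lsn)
{
    for (int i = 0; i < RELID_CACHE_SIZE; i++) {
        if (c->relid[i] == relid)
            return;                     /* already tracked: keep oldest */
        if (c->relid[i] == INVALID_RELID) {
            c->relid[i] = relid;
            c->lsn[i] = lsn;
            return;
        }
    }
    /* Overflow: fall back to a general LSN that applies to every relation. */
    if (c->general_lsn == NO_LSN)
        c->general_lsn = lsn;
}

/* Return the oldest conflict LSN that applies to relid, or NO_LSN.
 * The loop has a fixed trip count and no early exit. */
static CacheLSN cache_lookup(const RelidConflictCache *c, RelId relid)
{
    CacheLSN result = c->general_lsn;

    for (int i = 0; i < RELID_CACHE_SIZE; i++) {
        if (c->relid[i] == relid &&
            (result == NO_LSN || c->lsn[i] < result))
            result = c->lsn[i];
    }
    return result;
}
```

A zero-initialised `RelidConflictCache` is an empty cache. After four distinct relids have been added, a fifth sets `general_lsn`, and from then on `cache_lookup` reports a conflict for every relation, which is the overflow behaviour the message describes.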