Re: Inadequate thought about buffer locking during hot standby replay

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Inadequate thought about buffer locking during hot standby replay
Дата
Msg-id 50A0B1E3.5050902@vmware.com
обсуждение исходный текст
Ответ на Re: Inadequate thought about buffer locking during hot standby replay  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Inadequate thought about buffer locking during hot standby replay  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 12.11.2012 01:25, Tom Lane wrote:
> Worse than that, GIST WAL replay seems to be a total disaster from this
> standpoint, and I'm not convinced that it's even fixable without
> incompatible WAL changes.  There are several cases where index
> insertion operations generate more than one WAL record while holding
> exclusive lock on buffer(s).  If the lock continuity is actually
> necessary to prevent concurrent queries from seeing inconsistent index
> states, then we have a big problem, because WAL replay isn't designed to
> hold any buffer lock for more than one WAL action.

Hmm. After the rewrite of that logic I did in 9.1 (commit 9de3aa65f0), 
the GiST index is always in a consistent state. Before the downlink for 
a newly split page has been inserted yet, its left sibling is flagged so 
that a search knows to follow the right-link to find it. In normal 
operation, the lock continuity is needed to prevent another backend from 
seeing the incomplete split and trying to fix it, but in hot standby, we 
never try to fix incomplete splits anyway.

So I think we're good on >= 9.1. The 9.0 code is broken, however. In 
9.0, when a child page is split, the parent and new children are kept 
locked until the downlinks are inserted/updated. If a concurrent scan 
comes along and sees that incomplete state, it will miss tuples on the 
new right siblings. We rely on a rm_cleanup operation at the end of WAL 
replay to fix that situation, if the downlink insertion record is not 
there. I don't see any easy way to fix that, unfortunately. Perhaps we 
could backpatch the 9.1 rewrite, now that it's gotten some real-world 
testing, but it was a big change so I don't feel very comfortable doing 
that.

> Also, gistRedoPageDeleteRecord seems to be dead code?  I don't see
> anything that emits XLOG_GIST_PAGE_DELETE.

Yep. It was used in VACUUM FULL, but has been dead since VACUUM FULL was 
removed.

- Heikki



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Smith
Дата:
Сообщение: Re: Proposal for Allow postgresql.conf values to be changed via SQL
Следующее
От: Markus Wanner
Дата:
Сообщение: Re: Enabling Checksums