Re: Crash safe visibility map vs hint bits

From Jesper@Krogh.cc
Subject Re: Crash safe visibility map vs hint bits
Msg-id CCA5101C-C4D4-409F-8D3C-89F8E850E810@krogh.cc
In response to Re: Crash safe visibility map vs hint bits  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Crash safe visibility map vs hint bits  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On 4 Dec 2010 at 08:48, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> On 04.12.2010 09:14, Jesper@Krogh.cc wrote:
>> There has been a lot of discussion about index-only scans and how to make the visibility map crash safe. Then followed by a good discussion about hint bits.
>>
>> What seems to be the main concern is the added WAL volume, and it makes me wonder if there is a way in-between that looks more like hint bits.
>>
>> How about lazily WAL-logging the complete visibility map, say every X minutes or every N tuple updates, and making the WAL recovery job recheck the visibility of pages touched by the WAL stream on recovery?
>
> If you WAL-log the visibility map changes after-the-fact, it doesn't solve the race condition we're struggling with: the visibility map change might hit the disk before the PD_ALL_VISIBLE to the heap page. If you crash, you can end up with a situation where the PD_ALL_VISIBLE flag on the heap page is not set, but the bit in the visibility map is. Which causes serious issues later on.

My imagination is probably not as good, but suppose that at time A you WAL-log the complete map, and at A+1 you update a tuple so that the visibility bit is cleared, but the map bit change does not happen due to a crash. Then at WAL replay time you restore the map from time A, and if the tuple change at A+1 is represented in the WAL stream, then you also update the visibility map. Is this the situation where the heap tuple hit disk but the map is left in a broken state? Or is it a different, similar-looking situation?

The tuple change in the WAL stream will require the system to reinspect the page anyway, so there shouldn't be any additional disk I/O on replay due to this.
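
A minimal sketch of the replay idea, as a toy model rather than anything resembling the actual PostgreSQL code. The names WalRecord, VisibilityMap and recover are hypothetical, LSNs are plain integers, and instead of really rechecking visibility the sketch just conservatively clears the map bit of every page touched after the logged snapshot:

```python
from dataclasses import dataclass, field


@dataclass
class WalRecord:
    """Hypothetical WAL record: a tuple change on one heap page."""
    lsn: int
    page: int


@dataclass
class VisibilityMap:
    """Toy visibility map: the set of pages marked all-visible."""
    all_visible: set = field(default_factory=set)


def recover(snapshot, snapshot_lsn, wal):
    """Restore the map logged at time A, then clear the bit of every
    page touched by a WAL record written after that snapshot."""
    vm = VisibilityMap(set(snapshot.all_visible))
    for rec in sorted(wal, key=lambda r: r.lsn):
        if rec.lsn > snapshot_lsn:
            # Assumption: clearing the bit stands in for "recheck the page".
            vm.all_visible.discard(rec.page)
    return vm


# Time A (LSN 100): pages 1, 2 and 3 are all-visible; the whole map is logged.
snapshot = VisibilityMap({1, 2, 3})

# Time A+1 (LSN 101): a tuple on page 2 is updated, then we crash before
# the map bit change reaches disk.
wal = [WalRecord(lsn=101, page=2)]

recovered = recover(snapshot, 100, wal)
print(sorted(recovered.all_visible))  # [1, 3]
```

In this model the page updated at A+1 loses its bit during replay, because its change is in the WAL stream after the snapshot.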

Jesper

