Re: Gist Recovery testing

Поиск
Список
Период
Сортировка
От Teodor Sigaev
Тема Re: Gist Recovery testing
Дата
Msg-id 42AEC3D9.5060107@sigaev.ru
обсуждение исходный текст
Ответ на Re: Gist Recovery testing  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Gist Recovery testing  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Gist Recovery testing  (Stefan Kaltenbrunner <stefan@kaltenbrunner.cc>)
Список pgsql-hackers
> btree manages to avoid calling the index support functions during WAL
> restore --- can't you do the same?  Perhaps you need to be including
> more information in the initial xlog records, so that the cleanup step
> has everything it needs.  Or resort to brute-force search (which is more
> or less what btree does).  I don't think this operation needs to be very
> efficient, since it's a corner case that should only seldom be invoked.

I've just commited WALogging for GiST. It works for online backup  (normal 
recovery) and mostly after crash, but in this case it can't restore inclompleted  inserts although it can detected and
sayto log thet it's needed to reindex 
 
GiST index.

The problem with incopmleted inserts is: when new entry is installed into leaf 
page, all chain (in a worst case) of keys from root page to leaf should be 
updated. Of course, case of updating all chain is rarely and usially it's 
updated only part. Each key on inner pages contains union of keys (bounding box 
in a case of rtree, for example) on page which it points. This union can formed 
only with a help of user-defined function of opclass, because of GiST doesn't 
know something about nature of keys. Returning to WAL, GiST core write xlog 
entry with all nessary information for restoration before write page, but in 
this moment it doesn't know it should update keys on parent page or key is 
unchanged.  So GiST's WAL restoration code should remember this page's update as 
incompleted insert. When insert complited, GiST's core write to log that insert 
is completed and restore code can clean up stored incompleted insert. If it was 
crash, then sign of completed insert can be absent in log, and GiST's restore 
code should continue it. While continue, it's know which page was changed and 
should walk up to root. On each step of walk it should form union for page and 
insert it to parent.

This is absent in btree, because of btree is sorted, so parent's key simply is 
right on child page and xlog code doesn't need to call any support function.


Which ways I see:
1 Leave it as is. Incomplited inserts isn't often case, and code can detect it 
to suggest reindex.
2 In principle, its possible to collect all information about changes per insert 
in the tree in one WAL record. But itsn't good solution, because of it needs to 
ugly coding of inserts: collect all pages and write it after pushing WAL record. 
Second, this manner kills using LSN for page control for concurrency. But it 
fully solve problem with incomplete insert.
3 Construct relation by XLogOpenRelation some more full than now. In fact, 
restore code needs only 2 thing about index: index->rd_att and information for 
working index_getprocinfo. I understand why idea to calling any support 
functions during WAL recovery is bad.
4 Add one hook to finalyze WAL code. This hook should be called when system 
catalog is maded accessible.
5 There is one unused bit in IndexTupleData. GiST code can use it as mark that 
this tuple contains incorrect union key. In this case, GiST search algorithm 
should keep it mind that such tuple has incorrect value and always should go to 
it's child to check. During insertions in tree such tuples will be little by 
little changed to correct. Defect of this scheme will be a possible performance 
degradation after crash recovery. In other hand, it's possible to add recreation 
such tuples to vacuum code (in gistbulkdelete).





-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Oleg Bartunov
Дата:
Сообщение: Re: Best way to scan on-disk bitmaps
Следующее
От: Jan Wieck
Дата:
Сообщение: Re: The Contrib Roundup (long)