Re: Hash Indexes

From Amit Kapila
Subject Re: Hash Indexes
Date
Msg-id CAA4eK1L=OCH2-Vh1JXLBzktfjHMOkchZ3jSL3cvAXGXc-B9AqA@mail.gmail.com
In response to Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Hash Indexes
Re: Hash Indexes
List pgsql-hackers
On Tue, May 10, 2016 at 5:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Incomplete Splits
--------------------------
Incomplete splits can be completed either by vacuum or by insert, as both need an exclusive lock on the bucket.  If vacuum finds the split-in-progress flag on a bucket, it will complete the split operation.  Vacuum won't see this flag while a split is actually in progress on that bucket, because vacuum needs a cleanup lock and the split retains its pin until the end of the operation.  To make this work for insert, one simple idea is that if insert finds the split-in-progress flag, it releases its current exclusive lock on the bucket and tries to acquire a cleanup lock on the bucket.  If it gets the cleanup lock, it can complete the split and then insert the tuple; otherwise it takes the exclusive lock on the bucket again and just inserts the tuple.  The disadvantage of completing the split in vacuum is that the split might require new pages, and allocating new pages during vacuum is not advisable.  The disadvantage of doing it at insert time is that insert might skip it if a scan is in progress on the bucket, as the scan will also retain a pin on the bucket, but I think that is not a big deal.  The actual completion of the split can be done in two ways: (a) scan the new bucket and build a hash table with all of the TIDs found there; then, when copying tuples from the old bucket, first probe the hash table and skip any tuple for which a match is found (idea suggested by Robert Haas off-list); (b) delete all the tuples that are marked as moved_by_split in the new bucket and perform the split operation from the beginning using the old bucket.
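To make option (a) concrete, here is a minimal sketch in Python.  It is only a toy model, not code from the patch: buckets are plain lists, and the names IndexTuple, finish_split and belongs_in_new are hypothetical, chosen for illustration.

```python
from typing import Callable, List, NamedTuple, Tuple


class IndexTuple(NamedTuple):
    """Toy index entry (hypothetical type, not a PostgreSQL structure)."""
    hash_key: int
    tid: Tuple[int, int]  # (heap block number, offset) of the heap row


def finish_split(old_bucket: List[IndexTuple],
                 new_bucket: List[IndexTuple],
                 belongs_in_new: Callable[[int], bool]) -> int:
    """Resume an interrupted split without duplicating entries.

    Returns the number of tuples copied by this call.
    """
    # Step 1: collect the TIDs that the interrupted split already moved.
    already_moved = {t.tid for t in new_bucket}

    copied = 0
    # Step 2: walk the old bucket and copy only what is still missing.
    for t in old_bucket:
        if not belongs_in_new(t.hash_key):
            continue  # tuple stays in the old bucket
        if t.tid in already_moved:
            continue  # copied before the split was interrupted
        new_bucket.append(t)
        already_moved.add(t.tid)
        copied += 1
    return copied
```

In this sketch the old bucket is left untouched; removing the copied tuples from it is a separate step that needs a cleanup lock.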


I have completed the patch with respect to incomplete splits and delayed cleanup of garbage tuples.  For incomplete splits, I have used option (a) as mentioned above.  An incomplete split is completed if an insertion sees the split-in-progress flag on a bucket.  The second major thing this new version of the patch achieves is cleanup of garbage tuples, i.e. the tuples that are left in the old bucket during a split.  Currently (in HEAD), as part of a split operation, we remove the tuples from the old bucket after moving them to the new bucket, as we hold heavy-weight locks on both the old and new buckets for the whole split operation.  In the new design, we need to take a cleanup lock on the old bucket and an exclusive lock on the new bucket to perform the split operation, and we don't retain those locks until the end (we release the lock as we move on to overflow pages).  Now, to clean up the tuples we need a cleanup lock on the bucket, which we might not have at the end of the split.  So I chose to perform the cleanup of garbage tuples during vacuum and when the bucket is split again, as we do hold a cleanup lock during both of those operations.  We can extend the cleanup of garbage to other operations as well if required.
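The delayed cleanup can be pictured with a small Python sketch.  This is an illustration under simplifying assumptions, not the patch's code: a bucket is modelled as a list of hash keys, and remove_split_garbage, bucket_of and have_cleanup_lock are hypothetical names.

```python
from typing import Callable, List


def remove_split_garbage(bucket_no: int,
                         hash_keys: List[int],
                         bucket_of: Callable[[int], int],
                         have_cleanup_lock: bool) -> List[int]:
    """Return the hash keys that remain in bucket `bucket_no`.

    Garbage here means entries that now map to a different bucket
    because an earlier split copied them there.
    """
    if not have_cleanup_lock:
        # Without a cleanup lock nothing may be removed, so the
        # garbage stays until a later vacuum or split.
        return list(hash_keys)
    # With the cleanup lock held, keep only entries that still map here.
    return [h for h in hash_keys if bucket_of(h) == bucket_no]
```

For example, after bucket 1 has been split into buckets 1 and 3 (mask 3), keys 3 and 7 in the old bucket are garbage, while 1, 5 and 9 remain.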

I have done some performance tests with this new version of the patch, and the results are along the same lines as in my previous e-mail.  I have done some functional testing of the patch as well.  I think more detailed testing is required; however, it is better to do that once the design has been discussed and agreed upon.

I have improved the code comments to make the new design clear, but one can still have questions about the locking decisions I have taken in the patch.  I think one of the important things to verify in the patch is the locking strategy used for the different operations.  I have changed the heavy-weight locks to light-weight read and write locks, plus a cleanup lock for the vacuum and split operations.
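As a rough aid for reviewing the locking strategy, the three lock modes can be modelled as below.  This is a single-threaded toy in Python, assuming that a cleanup lock means an exclusive lock taken while the caller holds the only pin; the class BucketLock and its methods are hypothetical and are not PostgreSQL APIs.

```python
class BucketLock:
    """Toy model of a bucket page's pin count and content lock."""

    def __init__(self) -> None:
        self.pins = 0        # number of backends holding a pin
        self.readers = 0     # holders of the read (shared) lock
        self.writer = False  # True while the write (exclusive) lock is held

    def pin(self) -> None:
        self.pins += 1

    def unpin(self) -> None:
        self.pins -= 1

    def try_shared(self) -> bool:
        """Read lock, as a scan would take."""
        if self.writer:
            return False
        self.readers += 1
        return True

    def release_shared(self) -> None:
        self.readers -= 1

    def try_exclusive(self) -> bool:
        """Write lock, as an insert would take."""
        if self.writer or self.readers:
            return False
        self.writer = True
        return True

    def release_exclusive(self) -> None:
        self.writer = False

    def try_cleanup(self) -> bool:
        """Cleanup lock: exclusive lock, and the caller's pin is the only pin."""
        if self.pins != 1:
            return False
        return self.try_exclusive()
```

In this model a scan that has released its read lock but kept its pin still blocks try_cleanup, while try_exclusive succeeds, which matches the case where an insert proceeds without completing a split.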



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
