[HACKERS] Skip all-visible pages during second HeapScan of CIC

From: Pavan Deolasee
Subject: [HACKERS] Skip all-visible pages during second HeapScan of CIC
Msg-id CABOikdO+=3=rK_Y=8o-xd5oPiNSPsoORYThJUCNE8kWm1pWOow@mail.gmail.com
Replies: Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] Skip all-visible pages during second HeapScan of CIC  (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
Hello All,

During the second heap scan of CREATE INDEX CONCURRENTLY (CIC), we're only interested in the tuples which were inserted after the first scan was started. All such tuples can only exist in pages which have their visibility map (VM) bit unset. So I propose the attached patch, which consults the VM during the second scan and skips all-visible pages. We do the same trick of skipping pages only if a certain threshold of pages can be skipped, to ensure the OS's read-ahead is not disturbed.
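
To make the skipping rule concrete, here is a toy model in Python. It is only a sketch of the idea and not the attached patch: the function name `blocks_to_scan`, the list-of-booleans stand-in for the VM, and the "run length of at least the threshold" comparison are all assumptions made for illustration. The default of 32 is borrowed from the `SKIP_PAGES_THRESHOLD` constant used by lazy VACUUM; whether the patch uses the same value is also an assumption.

```python
# Toy model of threshold-based skipping of all-visible pages.
# Hypothetical helper for illustration; not PostgreSQL code.

SKIP_PAGES_THRESHOLD = 32  # assumed: same value as lazy VACUUM's constant


def blocks_to_scan(all_visible, threshold=SKIP_PAGES_THRESHOLD):
    """Return the block numbers the second scan would still read.

    all_visible[i] is True when block i has its VM bit set.
    A run of consecutive all-visible blocks is skipped only when it
    is at least `threshold` blocks long; shorter runs are read anyway
    so that sequential read-ahead is not broken up by small gaps.
    """
    to_scan = []
    nblocks = len(all_visible)
    blkno = 0
    while blkno < nblocks:
        if not all_visible[blkno]:
            # VM bit unset: the page may hold newly inserted tuples.
            to_scan.append(blkno)
            blkno += 1
            continue

        # Measure the run of consecutive all-visible blocks.
        run_end = blkno
        while run_end < nblocks and all_visible[run_end]:
            run_end += 1

        if run_end - blkno < threshold:
            # Run too short to be worth skipping: read it anyway.
            to_scan.extend(range(blkno, run_end))
        blkno = run_end
    return to_scan


if __name__ == "__main__":
    # 40 all-visible blocks, 2 modified, 5 all-visible, 1 modified.
    vm = [True] * 40 + [False] * 2 + [True] * 5 + [False]
    print(blocks_to_scan(vm))
```

In this model the first 40 blocks are skipped as a single long run, while the short run of 5 all-visible blocks is read along with its modified neighbours.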

The patch shows a significant reduction in the time taken to build an index concurrently on very large tables which are not being updated frequently and which were vacuumed recently (so that VM bits are set). I can post performance numbers if there is interest. For tables that are being updated heavily, the threshold skipping was indeed useful; without it we saw a slight regression.

Since VM bits are only set during VACUUM, which conflicts with CIC on the relation lock, I don't see any risk of incorrectly skipping pages that the second scan should have scanned.

Comments?

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services