Re: HEAD seems to generate larger WAL regarding GIN index

From Jesper Krogh
Subject Re: HEAD seems to generate larger WAL regarding GIN index
Date
Msg-id 532B4B93.6060408@krogh.cc
In response to Re: HEAD seems to generate larger WAL regarding GIN index  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: HEAD seems to generate larger WAL regarding GIN index  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 15/03/14 20:27, Heikki Linnakangas wrote:
> That said, I didn't expect the difference to be quite that big when 
> you're appending to the end of the table. When the new entries go to 
> the end of the posting lists, you only need to recompress and WAL-log 
> the last posting list, which is max 256 bytes long. But I guess that's 
> still a lot more WAL than in the old format.
>
> That could be optimized, but I figured we can live with it, thanks to 
> the fastupdate feature. Fastupdate allows amortizing that cost over 
> several insertions. But of course, you explicitly disabled that...

In a concurrent update environment, fastupdate as it is in 9.2 is not 
really useful. You may be able to batch up insertions, but you have no 
control over who ends up paying the debt. Doubling the amount of WAL 
from GIN indexing would be pretty tough for us. In 9.2 we generate 
roughly 1 TB of WAL per day and keep it for some weeks to be able to 
do PITR. The WAL is mainly due to GIN index updates as new data is 
added and needs to be searchable by users. We do run gzip, which cuts 
it down to 25-30%, before keeping it for too long, but doubling this 
is going to be a migration challenge.
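
A rough sketch of the storage arithmetic behind these figures. The 1 TB/day
volume and the 25-30% gzip ratio come from the paragraph above; the 21-day
retention is purely an assumption for illustration (the message only says
"some weeks"), and the function name is hypothetical.

```python
def archived_wal_tb(daily_wal_tb, retention_days, gzip_ratio):
    """Compressed WAL kept for PITR, in TB.

    gzip_ratio is the fraction of the original size left after gzip
    (0.25 means the archive shrinks to 25% of the raw WAL).
    """
    return daily_wal_tb * retention_days * gzip_ratio


RETENTION_DAYS = 21   # assumption: "some weeks" is not quantified in the message
DAILY_WAL_TB = 1.0    # from the message: roughly 1 TB of WAL per day on 9.2

for ratio in (0.25, 0.30):
    now = archived_wal_tb(DAILY_WAL_TB, RETENTION_DAYS, ratio)
    doubled = archived_wal_tb(2 * DAILY_WAL_TB, RETENTION_DAYS, ratio)
    print(f"gzip to {ratio:.0%}: {now:.2f} TB now, {doubled:.2f} TB if WAL doubles")
```

Whatever the real retention period is, the archive size scales linearly with
the daily WAL volume, so doubling the WAL doubles the space needed for PITR.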

If fastupdate could be made to work in an environment where we have 
both users searching the index and manually updating it, and 4+ backend 
processes updating the index concurrently, then it would be a good 
benefit to gain.

The GIN index currently contains 70+ million records with an average 
tsvector of 124 terms.

-- 
Jesper .. trying to add some real-world info.



> - Heikki
>
>



