Re: HEAD seems to generate larger WAL regarding GIN index

From: Robert Haas
Subject: Re: HEAD seems to generate larger WAL regarding GIN index
Date:
Msg-id CA+Tgmoa2Ny=0AeJe5YRuZ5PSKna0qJSDx3_BE41Q65EYypsiJQ@mail.gmail.com
In response to: Re: HEAD seems to generate larger WAL regarding GIN index  (Jesper Krogh <jesper@krogh.cc>)
Responses: Re: HEAD seems to generate larger WAL regarding GIN index  (Fujii Masao <masao.fujii@gmail.com>)
List: pgsql-hackers
On Thu, Mar 20, 2014 at 1:12 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> On 15/03/14 20:27, Heikki Linnakangas wrote:
>> That said, I didn't expect the difference to be quite that big when you're
>> appending to the end of the table. When the new entries go to the end of the
>> posting lists, you only need to recompress and WAL-log the last posting
>> list, which is max 256 bytes long. But I guess that's still a lot more WAL
>> than in the old format.
>>
>> That could be optimized, but I figured we can live with it, thanks to the
>> fastupdate feature. Fastupdate allows amortizing that cost over several
>> insertions. But of course, you explicitly disabled that...
>
> In a concurrent update environment, fastupdate as it is in 9.2 is not really
> useful. It may let you batch up insertions, but you have no control over
> which backend ends up paying the debt. Doubling the amount of WAL from
> GIN indexing would be pretty tough for us: on 9.2 we generate roughly 1TB
> of WAL per day and keep it for some weeks to be able to do PITR. The WAL is
> mainly due to GIN index updates as new data is added and needs to be
> searchable by users. We do run gzip, which cuts it down to 25-30% before we
> keep it for long, but doubling this is going to be a migration challenge.
>
> If fastupdate could be made to work in an environment where we have both
> users searching and manually updating the index, and 4+ backend processes
> updating it concurrently, then it would be a real benefit.
>
> The GIN index currently contains 70+ million records with an average
> tsvector of 124 terms.
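(For reference, the "explicitly disabled" fastupdate mentioned above is a per-index storage parameter. A minimal sketch, assuming a hypothetical table `docs` with a tsvector column `tsv`:)

```sql
-- Hypothetical table and column names, for illustration only.
-- With fastupdate = off, each insert updates the GIN index structure
-- directly instead of buffering entries in the pending list; inserts
-- pay their own cost rather than an unlucky backend paying the
-- accumulated debt at flush time.
CREATE INDEX docs_tsv_idx ON docs USING gin (tsv)
    WITH (fastupdate = off);

-- It can be toggled back later:
ALTER INDEX docs_tsv_idx SET (fastupdate = on);
```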

Should we try to install some hack around fastupdate for 9.4?  I fear
the divergence between reasonable values of work_mem and reasonable
sizes for that list is only going to continue to get bigger.  I'm sure
there's somebody out there who has work_mem = 16GB, and stuff like
263865a48973767ce8ed7b7788059a38a24a9f37 is only going to increase the
appeal of large values.
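(The coupling Robert describes: in the behavior under discussion, the GIN pending list is allowed to grow until it exceeds work_mem before being flushed into the main index, so a very large work_mem implies a very large, expensive-to-flush list. A sketch, again assuming a hypothetical table `docs`:)

```sql
-- A work_mem of this size also becomes the effective ceiling for the
-- GIN pending list, making the eventual flush correspondingly costly.
SET work_mem = '16GB';

-- The pending list can also be flushed explicitly, outside the insert
-- path, by vacuuming the table:
VACUUM docs;  -- hypothetical table name
```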

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
