Re: Vacuuming big btree indexes without pages with deleted items

Поиск
Список
Период
Сортировка
От Jim Nasby
Тема Re: Vacuuming big btree indexes without pages with deleted items
Дата
Msg-id 5519CCF2.9080006@BlueTreble.com
обсуждение исходный текст
Ответ на Vacuuming big btree indexes without pages with deleted items  (Vladimir Borodin <root@simply.name>)
Ответы Re: Vacuuming big btree indexes without pages with deleted items  (Kevin Grittner <kgrittn@ymail.com>)
Список pgsql-hackers
On 3/27/15 5:15 AM, Vladimir Borodin wrote:
> Hi all.
>
> I have described [0] a problem with delaying replicas after vacuuming a
> relation with big btree index. It stucks in replaying WAL record of
> type XLOG_BTREE_VACUUM like that (with lastBlockVacuumed 0):
>
> rmgr: Btree       len (rec/tot):     20/    52, tx:          0, lsn:
> 4115/56126DC0, prev 4115/56126D90, bkp: 0000, desc: vacuum: rel
> 1663/16420/16796; blk 31222118, lastBlockVacuumed 0
>
> Master writes this record to xlog in btvacuumscan [1] function after
> vacuuming of all index pages. And in case of no pages with deleted items
> xlog record would contain lastBlockVacuumed 0.
>
> In btree_xlog_vacuum [2] replica reads all blocks from lastBlockVacuumed
> to last block of the index while applying this record because there is
> no api in the buffer manager to understand if the page is unpinned.
>
> So if the index is quite big (200+ GB in described case) it takes much
> time to do it. So the questions are:
>
> 1. Aren’t there still any api in buffer manager to understand that the
> page is not in shared_buffers without reading it?
I don't know offhand, but since XLogReadBufferExtended already has a 
mode argument it wouldn't be too hard to add it there.

> 2. Is it possible not to write to xlog record with lastBlockVacuumed 0
> in some cases? For example, in case of not deleting any pages.

Possibly, but that's much higher risk. Without studying it, if we wanted 
to mess around with that it might actually make more sense to XLOG a set 
of blkno's that got vacuumed, but I suspect that wouldn't be a win.

> Or maybe there are some better ways of improving this situation?
>
> [0]
> http://www.postgresql.org/message-id/FE82A9A7-0D52-41B5-A9ED-967F6927CB8A@simply.name
> [1]
>
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtree.c;hb=refs/heads/REL9_4_STABLE#l813
> [2]
>
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtxlog.c;hb=refs/heads/REL9_4_STABLE#l482
>
> --
> May the force be with you…
> https://simply.name
>


-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Ignoring entries generated by autoconf in code tree
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Exposing PG_VERSION_NUM in pg_config