Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information)

From: Alvaro Herrera
Subject: Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information)
Msg-id: 20140122135150.GM10723@eldon.alvh.no-ip.org
In response to: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List: pgsql-hackers

Heikki Linnakangas wrote:

> I wrote a little utility that scans all pages in a gin index, and
> prints out the flags indicating what kind of a page it is. The
> distribution looks like this:
> 
>      19 DATA
>    7420 DATA LEAF
>   24701 DELETED
>       1 LEAF
>       1 META

Hah.

> I think we need to add the deleted pages to the FSM more aggressively.
>
> I tried simply adding calls to RecordFreeIndexPage, after the list
> pages have been marked as deleted, but unfortunately that didn't
> help. The problem is that the FSM is organized into a three-level
> tree, and RecordFreeIndexPage only updates the bottom level.

Interesting.  I think the idea of having an option for RecordFreeIndexPage
to update upper levels makes sense (no need to force it for other
users.)
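
To illustrate why updating only the bottom level is not enough, here is a
toy sketch of a three-level max-tree. It is not PostgreSQL's FSM code or
on-disk layout: the class ToyFSM, its methods, the fanout of 4, and the
"propagate" flag are all hypothetical, and it assumes searches start at
the root and descend only into slots that advertise enough free space.

```python
class ToyFSM:
    """Toy three-level max-tree; NOT PostgreSQL's actual FSM layout."""

    def __init__(self, fanout=4, levels=3):
        self.fanout = fanout
        # levels[0] is the bottom (one slot per page); levels[-1] is the root.
        self.levels = [[0] * fanout ** (levels - 1 - i) for i in range(levels)]

    def record_free(self, page, avail, propagate=False):
        """Record free space for a page; optionally bubble the max upward."""
        self.levels[0][page] = avail
        if not propagate:
            return  # bottom level only: upper levels stay stale
        idx = page
        for lvl in range(1, len(self.levels)):
            idx //= self.fanout
            start = idx * self.fanout
            children = self.levels[lvl - 1][start:start + self.fanout]
            self.levels[lvl][idx] = max(children)

    def search(self, min_avail):
        """Descend from the root; return a page number or None."""
        if self.levels[-1][0] < min_avail:
            return None  # the root claims no page has enough space
        idx = 0
        for lvl in range(len(self.levels) - 2, -1, -1):
            start = idx * self.fanout
            for child in range(start, start + self.fanout):
                if self.levels[lvl][child] >= min_avail:
                    idx = child
                    break
            else:
                return None  # stale upper level; give up in this toy model
        return idx


fsm = ToyFSM()
fsm.record_free(5, 1)                  # bottom level only
print(fsm.search(1))                   # page 5 is not found
fsm.record_free(5, 1, propagate=True)  # also update the upper levels
print(fsm.search(1))                   # page 5 is now found
```

In this model the freed page stays invisible to a root-first search until
something refreshes the upper levels, which is the effect an optional
"update upper levels" mode would aim to avoid.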

Some time ago I proposed an index-only cleanup for vacuum.  That would
help GIN get this kind of treatment (vacuuming its FSM and processing
the pending list) separately from vacuuming the index.  It's probably
too late for 9.4 though.

One other thing worth considering in this area is that making the
pending list size depend on work_mem appears to have been a really bad
idea.  I know one case where the server is really large and seems to run
mostly OLAP type stuff with occasional updates, so they globally set
work_mem=2GB; they have GIN indexes for text search, and the result is
horrible performance 90% of the time, then a vacuum cleans the pending
list and it is blazing fast until the pending list starts getting big
again.  Now you can argue that setting work_mem to that value is a bad
idea, but as it turns out, in this case other than the GIN pending list
it seems to work fine.
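
For a sense of scale, here is a back-of-the-envelope sketch of how many
pages such a pending list could hold. It assumes an 8 kB block size, that
the list is allowed to grow all the way to work_mem before it is flushed,
and it ignores per-page overhead; the function name pending_list_pages is
hypothetical.

```python
def pending_list_pages(work_mem_kb, block_size=8192):
    """Rough upper bound on pages in a pending list capped at work_mem.

    Assumes the cap equals work_mem and ignores page headers.
    """
    return work_mem_kb * 1024 // block_size


print(pending_list_pages(2 * 1024 * 1024))  # work_mem = 2GB
print(pending_list_pages(1024))             # work_mem = 1MB, for comparison
```

Under these assumptions a 2GB setting allows roughly 262144 pending pages,
against 128 pages at 1MB, and a search that has to look through the
pending entries pays for every one of them until the list is cleaned.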

Not related to the patch at hand, but I thought I would put it out for
consideration, 'cause I'm not gonna start a new thread about it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


