Proposal for background vacuum full/cluster

Поиск
Список
Период
Сортировка
От Jim C. Nasby
Тема Proposal for background vacuum full/cluster
Дата
Msg-id 20050420220556.GU58835@decibel.org
обсуждение исходный текст
Ответы Re: Proposal for background vacuum full/cluster  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Proposal for background vacuum full/cluster  (Paul Tillotson <pntil@shentel.net>)
Список pgsql-hackers
I talked to a few people on IRC about this and they didn't think I was
nuts, so maybe this is something practical...

In a nutshell, my idea is to use the normal transactional/XID code to
relocate tuples in the heap. Think of doing an UPDATE field=field if you
could tell update what page to put the new tuple on. Using this
mechanism, you can move tuples from the end of the heap to pages that
have free space on them. The dead tuples at the end of the heap could
then be vacuumed conventionally, and completely empty pages removed by
that vacuum.

Of course, it's not quite that simple. For starters, you'd want to do a
conventional vacuum before this, both to free as much space as possible
and to update the FSM. It might also be necessary to prevent backends
from using the pages at the end of the heap (which you're trying to
empty). I'm guessing that could be done just by removing the pages from
the FSM. You'd also need to vacuum after emptying these pages to reclaim
the disk space. To facilitate these things, it might be useful to be
able to vacuum parts of the heap. So as pages are emptied at the end of
the heap, they can be vacuumed and reclaimed while the pages are still
probably in cache (and without requiring a re-vacuum of the entire
table).

Taking this technique one step further, it should also be possible to
cluster in the background without blocking everything. One way to do
this would be to empty the first page in the heap by moving it's tuples
elsewhere, and vacuuming that page (but not putting it in the FSM). Once
that page is available, you can start reading in from the clustering
index and moving those tuples to the first page.

One thing that might be an issue for both ideas is index bloat. But
since reindex is a non-blocking operation, it doesn't seem unreasonable
to either do that automatically or have the user do it.

Is this TODOable?
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Postgres: pg_hba.conf, md5, pg_shadow, encrypted passwords
Следующее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: Postgres: pg_hba.conf, md5, pg_shadow, encrypted passwords