Re: New vacuum option to do only freezing

Поиск
Список
Период
Сортировка
От Bossart, Nathan
Тема Re: New vacuum option to do only freezing
Дата
Msg-id FF45D041-47F2-49A0-82BE-B2B53AE2E7B5@amazon.com
обсуждение исходный текст
Ответ на Re: New vacuum option to do only freezing  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: New vacuum option to do only freezing  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
Hi,

On 10/1/18, 5:23 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> Attached patch adds a new option FREEZE_ONLY to VACUUM command. This
> option is same as FREEZE option except for it disables reclaiming dead
> tuples. That is, with this option vacuum does pruning HOT chain,
> freezing live tuples and maintaining both visibility map and freespace
> map but does not collect dead tuples and invoke neither heap vacuum
> nor index vacuum. This option will be useful if user wants to prevent
> XID wraparound a table as quick as possible, especially when table is
> quite large and is about to XID wraparound. I think this usecase was
> mentioned in some threads but I couldn't find them.

I've thought about this a bit myself.  One of the reasons VACUUM can
take so long is because of all the index scans needed.  If you're in a
potential XID wraparound situation and just need a quick way out, it
would be nice to have a way to do the minimum amount of work necessary
to reclaim transaction IDs.  At a high level, I think there are some
improvements to this design we should consider.

1. Create a separate FREEZE command instead of adding a new VACUUM
   option

The first line of the VACUUM documentation reads, "VACUUM reclaims
storage occupied by dead tuples," which is something that we would
explicitly not be doing with FREEZE_ONLY.  I think it makes sense to
reuse many of the VACUUM code paths to implement this feature, but
from a user perspective, it should be separate.

2. We should reclaim transaction IDs from dead tuples as well

Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
feature will be useful for the imminent-XID-wraparound use-case.  In
short, we won't be able to advance relfrozenxid and relminmxid beyond
the oldest XMAX value for the relation.  IIUC the idea of freezing
XMAX doesn't really exist yet.  Either the XMAX is aborted/invalid and
can be reset to InvalidTransactionId, or it is committed and the tuple
can be removed if it beyond the freezing threshold.  So, we probably
also want to look into adding a way to freeze XMAX, either by setting
it to FrozenTransactionId or by setting the hint bits to
(HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.

Looking closer, I see that the phrase "freezing XMAX" is currently
used to refer to setting it to InvalidTransactionId if it is aborted
or invalid (e.g. lock-only).

> Currently this patch just adds the new option to VACUUM command but it
> might be good to make autovacuum use it when emergency vacuum is
> required.

This also seems like a valid use-case, but it should definitely be
done as a separate effort after this feature has been committed.

> This is a performance-test result for FREEZE option and FREEZE_ONLY
> option. I've tested them on the table which is about 3.8GB table
> without indexes and randomly modified.
>
> * FREEZE
> ...
> Time: 50301.262 ms (00:50.301)
>
> * FREEZE_ONLY
> ...
> Time: 44589.794 ms (00:44.590)

I'd be curious to see the improvements you get when there are several
indexes on the relation.  The ability to skip the index scans is
likely how this feature will really help speed things up.

Nathan


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nikolay Shaplov
Дата:
Сообщение: Re: [PATCH][PROPOSAL] Add enum releation option type
Следующее
От: Robert Haas
Дата:
Сообщение: Re: pg_promote not marked as parallel-restricted in pg_proc.dat