Re: New vacuum option to do only freezing

Поиск
Список
Период
Сортировка
От Bossart, Nathan
Тема Re: New vacuum option to do only freezing
Дата
Msg-id 42826130-91C1-4586-A086-4C33EE9DCF07@amazon.com
обсуждение исходный текст
Ответ на Re: New vacuum option to do only freezing  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On 11/5/18, 2:07 AM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> On Fri, Nov 2, 2018 at 1:32 AM Bossart, Nathan <bossartn@amazon.com> wrote:
>> 1. Create a separate FREEZE command instead of adding a new VACUUM
>>    option
>>
>> The first line of the VACUUM documentation reads, "VACUUM reclaims
>> storage occupied by dead tuples," which is something that we would
>> explicitly not be doing with FREEZE_ONLY.
>
> No. Actually FREEZE_ONLY option (maybe will be changed its name) could
> reclaim dead tuples by HOT-purning. If a page have HOT-updated chains
> the FREEZE_ONLY prunes them and reclaim disk space occupied.

I see.

>>  I think it makes sense to
>> reuse many of the VACUUM code paths to implement this feature, but
>> from a user perspective, it should be separate.
>
> I'm concernced that since the existing users already have recognized
> that vacuuming and freezing are closely related they would get
> confused more if we have a similar purpose feature with different
> name.

That seems reasonable to me.  Perhaps decoupling this option from
FREEZE would make it clearer to users and easier to name.  This would
allow users to do both VACUUM (WITHOUT_INDEX_CLEANUP) and VACUUM
(FREEZE, WITHOUT_INDEX_CLEANUP).

>> 2. We should reclaim transaction IDs from dead tuples as well
>>
>> Unless we also have a way to freeze XMAX like we do XMIN, I doubt this
>> feature will be useful for the imminent-XID-wraparound use-case.  In
>> short, we won't be able to advance relfrozenxid and relminmxid beyond
>> the oldest XMAX value for the relation.
>>  IIUC the idea of freezing> XMAX doesn't really exist yet.  Either the XMAX is aborted/invalid and
>> can be reset to InvalidTransactionId, or it is committed and the tuple
>> can be removed if it beyond the freezing threshold.  So, we probably
>> also want to look into adding a way to freeze XMAX, either by setting
>> it to FrozenTransactionId or by setting the hint bits to
>> (HEAP_XMAX_COMMITTED | HEAP_XMIN_INVALID) as is done for XMIN.
>
> That's a good point. If the oldest xmax is close to the old
> relfrozenxid we will not be able to advance relfrozenxid enough.
> However, since dead tuples are vacuumed by autovacuum periodically I
> think that we can advance relfrozenxid enough in common case. There is
> possible that we eventually need to do vacuum with removing dead
> tuples after done FREEZE_ONLY but it would be a rare case. Thought?

Given that a stated goal of this patch is to help recover from near
wraparound, I think this is very important optimization.  It's true
that you might be able to advance relfrozenxid/relminmxid a bit in
some cases, but there are many others where you won't.  For example,
if I create a row, delete it, and insert many more rows, my table's
XID age would be stuck at the row deletion.  If I was in a situation
where this table was near wraparound and autovacuum wasn't keeping up,
I would run VACUUM (WITHOUT_INDEX_CLEANUP, FREEZE) with the intent of
reclaiming transaction IDs as fast as possible.

>> I'd be curious to see the improvements you get when there are several
>> indexes on the relation.  The ability to skip the index scans is
>> likely how this feature will really help speed things up.
>>
>
> I've tested performance of FREEZE option and FREEZE_ONLY option using
> a 3GB table having 3 indexes. Before do vacuum I modified 1 % of data
> on the table.
>
> * FREEZE
> Time: 78677.211 ms (01:18.677)
> Time: 86958.452 ms (01:26.958)
> Time: 78351.190 ms (01:18.351)
>
> * FREEZE_ONLY
> Time: 19913.863 ms (00:19.914)
> Time: 18917.379 ms (00:18.917)
> Time: 20048.541 ms (00:20.049)

Nice.

Nathan


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: pread() and pwrite()
Следующее
От: Jim Finnerty
Дата:
Сообщение: Re: Hash Joins vs. Bloom Filters / take 2