Re: [HACKERS] GUC for cleanup indexes threshold.

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: [HACKERS] GUC for cleanup indexes threshold.
Дата
Msg-id CAPpHfdtzCHxBbdAU+K2=ujOWBEJG4mEqRq6QnP3iVTES82sFzQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] GUC for cleanup indexes threshold.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: [HACKERS] GUC for cleanup indexes threshold.
Список pgsql-hackers
On Mon, Mar 19, 2018 at 5:12 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
>> <a.korotkov@postgrespro.ru> wrote:
>> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada <sawada.mshk@gmail.com>
>> > wrote:
>> >>
>> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
>> >> <a.korotkov@postgrespro.ru> wrote:
>> >> > 2) These parameters are reset during btbulkdelete() and set during
>> >> > btvacuumcleanup().
>> >>
>> >> Can't we set these parameters even during btbulkdelete()? By keeping
>> >> them up to date, we will able to avoid an unnecessary cleanup vacuums
>> >> even after index bulk-delete.
>> >
>> >
>> > We certainly can update cleanup-related parameters during
>> > btbulkdelete().
>> > However, in this case we would update B-tree meta-page during each
>> > VACUUM cycle.  That may cause some overhead for non append-only
>> > workloads.  I don't think this overhead would be sensible, because in
>> > non append-only scenarios VACUUM typically writes much more of
>> > information.
>> > But I would like this oriented to append-only workload patch to be
>> > as harmless as possible for other workloads.
>>
>> What overhead are you referring here? I guess the overhead is only the
>> calculating the oldest btpo.xact. And I think it would be harmless.
>
>
> I meant overhead of setting last_cleanup_num_heap_tuples after every
> btbulkdelete with wal-logging of meta-page.  I bet it also would be
> harmless, but I think that needs some testing.

Agreed.

After more thought, it might be too late but we can consider the
possibility of another idea proposed by Peter. Attached patch
addresses the original issue of index cleanups by storing the epoch
number of page deletion XID into PageHeader->pd_prune_xid which is
4byte field. Comparing to the current proposed patch this patch
doesn't need neither the page upgrade code nor extra WAL-logging. If
we also want to address cases other than append-only case we will
require the bulk-delete method of scanning whole index and of logging
WAL. But it leads some extra overhead. With this patch we no longer
need to depend on the full scan on b-tree index. This might be useful
for a future when we make the bulk-delete of b-tree index not scan
whole index.

Storing epoch in deleted pages is good by itself, because it makes us
free to select the moment of time to recycle those pages without risk
of wraparound.

However, I see that you are comparing relative change of num_heap_tuples
before and after vacuum to vacuum_cleanup_index_scale_factor.
The problem is that if vacuum is very aggressive (and that is likely for
append only case if this patch is committed), then num_heap_tuples
changes very slightly every vacuum cycle.  So, cleanup would never
happen and statistics could stall indefinitely.

Another issue is that append only workloads are frequently combined
with rare periodical bulk deletes of data.  Assuming that in this patch
you don't consider deleted pages to trigger index cleanup, on such
workload large amounts of deleted pages may reside non-recycled until
next bulk delete cycle.

So, despite I generally like idea of storing epoch of deleted xid in the
page, I think my version of patch is closer to committable shape.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavan Deolasee
Дата:
Сообщение: Re: [HACKERS] MERGE SQL Statement for PG11
Следующее
От: Alexander Korotkov
Дата:
Сообщение: Re: [HACKERS] GUC for cleanup indexes threshold.