Re: [HACKERS] GUC for cleanup indexes threshold.

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] GUC for cleanup indexes threshold.
Дата
Msg-id CAD21AoBweeC4nkWi=F2tUcV_sFUSgsA5mkWZcCC+fRs0PY=uvw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] GUC for cleanup indexes threshold.  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] GUC for cleanup indexes threshold.  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Wed, Feb 22, 2017 at 12:01 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Feb 21, 2017 at 1:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 20 February 2017 at 10:27, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Mon, Feb 20, 2017 at 3:01 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>> On 20 February 2017 at 09:15, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>> On Mon, Feb 20, 2017 at 7:26 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>>> On Fri, Feb 17, 2017 at 3:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>>>>> On Thu, Feb 16, 2017 at 6:17 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>>>>>> On 15 February 2017 at 08:07, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>>>>>> It's a bug. Attached latest version patch, which passed make check.
>>>>
>>>>>>>> 2. The current btree vacuum code requires 2 vacuums to fully reuse
>>>>>>>> half-dead pages. So skipping an index vacuum might mean that second
>>>>>>>> index scan never happens at all, which would be bad.
>>>>>>>
>>>>>>> Maybe.  If there are a tiny number of those half-dead pages in a huge
>>>>>>> index, it probably doesn't matter.  Also, I don't think it would never
>>>>>>> happen, unless the table just never gets any more updates or deletes -
>>>>>>> but that case could also happen today.  It's just a matter of
>>>>>>> happening less frequently.
>>>>>>
>>>>>
>>>>> Yeah thats right and I am not sure if it is worth to perform a
>>>>> complete pass to reclaim dead/deleted pages unless we know someway
>>>>> that there are many such pages.
>>>>
>>>> Agreed.... which is why
>>>> On 16 February 2017 at 11:17, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>>> I suggest that we store the number of half-dead pages in the metapage
>>>>> after each VACUUM, so we can decide whether to skip the scan or not.
>>>>
>>>>
>>>>> Also, I think we do reclaim the
>>>>> complete page while allocating a new page in btree.
>>>>
>>>> That's not how it works according to the README at least.
>>>>
>>>
>>> I am referring to code (_bt_getbuf()->if (_bt_page_recyclable(page))),
>>> won't that help us in reclaiming the space?
>>
>> Not unless the README is incorrect, no.
>>
>
> Just to ensure that we both have the same understanding, let me try to
> write what I understand about this reclaim algorithm.  AFAIU, in the
> first pass vacuum will mark the half dead pages as Deleted and in the
> second pass, it will record such pages as free in FSM so that they can
> be reused as new pages when the indexam asked for a new block instead
> of extending the index relation.  Now, if we introduce this new GUC,
> then there are chances that sometimes we skip the second pass where it
> would not have been skipped.
>
> Note that we do perform the second pass in the same vacuum cycle when
> index has not been scanned for deleting the tuples as per below code:

The first pass uses cycle id given by _bt_start_vacuum, but the second
pass always uses cycle id 0.

> btvacuumcleanup()
> {
> ..
> if (stats == NULL)
> {
> stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
> btvacuumscan(info, stats, NULL, NULL, 0);
> ..
> }
>
> In above code stats won't be NULL, if the vacuum has scanned index for
> deleting tuples (btbulkdelete).  So, based on this I think it will
> skip scanning the index (or recycling pages marked as deleted) in the
> second vacuum only when there are no dead tuple removals in that
> vacuum.  Do we agree till here?

Agreed.

> I understand that there could be some delay in reclaiming dead pages
> but do you think it is such a big deal that we completely scan the
> index for such cases or even try to change the metapage format?

IIUC, I think that we need to have the number of half-dead pages in meta page.
Isn't it a problem that the freespace map of btree index is not
vacuumed if all vacuums skip the second pass?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: [HACKERS] proposal - psql: possibility to specify sort for describe commands,when size is printed
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] asynchronous execution