Re: 10.1: hash index size exploding on vacuum full analyze

From: Amit Kapila
Subject: Re: 10.1: hash index size exploding on vacuum full analyze
Msg-id: CAA4eK1K2ynh=_gDHYydVXHvG+Lk=xyY-Pb9n86QjbtY768jm-A@mail.gmail.com
In response to: Re: 10.1: hash index size exploding on vacuum full analyze  (AP <pgsql@inml.weebeastie.net>)
Responses: Re: 10.1: hash index size exploding on vacuum full analyze  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List: pgsql-bugs

On Thu, Nov 16, 2017 at 10:00 AM, AP <pgsql@inml.weebeastie.net> wrote:
> On Thu, Nov 16, 2017 at 09:48:13AM +0530, Amit Kapila wrote:
>> On Thu, Nov 16, 2017 at 4:59 AM, AP <pgsql@inml.weebeastie.net> wrote:
>> > I've some tables that'll never grow so I decided to replace a big index
>> > with one with a fillfactor of 100. That went well. The index shrunk to
>> > 280GB. I then did a vacuum full analyze on the table to get rid of any
>> > cruft (as the table will be static for a long time and then only deletes
>> > will happen) and the index exploded to 701GB. When it was created with
>> > fillfactor 90 (organically by filling the table) the index was 309GB.
>>
>> Sounds quite strange.  I think during vacuum it leads to more number
>> of splits than when the original data was loaded.  By any chance do
>> you have a copy of both the indexes (before vacuum full and after
>> vacuum full)?  Can you once check and share the output of
>> pgstattuple-->pgstathashindex() and pageinspect->hash_metapage_info()?
>>  I wanted to confirm if the bloat is due to additional splits.
>
> I'll see what I can do. Currently vacuuming the table without the index
> so that I can then do a create index concurrently and get back my 280GB
> index (it's how I got it in the first place). Namely:
>

One possible theory is that the calculation of the initial number of
buckets for the index has overestimated how many are needed.  I think
this is possible because we choose the initial number of buckets based
on the number of tuples, but while inserting the values we might
actually have created more overflow buckets rather than using the
newly created primary buckets.  Such a misestimation is more likely
when there are duplicate values.  Now, if that is true, then you
should see the same index size (as you have seen after vacuum full)
when you create an index on the table with the same values in the
index columns.
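
To illustrate the theory above, here is a toy Python model.  It is only
a sketch and not PostgreSQL's actual bucket calculation: the names
initial_buckets and page_usage, the power-of-two rounding, the modulo
"hash function" and the tuples_per_page value are all assumptions made
for illustration.  It sizes the primary buckets from the tuple count
alone and then counts how many stay empty and how many overflow pages
are needed once the keys are actually inserted.

```python
import math
from collections import Counter


def initial_buckets(num_tuples, tuples_per_page):
    """Toy estimate (assumption): enough primary buckets to hold every
    tuple, rounded up to a power of two, with a minimum of 2."""
    needed = max(2, math.ceil(num_tuples / tuples_per_page))
    return 1 << (needed - 1).bit_length()


def page_usage(keys, tuples_per_page):
    """Count primary, empty and overflow pages for integer keys."""
    nbuckets = initial_buckets(len(keys), tuples_per_page)
    # Stand-in hash function (assumption): key modulo bucket count.
    per_bucket = Counter(key % nbuckets for key in keys)
    # Each bucket owns one primary page; tuples beyond its capacity
    # spill into overflow pages chained to that bucket.
    overflow = sum(
        max(0, math.ceil(n / tuples_per_page) - 1)
        for n in per_bucket.values()
    )
    return {
        "primary": nbuckets,
        "empty_primary": nbuckets - len(per_bucket),
        "overflow": overflow,
        "total": nbuckets + overflow,
    }


if __name__ == "__main__":
    print("unique keys:   ", page_usage(list(range(1000)), 10))
    print("one duplicate: ", page_usage([7] * 1000, 10))
```

In this model, 1000 unique keys fit in the pre-sized primary buckets,
while 1000 copies of one key leave almost every primary bucket empty
and still need a chain of overflow pages, so the total page count is
larger for the same number of tuples.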

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

