Re: REINDEX : new parameter to preserve current average leaf densityas new implicit FILLFACTOR

Поиск
Список
Период
Сортировка
От John Lumby
Тема Re: REINDEX : new parameter to preserve current average leaf densityas new implicit FILLFACTOR
Дата
Msg-id DM6PR06MB55621884223DF97D296E2E4DA3E20@DM6PR06MB5562.namprd06.prod.outlook.com
обсуждение исходный текст
Ответ на Re: REINDEX : new parameter to preserve current average leaf densityas new implicit FILLFACTOR  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: REINDEX : new parameter to preserve current average leaf densityas new implicit FILLFACTOR  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-general
Thanks Tom and Peter for those thoughts.

>
> I think you've confused REINDEX with VACUUM.  It seems like a pretty poor
> substitute for that --- it's much more expensive and has worse locking
> requirements.
>

I think the answer is mainly "I wish it were so" ,   but in practice,   even with a reasonably aggressive autovacuum
configurationrunning,  eventually the number of pages and number of dead keys builds up too much.    The assumption
(whichis the case in our particular workload) is that,  eventually,  unavailable-service maintenance operation must be
done, and REINDEX does (today,  pg-11.x) play a useful role because it addresses various aspects.   And then,  given we
findit useful to REINDEX,   we would prefer to avoid the sawtooth / "wave of misery" syndrome. 

>
> There is a very recent research paper that discusses the idea of
> varying fillfactor with a view to ameliorating page splits:
>

Thanks,  I am chewing my way through that.  As you say,  it does address exactly the issues I raised.
Do you happen to know if their source-code is available somewhere?
( I did see their db is MS SQL Server but it still might provide some portable ideas. )

>
> I suspect that you might find that the enhancements to B-Tree indexes
> that went into Postgres 12 would help with this workload, especially
> if you notice that this happens with indexes that have a lot of duplicates
>

I had not noticed that,   thanks for pointing it out.  Yes ,  in my workload most of the indexes in question are
non-uniqueand some have very low key card.    I will try out the pg-12 beta when I get a chance. 

>
> For the full background, you might take a look at my pgCon talk:
> https://youtu.be/p5RaATILoiE
>

Is there a pdf or text version?

>
> Does it seem at all possible that you were affected by either the issue with duplicates,
>

definitely

>
> or the issue that is addressed by the "split after new tuple" optimization?
>

don't know,    yet more research needed.    is there a module or contrib which would tell me?

Thanks again      John Lumby





From: Peter Geoghegan <pg@bowt.ie>

Sent: June 25, 2019 6:12 PM

To: John Lumby

Cc: pgsql general

Subject: Re: REINDEX : new parameter to preserve current average leaf density as new implicit FILLFACTOR

 


On Tue, Jun 25, 2019 at 2:45 PM John Lumby <johnlumby@hotmail.com> wrote:

> Background :   For some workloads involving high volume of INSERT/UPDATE/DELETE,  It is sometimes beneficial

> to schedule regular REINDEX of high-activity indexes,   so as to improve performance,  or restore performance levels
backto what it was earlier,   by removing dead keys etc.      This can result in the average page density of these
indexesfluctuating up 
 and down in a saw-tooth fashion,  REINDEX causing large increase in density (large drop in total number of pages) and
theworkload gradually  decreasing density back to some "steady-state". 



I suspect that you might find that the enhancements to B-Tree indexes

that went into Postgres 12 would help with this workload, especially

if you notice that this happens with indexes that have a lot of

duplicates. For the full background, you might take a look at my pgCon

talk:



https://youtu.be/p5RaATILoiE



Fair warning: this is a very technical talk.



Does it seem at all possible that you were affected by either the

issue with duplicates, or the issue that is addressed by the "split

after new tuple" optimization? They're both causes of index bloat that

VACUUM cannot usually prevent.



--

Peter Geoghegan




В списке pgsql-general по дате отправления:

Предыдущее
От: Hitesh Chadda
Дата:
Сообщение: migrating from Oracle to PostgreSQL 11
Следующее
От: Jesper Pedersen
Дата:
Сообщение: pg_receivelwal vs synchronous