Re: index bloat estimation

Поиск
Список
Период
Сортировка
От Keith Fiske
Тема Re: index bloat estimation
Дата
Msg-id CAODZiv5uVWASjuigUxJhmxmrQAdMoOxRu5vz97CkbuqH2-oxow@mail.gmail.com
обсуждение исходный текст
Ответ на Re: index bloat estimation  (Victor Sudakov <vas@sibptus.ru>)
Ответы Re: index bloat estimation  (Victor Sudakov <vas@sibptus.ru>)
Список pgsql-admin


On Sun, Feb 14, 2021 at 11:43 PM Victor Sudakov <vas@sibptus.ru> wrote:
Keith Fiske wrote:
> >
> > What queries do you use to estimate index and table bloat?

[dd]

> Why estimate when you can get the exact amount? At least for b-tree indexes
> anyway.
>
> https://github.com/keithf4/pg_bloat_check

Hello Keith,

Thanks for this script and for reminding about pgstattuple.

The script has the drawback of requiring r/w access to the database, so
it cannot be run on a replica. But the idea is excellent. At least it
can show the top N bloated relations, and then I can further explore with
pgstattuple(), pgstatindex() etc.


--
Victor Sudakov,  VAS4-RIPE, VAS47-RIPN
2:5005/49@fidonet http://vas.tomsk.ru/


You cannot run the actual scan on the replica, no. But it can be set to run against the replica and just report the statistics so you can have a cron set up to always run on the given system in case of failover. The "--recovery_mode_norun" can be set so it will only run if the target system is actually a primary.

I could possibly see about letting this actually run against the replica, however this can be a rather long running transaction depending on the size of the tables involved. You can set the "--commit_rate" to avoid some of that, but if you have really large tables, it can still run quite long. So this sort of check is really best run against the primary to avoid issues around having to allow long running queries on the replica (delayed replication or even worse bloat buildup).

However, once you get bloat under control, you likely shouldn't need to be running this often, especially against the entire database. If a few tables turn out to be problematic, you can make schedules just for them. And otherwise run an entire database scan at most maybe once a month during off-peak hours.


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

В списке pgsql-admin по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: pg_baseback could not connect in AWS linux 2
Следующее
От: Victor Sudakov
Дата:
Сообщение: Re: index bloat estimation