Re: shared-memory based stats collector - v70

From Greg Stark
Subject Re: shared-memory based stats collector - v70
Date
Msg-id CAM-w4HNqCR7X9q5_wbJi4wLfTT1gLvy5VRS+ZGo6z8-DbXkHpw@mail.gmail.com
In reply to Re: shared-memory based stats collector - v70  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Responses Re: shared-memory based stats collector - v70  (Andres Freund <andres@anarazel.de>)
Re: shared-memory based stats collector - v70  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
List pgsql-hackers
On Tue, 9 Aug 2022 at 06:19, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>
>
> What do you think about adding a function in core PG to provide such
> functionality? (means being able to retrieve all the stats (+ eventually
> add some filtering) without the need to connect to each database).

I'm working on it myself too. I'll post a patch for discussion in a bit.

I was aiming more at a C function that extensions could use directly
rather than an SQL function -- though I suppose that with the former in
place it would be simple enough to implement the latter on top of it
(though there would have to be one for each stat type, I guess).

The reason I want a C function is I'm trying to get as far as I can
without a connection to a database, without a transaction, without
accessing the catalog, and as much as possible without taking locks. I
think this is important for making monitoring highly reliable and low
impact on production. It's also kind of fundamental to accessing stats
for objects from other databases since we won't have easy access to
the catalogs for the other databases.

The main problem with my current code is that I'm accessing the shared
memory hash table directly. This means I'm possibly introducing lock
contention on the shared memory hash table. I'm thinking of
separating the shared memory hash scan from the metric scan so that the
list of objects can be built quickly, minimizing the time the lock is
held. We could possibly also rebuild that list at a lower frequency
than the metrics gathering, at the cost that new objects might not show
up instantly.
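
A minimal sketch of that two-phase split, in standalone C rather than
against the real PostgreSQL stats code. Every name here (StatTable,
EntryList, rebuild_entry_list, scan_metrics) is hypothetical, a pthread
mutex stands in for the hash table lock, and a fixed-size array stands in
for the shared hash table. It only illustrates holding the lock for the
list build and not for the metric reads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8            /* hypothetical fixed-size table */

/* Hypothetical stand-in for one shared stats entry. */
typedef struct StatEntry
{
    int              in_use;    /* protected by StatTable.lock */
    uint32_t         objoid;    /* protected by StatTable.lock */
    _Atomic uint64_t counter;   /* the metric; readable without the table lock */
} StatEntry;

typedef struct StatTable
{
    pthread_mutex_t lock;       /* stands in for the shared hash table lock */
    StatEntry       slots[TABLE_SIZE];
} StatTable;

/* List of live slots, rebuilt occasionally and reused across metric scans. */
typedef struct EntryList
{
    size_t   n;
    size_t   slot[TABLE_SIZE];
    uint32_t objoid[TABLE_SIZE];
} EntryList;

static void
stat_table_init(StatTable *t)
{
    pthread_mutex_init(&t->lock, NULL);
    for (size_t i = 0; i < TABLE_SIZE; i++)
    {
        t->slots[i].in_use = 0;
        t->slots[i].objoid = 0;
        atomic_init(&t->slots[i].counter, 0);
    }
}

/* Register an object in the first free slot; returns its index, or -1 if full. */
static int
stat_table_add(StatTable *t, uint32_t objoid)
{
    int found = -1;

    pthread_mutex_lock(&t->lock);
    for (int i = 0; i < TABLE_SIZE; i++)
    {
        if (!t->slots[i].in_use)
        {
            t->slots[i].in_use = 1;
            t->slots[i].objoid = objoid;
            found = i;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return found;
}

/* Phase 1: hold the table lock only long enough to record which slots are live. */
static void
rebuild_entry_list(StatTable *t, EntryList *list)
{
    list->n = 0;
    pthread_mutex_lock(&t->lock);
    for (size_t i = 0; i < TABLE_SIZE; i++)
    {
        if (t->slots[i].in_use)
        {
            list->slot[list->n] = i;
            list->objoid[list->n] = t->slots[i].objoid;
            list->n++;
        }
    }
    pthread_mutex_unlock(&t->lock);
}

/* Phase 2: read the metrics for the listed entries without the table lock. */
static size_t
scan_metrics(StatTable *t, const EntryList *list, uint64_t *out)
{
    assert(list->n <= TABLE_SIZE);
    for (size_t i = 0; i < list->n; i++)
        out[i] = atomic_load(&t->slots[list->slot[i]].counter);
    return list->n;
}
```

The sketch ignores entries that are dropped between a list rebuild and a
later metric scan; a real implementation would need some way to detect
that, such as reference counts or a per-entry generation number.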

I have a few things I would like to suggest for future improvements to
this infrastructure. I haven't polished the details of it yet but the
main thing I think I'm missing is the catalog name for the object. I
don't want to have to fetch it from the catalog and in any case I
think it would generally be useful and might regularize the
replication slot handling too.

I also think it would be nice to have a change counter for every stat
object, or perhaps a change time. Prometheus wouldn't be able to make
use of it but other monitoring software might be able to receive only
metrics that have changed since the last update which would really
help on databases with large numbers of mostly static objects. Even on
typical databases there are tons of builtin objects (especially
functions) that are probably never getting updates.
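
A minimal sketch of the change-counter idea, again in standalone C with
hypothetical names (StatObject, stat_object_update, collect_changed) that
are not part of PostgreSQL. It assumes each stat object carries a counter
that is bumped on every update and that the consumer remembers the counter
it last saw for each object, so a poll can skip everything that has not
changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stat object carrying a per-object change counter. */
typedef struct StatObject
{
    uint32_t objoid;
    uint64_t calls;          /* example metric */
    uint64_t change_count;   /* bumped on every update */
} StatObject;

/* Apply an update to the metric and mark the object as changed. */
static void
stat_object_update(StatObject *obj, uint64_t delta)
{
    obj->calls += delta;
    obj->change_count++;
}

/*
 * Store in `changed` the indexes of objects whose change counter differs
 * from the value the consumer saw last time, and remember the new values
 * in `last_seen`. Returns the number of changed objects.
 */
static size_t
collect_changed(const StatObject *objs, size_t n,
                uint64_t *last_seen, size_t *changed)
{
    size_t m = 0;

    assert(objs != NULL && last_seen != NULL && changed != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (objs[i].change_count != last_seen[i])
        {
            last_seen[i] = objs[i].change_count;
            changed[m++] = i;
        }
    }
    return m;
}
```

With mostly static objects, each poll still walks every object but only
has to export the few whose counters moved. A change time instead of a
counter would let the consumer keep a single timestamp rather than one
remembered value per object.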

-- 
greg


