Re: Checksum errors in pg_stat_database

From: Julien Rouhaud
Subject: Re: Checksum errors in pg_stat_database
Msg-id: CAOBaU_aHnz=5-b8H2wgPiSmDSed6KQOVdHbfzPb6me1qDUTApA@mail.gmail.com
In reply to: Re: Checksum errors in pg_stat_database  (Julien Rouhaud <rjuju123@gmail.com>)
Responses: Re: Checksum errors in pg_stat_database
List: pgsql-hackers
On Sun, Mar 10, 2019 at 1:13 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Sat, Mar 9, 2019 at 7:58 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > On Sat, Mar 9, 2019 at 7:50 PM Magnus Hagander <magnus@hagander.net> wrote:
> > >
> > > On Sat, Mar 9, 2019 at 10:41 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >>
> > >> Sorry, I have again new comments after a little bit more thinking.
> > >> I'm wondering if we can do something about shared objects while we're
> > >> at it.  They don't belong to any database, so it's a little bit
> > >> orthogonal to this proposal, but it seems quite important to track
> > >> error on those too!
> > >>
> > >> What about adding a new field in PgStat_GlobalStats for that?  We can
> > >> use the same lastDir to easily detect such objects and slightly adapt
> > >> sendFile again, which seems quite straightforward.
> >
> > > Question is then what number that should show -- only the checksum
> > > counter in non-database-fields, or the total number across the cluster?
> >
> > I'd say only for non-database-fields errors, especially if we can
> > reset each counters separately.  If necessary, we can add a new view
> > to give a global overview of checksum errors for DBA convenience.
>
> I'm considering adding a new PgStat_ChecksumStats for that purpose
> instead, but I don't know if that's acceptable to do so in the last
> commitfest.  It seems worthwhile to add it eventually, since we'll
> probably end up having more things to report to users related to
> checksum.  Online enabling of checksum could be the most immediate
> potential target.

I wasn't aware that we were already storing information about shared
objects in PgStat_StatDBEntry, with InvalidOid as the databaseid
(though we don't have any system view that actually shows
information for such objects).

As a result, I ended up simply adding counters for the total number of
checks and the timestamp of the last failure in PgStat_StatDBEntry,
which makes the attached patch very lightweight.  I moved all the
checksum-related counters out of pg_stat_database into a new
pg_stat_checksum view.  This avoids making pg_stat_database too wide,
and also allows displaying information about shared objects in the new
view (some of the other counters don't really make sense for shared
objects, or could break existing monitoring queries).  While at it, I
tried to add a little bit of documentation on checksum monitoring.
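
To illustrate the bookkeeping described above, here is a minimal,
standalone sketch.  It is not code from the attached patch and does not
use the real pgstat structures: DemoChecksumEntry, DemoChecksumStats,
demo_get_entry, demo_report_check and DEMO_INVALID_OID are all
hypothetical names chosen for this example.  It only assumes what is
said above: one entry per database, plus one entry keyed by an invalid
database OID for shared objects, each carrying a total-checks counter, a
failure counter and the timestamp of the last failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for InvalidOid: the key used for shared objects. */
#define DEMO_INVALID_OID 0u
#define DEMO_MAX_DBS 8

/* Hypothetical per-database checksum counters (not the real PgStat_StatDBEntry). */
typedef struct DemoChecksumEntry
{
    uint32_t databaseid;        /* DEMO_INVALID_OID means "shared objects" */
    uint64_t checksum_checks;   /* total number of pages verified */
    uint64_t checksum_failures; /* number of verification failures */
    time_t   last_failure;      /* 0 if no failure was ever recorded */
    int      in_use;            /* slot is allocated */
} DemoChecksumEntry;

typedef struct DemoChecksumStats
{
    DemoChecksumEntry entries[DEMO_MAX_DBS];
} DemoChecksumStats;

/* Find the entry for dbid, creating it in a free slot if needed. */
static DemoChecksumEntry *
demo_get_entry(DemoChecksumStats *stats, uint32_t dbid)
{
    size_t i;

    assert(stats != NULL);

    for (i = 0; i < DEMO_MAX_DBS; i++)
        if (stats->entries[i].in_use && stats->entries[i].databaseid == dbid)
            return &stats->entries[i];

    for (i = 0; i < DEMO_MAX_DBS; i++)
    {
        if (!stats->entries[i].in_use)
        {
            DemoChecksumEntry *e = &stats->entries[i];

            e->databaseid = dbid;
            e->checksum_checks = 0;
            e->checksum_failures = 0;
            e->last_failure = 0;
            e->in_use = 1;
            return e;
        }
    }
    return NULL;                /* no free slot left */
}

/* Record one checksum verification; returns 0 on success, -1 if full. */
static int
demo_report_check(DemoChecksumStats *stats, uint32_t dbid, int failed, time_t now)
{
    DemoChecksumEntry *e = demo_get_entry(stats, dbid);

    if (e == NULL)
        return -1;

    e->checksum_checks++;
    if (failed)
    {
        e->checksum_failures++;
        e->last_failure = now;
    }
    return 0;
}
```

A failure on a shared object would be reported with DEMO_INVALID_OID as
the database id, so it lands in its own entry rather than being
attributed to any particular database.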

