On 2020-10-21 13:41, Amit Kapila wrote:
> On Tue, Oct 20, 2020 at 12:41 PM Masahiro Ikeda
> <ikedamsh@oss.nttdata.com> wrote:
>>
>> On 2020-10-20 12:46, Amit Kapila wrote:
>> > On Tue, Oct 20, 2020 at 8:01 AM Masahiro Ikeda
>> >> 1. Basic statistics of WAL activity
>> >>
>> >> - wal_records: Total number of WAL records generated
>> >> - wal_fpi: Total number of WAL full page images generated
>> >> - wal_bytes: Total amount of WAL bytes generated
>> >>
>> >> To understand DB's performance, first, we will check the performance
>> >> trends for the entire database instance.
>> >> For example, if the number of wal_fpi becomes higher, users may tune
>> >> "wal_compression", "checkpoint_timeout" and so on.
>> >>
>> >> Although users can check the above statistics via EXPLAIN,
>> >> auto_explain,
>> >> autovacuum and pg_stat_statements now,
>> >> if users want to see the performance trends for the entire database,
>> >> they must recalculate the statistics.
>> >>
>> >
>> > Here, do you mean to say 'entire cluster' instead of 'entire database'
>> > because it seems these stats are getting collected for the entire
>> > cluster?
>>
>> Thanks for your comments.
>> Yes, I wanted to say 'entire cluster'.
>>
>> >> I think it is useful to add the sum of the basic statistics.
>> >>
>> >
>> > There is an argument that it is better to view these stats at the
>> > statement-level so that one can know which statements are causing most
>> > WAL and then try to rate-limit them if required in the application and
>> > anyway they can get the aggregate of all the WAL if they want. We have
>> > added these stats in PG-13, so do we have any evidence that the
>> > already added stats don't provide enough information? I understand
>> > that you are trying to display the accumulated stats here which if
>> > required users/DBA need to compute with the currently provided stats.
>> > OTOH, sometimes adding more ways to do some things causes difficulty
>> > for users to understand and learn.
>>
>> I agreed that the statement-level stat is important and I understood
>> that we can
>> know the aggregated WAL stats of pg_stat_statement view and
>> autovacuum's
>> log.
>> But now, WAL stats generated by autovacuum can be output to logs and
>> it
>> is not
>> easy to aggregate them. Since WAL writes impacts for the entire
>> cluster,
>> I thought
>> it's natural to provide accumulated value.
>>
>
> I think it is other way i.e if we would have accumulated stats then it
> makes sense to provide those at statement-level because one would like
> to know the exact cause of more WAL activity. Say it is due to an
> autovacuum or due to the particular set of statements then it would
> easier for users to do something about it.
OK, I'll remove them.
Do you have any comments for other statistics?
--
Masahiro Ikeda
NTT DATA CORPORATION