> I think shipping with log_checkpoints=on and
> log_autovacuum_min_duration=10m or so would be one of the best things
> we could possibly do to allow ex-post-facto troubleshooting of
> system-wide performance issues.

Fully agree, it would be really great. Glad that you added autovacuum logging
into the picture.
Maybe 1m would be better. I recently observed a high-TPS system where an
autovacuum-launched ANALYZE ran for a very long time, causing a huge
slowdown on the standbys that started a few minutes after the ANALYZE
was launched. The correlation, and then the causation, were confirmed only
thanks to log_autovacuum_min_duration being configured; without it, no one
would have been able to understand what happened.
Back to checkpoint logging. Even with log_checkpoints = off, high write
activity combined with a low max_wal_size already "spams" the logs with lots
of "checkpoints are occurring too frequently" messages. This happens very
often: any DBA who has run a restore on Postgres with the default
max_wal_size (1GB, very low for modern DBs) has seen it.
Without details, this definitely looks like "spam", and it's already happening
here and there. The details provided by log_checkpoints = on make it possible
to decide on a max_wal_size change based on data, not guesswork.
+1 for log_checkpoints = on
and +1 for log_autovacuum_min_duration = 1m or so.
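
For anyone who wants to try this before any default changes, here is a
minimal postgresql.conf sketch of the two settings discussed above. The
values are simply the ones floated in this thread, not tested
recommendations, and the right threshold will depend on the workload.

```
# Log every checkpoint with its details (buffers written, timings, etc.)
log_checkpoints = on

# Log autovacuum/autoanalyze runs that take at least this long.
# Written as "1m" in the thread; the unit accepted in the config file is "min".
log_autovacuum_min_duration = 1min
```

Both settings should take effect on a configuration reload (for example via
SELECT pg_reload_conf()), with no restart needed.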