Обсуждение: [HACKERS] Queuing all tables for analyze after recovery
Hi. Some (maybe all) row statistics are lost after the database has recovered after a failover. So it's recommended to ANALYZE all databases in a cluster after recovery. Amazon's AWS RDS (their managed SQL databases service) even sends an email "consider running analyze if your database is slow" after a failover of so called MultiAZ databases (with fast automatic failover for double price). Funny that they send it for both PostgreSQL and Oracle databases, which, I suppose, confuses Oracle DBA's greatly. And in AWS RDS MultiAZ a failover is pretty common. Minor version upgrade - failover. A storage hiccup - failover. Out of memory - failover. Shouldn't this analyze be queued and all tables analyzed automatically after failover by autovacuum daemon? With up to autovacuum_max_workers in parallel? It might save some DBA's from a couple of lost sleeping hours for sure. What do you think? A GUC option? On by dafault? Maybe even backported, but off by default in released versions? -- Tomasz "Tometzky" Ostrowski -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Tomasz Ostrowski <tometzky+pg@ato.waw.pl> writes: > Some (maybe all) row statistics are lost after the database has > recovered after a failover. So it's recommended to ANALYZE all databases > in a cluster after recovery. Uh ... recommended by whom? pg_statistic has exactly the same reliability guarantees as the rest of the system catalogs. I don't deny that there might be cases where this is worth doing, but it does not seem so likely that it should be part of one's standard checklist. Much less something that we should expend a great deal of effort to automate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 10/19/2017 10:54 PM, Tom Lane wrote: > Tomasz Ostrowski <tometzky+pg@ato.waw.pl> writes: >> Some (maybe all) row statistics are lost after the database has >> recovered after a failover. So it's recommended to ANALYZE all databases >> in a cluster after recovery. > > Uh ... recommended by whom? pg_statistic has exactly the same reliability > guarantees as the rest of the system catalogs. > > I don't deny that there might be cases where this is worth doing, but > it does not seem so likely that it should be part of one's standard > checklist. Much less something that we should expend a great deal > of effort to automate. For data statistics, sure. One thing I'm unhappy about is that pg_stat_all_tables is blank. An idea I've been throwing around in my head is to have autovacuum work on tables that have vacuum_count and autovacuum_count both zero (and likewise for analyze). This will cause a flurry of activity after failover or crash, but the alternative is autovacuum not knowing anything about the state of the tables and allowing massive bloat to potentially occur. For example, if you have a 1 billion row table, and crash when there are 199,999,999 dead tuples, you currently get to wait for another 200 million to die before anything gets cleaned up. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 10/19/2017 10:54 PM, Tom Lane wrote: > Uh ... recommended by whom? pg_statistic has exactly the same reliability > guarantees as the rest of the system catalogs. Actually I'm not exactly sure what is lost and what is preserved. I'm pretty sure that pg_stat_all_tables and similar views turn out with no data after a failover. Also I have some experience with badly performing databases after a failover, which went back to normal performance after whole cluster analyze. This email from AWS suggests that it's not only me. > I don't deny that there might be cases where this is worth doing, but > it does not seem so likely that it should be part of one's standard > checklist. Much less something that we should expend a great deal > of effort to automate. I assumed that the effort here shouldn't be that large. I imagined a simple check if the statistics are missing when considering tables for analyze by autovacuum. But I'm not a programmer, so I might misestimate this effort badly. -- Regards, Tomasz "Tometzky" Ostrowski -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Vik Fearing <vik.fearing@2ndquadrant.com> writes: > On 10/19/2017 10:54 PM, Tom Lane wrote: >> Uh ... recommended by whom? pg_statistic has exactly the same reliability >> guarantees as the rest of the system catalogs. > For data statistics, sure. One thing I'm unhappy about is that > pg_stat_all_tables is blank. Well, that's because we throw away the stats collector's stats after a crash -- or, in the failover case, because the promoted slave has its own counters and not the master's. ANALYZE isn't going to help that at all. The fact that we drop those stats in a crash cycle is probably mostly an overabundance of caution. We could likely quit doing that, maybe with a bit more validation effort when reading the files. Not sure whether we ought to change anything about the failover case. It's certainly reasonable for a standby server to have its own stats. The one case where it might make sense to explicitly discard the counters is when we do PITR to a previous system state. That's not too common though ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 10/19/2017 11:26 PM, Tom Lane wrote: > Vik Fearing <vik.fearing@2ndquadrant.com> writes: >> On 10/19/2017 10:54 PM, Tom Lane wrote: >>> Uh ... recommended by whom? pg_statistic has exactly the same reliability >>> guarantees as the rest of the system catalogs. > >> For data statistics, sure. One thing I'm unhappy about is that >> pg_stat_all_tables is blank. > > Well, that's because we throw away the stats collector's stats after a > crash -- or, in the failover case, because the promoted slave has its own > counters and not the master's. ANALYZE isn't going to help that at all. Sure it will. ANALYZE estimates n_dead_tup, which often accurate enough for autovacuum to figure out what to do. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers