Обсуждение: xmin value from pg_stat_replication
All; We have a master and a standby (Streaming replication). This query produces a value of approx 1.5million on the master and approx 80million on the standby: select slot_name, slot_type, database, xmin from pg_replication_slots order by age(xmin); We are running replication in async mode. A couple of questions: - Is it unusual for the master & standby to differ dramatically like this? - could the replication slot (i.e. the xmin on the standby) be preventing vacuum from cleaning up row versions on the master? Thanks in advance
"S. Bob" <sbob@quadratum-braccas.com> writes: > All; > > > We have a master and a standby (Streaming replication). > > This query produces a value of approx 1.5million on the master and > approx 80million on the standby: > > > select slot_name, slot_type, database, xmin from pg_replication_slots > order by age(xmin); Why does your standby have rep slots? Is it also feeding lower subordinate standbys? > > > We are running replication in async mode. > > > A couple of questions: > > > - Is it unusual for the master & standby to differ dramatically like this? > > - could the replication slot (i.e. the xmin on the standby) be > preventing vacuum from cleaning up row versions on the master? > > > Thanks in advance > > > > > > > > > -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consulting@comcast.net
Hi, On 2019-07-17 12:31:29 -0600, S. Bob wrote: > We have a master and a standby (Streaming replication). > > This query produces a value of approx 1.5million on the master and approx > 80million on the standby: > > > select slot_name, slot_type, database, xmin from pg_replication_slots order > by age(xmin); I'd suggest using age(xmin) in the output as well, not just in the ORDER BY, otherwise it's hard to make sense of the value. When you say a value of '1.5 million' on the master, how did you determine that? Because the above doesn't return anything for the master. The simplest way to determine that is probably: SELECT max(age(backend_xmin)) FROM pg_stat_activity; When you say '80 million', do you mean that pg_replication_slots.xmin was 80 million, or age(pg_replication_slots.xmin)? If it's the former, then that would indicate that the standby is *not* a problem, but that you have a long-running transaction or such on the primary. Basically, age(xmin) has to be large, not xmin itself. > We are running replication in async mode. > > > A couple of questions: > > > - Is it unusual for the master & standby to differ dramatically like this? Do you have very longrunning queries on the standby or primary? That's typically the reason why xmin is old. > - could the replication slot (i.e. the xmin on the standby) be preventing > vacuum from cleaning up row versions on the master? Yes. Greetings, Andres Freund
Thanks for the feedva k,
It turns out we were not looking at master & skave. The slave has no replication slot. We are seeing lots of row versions not getting cleaned up, trying toinvestigate all the possible causes. Is there a way to determine which connections (on the master or the slave) are the cause of the row versions not getting cleaned up?
Thanks
On Wed, Jul 17, 2019, at 4:32 PM, Andres Freund wrote:
Hi,On 2019-07-17 12:31:29 -0600, S. Bob wrote:> We have a master and a standby (Streaming replication).>> This query produces a value of approx 1.5million on the master and approx> 80million on the standby:>>> select slot_name, slot_type, database, xmin from pg_replication_slots order> by age(xmin);I'd suggest using age(xmin) in the output as well, not just in the ORDERBY, otherwise it's hard to make sense of the value.When you say a value of '1.5 million' on the master, how did youdetermine that? Because the above doesn't return anything for themaster. The simplest way to determine that is probably:SELECT max(age(backend_xmin)) FROM pg_stat_activity;When you say '80 million', do you mean that pg_replication_slots.xminwas 80 million, or age(pg_replication_slots.xmin)? If it's the former,then that would indicate that the standby is *not* a problem, but thatyou have a long-running transaction or such on the primary.Basically, age(xmin) has to be large, not xmin itself.> We are running replication in async mode.>>> A couple of questions:>>> - Is it unusual for the master & standby to differ dramatically like this?Do you have very longrunning queries on the standby or primary? That'stypically the reason why xmin is old.> - could the replication slot (i.e. the xmin on the standby) be preventing> vacuum from cleaning up row versions on the master?Yes.Greetings,Andres Freund