Thread: slave stops replica

slave stops replica

From
Pepe TD Vo
Date:
Hello,

I had a master/slave replica, and yesterday the slave server's device was full on /var, where the archive is copied from the master.  After trying to figure out why pg_archivecleanup doesn't work, the SA removed all WAL files on the slave except the current ones from August 25, which freed up /var with 504G available.  The SA then patched the OS, which included rebooting the server.  After the db started up following the reboot, I found out that the database/instance on the slave is not in sync with the master.  The row counts returned by queries on these tables on the slave do not match the master.

I created a test table on the master and didn't see it picked up on the slave.  I then found out that /var is full again.

What should I do to make the slave stream from the master again without rebuilding it, or must I rebuild it? Why is my pg_archivecleanup not working?  All the WAL files from Nov 2020 through August 25 had been kept.


Bach-Nga



Re: slave stops replica

From
Laurenz Albe
Date:
On Thu, 2021-08-26 at 21:09 +0000, Pepe TD Vo wrote:
> I had master slave replica and yesterday the slave server's device was full on /var
> where the archive copy from master.  After trying to figure out why the
> pg_archivecleanup doesn't work, SA removed all wal files except the  current
> 
> I tried to create a test table on master and didn't see it pick up from slave.
> I found out again /var is full. 

If you have deleted WAL that has not yet been applied to the standby, replication
is broken and you have to build the standby from scratch.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: slave stops replica

From
Pepe TD Vo
Date:
Can we delete the old WALs?  May I know how to tune retention on the server, as in Oracle, to keep only 7 days, and why pg_archivecleanup doesn't work?

Bach-Nga




On Friday, August 27, 2021, 03:29:54 AM EDT, Laurenz Albe <laurenz.albe@cybertec.at> wrote:


On Thu, 2021-08-26 at 21:09 +0000, Pepe TD Vo wrote:

> I had master slave replica and yesterday the slave server's device was full on /var
> where the archive copy from master.  After trying to figure out why the
> pg_archivecleanup doesn't work, SA removed all wal files except the  current
>
> I tried to create a test table on master and didn't see it pick up from slave.
> I found out again /var is full. 


If you have deleted WAL that has not yet been applied to the standby, replication
is broken and you have to build the standby from scratch.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com




Re: slave stops replica

From
Laurenz Albe
Date:
On Fri, 2021-08-27 at 10:56 +0000, Pepe TD Vo wrote:
> > > I had master slave replica and yesterday the slave server's device was full on /var
> > > where the archive copy from master.  After trying to figure out why the
> > > pg_archivecleanup doesn't work, SA removed all wal files except the  current
> > >
> > > I tried to create a test table on master and didn't see it pick up from slave.
> > > I found out again /var is full. 
> >
> > If you have deleted WAL that has not yet been applied to the standby, replication
> > is broken and you have to build the standby from scratch.
> 
> Can we delete the old WALs?

Only the ones you don't need for replication.
Of course, if you rebuild replication from scratch, you can delete the WAL archives.

> May I know what  is the retention to perform tuning on the server like Oracle to
> keep only 7 days and why pg_archivecleanup doesn't work?

I guess that pg_archivecleanup doesn't delete anything because it is only
called once replication has processed a WAL segment and doesn't need it any more.
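
To illustrate the point about cleanup: pg_archivecleanup takes an archive directory and the name of the oldest WAL file to keep, and removes the segments that precede it. On a standby it is typically run through archive_cleanup_command, with %r supplying that file name. The Python sketch below is only a simplified model of that selection, not the tool's actual code. The helper name removable_segments is hypothetical, and it assumes a single timeline so that plain file-name comparison matches WAL order.

```python
import re

# Plain WAL segment names are 24 hex digits:
# timeline (8) + log file number (8) + segment number (8).
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(archive_files, oldest_kept):
    """Return the archived segments that sort before oldest_kept.

    Hypothetical helper: a simplified model, assuming one timeline.
    Files that are not plain segment names (.history, .partial, ...)
    are never selected.
    """
    if not WAL_NAME.match(oldest_kept):
        raise ValueError("oldest_kept must be a 24-hex-digit WAL segment name")
    return sorted(
        name
        for name in archive_files
        if WAL_NAME.match(name) and name < oldest_kept
    )


if __name__ == "__main__":
    files = [
        "000000010000000000000003",
        "000000010000000000000001",
        "000000010000000000000002",
        "00000001.history",
    ]
    # If replication never advances, oldest_kept never advances either,
    # so the list of removable files stays empty.
    print(removable_segments(files, "000000010000000000000003"))
```

The real tool offers a dry run, pg_archivecleanup -n, which prints the names of the files it would remove without deleting them.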

If you are missing a WAL segment because you deleted it, replication gets stuck
at that point and will wait indefinitely for that WAL segment.  So nothing is
processed, and nothing is deleted.

A gap in the WAL will stop and break replication, as I said.
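
A gap of this kind can often be spotted from the segment file names in the archive. The sketch below is a hypothetical helper, not part of PostgreSQL. It assumes the default 16 MB wal_segment_size (256 segments per log file number) and ignores timelines as well as .partial and .history files.

```python
import re

# Plain WAL segment names are 24 hex digits:
# timeline (8) + log file number (8) + segment number (8).
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")

# Assumption: default 16 MB segments, i.e. 256 segments per log file number.
SEGMENTS_PER_LOG = 0x100


def segment_number(name):
    """Map a WAL file name to a running segment number (timeline ignored)."""
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return log * SEGMENTS_PER_LOG + seg


def find_gaps(archive_files):
    """Return (first_missing, last_missing) segment-number ranges."""
    numbers = sorted(
        {segment_number(f) for f in archive_files if WAL_NAME.match(f)}
    )
    gaps = []
    for prev, cur in zip(numbers, numbers[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


if __name__ == "__main__":
    files = [
        "000000010000000000000001",
        "000000010000000000000002",
        "000000010000000000000005",
    ]
    print(find_gaps(files))  # segments 3 and 4 are missing
```

An empty result only means the archive has no holes between its oldest and newest file. It does not show that the oldest file present is old enough for the standby to resume from.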

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com