Re: [GENERAL] pg_xlog on a hot_standby slave filling up
From | Lacey Powers |
---|---|
Subject | Re: [GENERAL] pg_xlog on a hot_standby slave filling up |
Date | |
Msg-id | 55809F54.5000402@gmail.com |
In reply to | Re: [GENERAL] pg_xlog on a hot_standby slave filling up (Jeff Frost <jeff@pgexperts.com>) |
List | pgsql-bugs |
On 06/16/2015 01:16 PM, Jeff Frost wrote:
>> On Jun 16, 2015, at 11:35 AM, Christoph Berg <cb@df7cb.de> wrote:
>>
>> [moving to -bugs]
>>
>> Re: Xavier 12 2015-06-16 <CAMOV8iB3oRzC4f7UTzOwC2wT08do3voi+PGN07uJq+ayo9E=cQ@mail.gmail.com>
>>> Hi everyone,
>>>
>>> Questions about pg_xlogs again...
>>> I have two Postgresql 9.1 servers in a master/slave stream replication
>>> (hot_standby).
>>>
>>> Psql01 (master) is backed up with Barman and pg_xlogs is correctly
>>> purged (archive_command is used).
>>>
>>> However, Psql02 (slave) has a huge pg_xlog (951 files, 15G for 7 days
>>> only, it keeps growing until disk space is full). I have found
>>> documentation and tutorials, mailing list, but I don't know what is
>>> suitable for a Slave. Leads I've found :
>> Hi,
>>
>> I have the same problem here. Master/slave running on 9.3.current. On
>> the master everything is normal, but on the slave server, files in
>> pg_xlog and archive_status pile up. Interestingly, the filenames are
>> mostly 0x20 apart. (IRC user Kassandry is reporting the same issue on
>> 9.4 as well, including the 0x20 spacing.)
> I've seen this before, but haven't been able to make a reproducible test case yet.
>
> Are you by chance using SSL to talk to the primary server? Is the ssl_renegotiation_limit the default of 512MB? 32 WAL files at 16MB each = 512MB. I found that it would always leave the WAL file from before the invalid record length message. Does that seem to be the case for you as well?
>

Hello Jeff,

To add to this on PostgreSQL 9.4 (Kassandry from IRC): yes, I see SSL errors in my logs.

I turned off the archive_command I had running on one of my three replicas, which recycled all of the .ready files and all of the outstanding xlogs. I re-enabled the archive_command and waited.
I got this in my logs:

< @[] LOG: restartpoint complete: wrote 15437 buffers (2.9%); 0 transaction log file(s) added, 0 removed, 5 recycled; write=269.358 s, sync=0.035 s, total=269.397 s; sync files=202, longest=0.008 s, average=0.000 s
< @[] LOG: recovery restart point at 650/4D01CCA0
< @[] DETAIL: last completed transaction was at log time 2015-06-16 21:41:41.990409+00
< @[] LOG: restartpoint starting: time
< @[] LOG: restartpoint complete: wrote 115 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 12 recycled; write=11.446 s, sync=0.005 s, total=11.455 s; sync files=29, longest=0.001 s, average=0.000 s
< @[] LOG: recovery restart point at 650/5204B6C8
< @[] DETAIL: last completed transaction was at log time 2015-06-16 21:42:24.524081+00
< @[] FATAL: could not send data to WAL stream: SSL error: unexpected record
< @[] LOG: unexpected pageaddr 650/18000000 in log segment 00000001000006500000005A, offset 0

And a ready file appeared and stayed for 000000010000065000000059 :

-rw------- 1 postgres postgres 0 Jun 16 21:57 000000010000065000000059.ready

On my other streaming replica, there are lots of these log messages, and it looks like there is also a ready file for each of the segments previous to the segment mentioned in the unexpected pageaddr message.

Hope this helps. Please let me know if I can gather further data to help fix this. =)

Regards,

Lacey
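
The arithmetic in this thread (32 WAL files at 16MB each = 512MB, and the stuck .ready file being the segment just before the one named in the "unexpected pageaddr" message) can be checked with a small Python sketch. The helper names below are hypothetical and not part of PostgreSQL. The sketch assumes the default 16MB segment size and the 9.3+ file naming, in which each 8-hex-digit "log" id spans 256 segments; on 9.1 the last segment of each log id is skipped, so the rollover case would differ there.

```python
import re

# Assumption: default WAL segment size of 16MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
# Assumption: 9.3+ naming, where one "log" id covers 4GB, i.e. 256 segments.
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE

WAL_NAME_RE = re.compile(r"([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})")


def parse_wal_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    match = WAL_NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError("not a WAL segment file name: %r" % name)
    timeline, log_id, seg = (int(part, 16) for part in match.groups())
    return timeline, log_id * SEGMENTS_PER_XLOGID + seg


def format_wal_name(timeline, segno):
    """Build the 24-hex-digit file name for a timeline and segment number."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


def previous_segment(name):
    """Return the name of the segment immediately before `name`."""
    timeline, segno = parse_wal_name(name)
    return format_wal_name(timeline, segno - 1)


def segments_per_renegotiation(limit_bytes=512 * 1024 * 1024):
    """Number of whole WAL segments streamed per SSL renegotiation limit."""
    return limit_bytes // WAL_SEGMENT_SIZE
```

With the segment from the log above, `previous_segment("00000001000006500000005A")` gives `000000010000065000000059`, the file whose .ready marker stayed behind, and `segments_per_renegotiation()` gives 32, which is the 0x20 spacing Christoph reported.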