On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii <ishii@postgresql.org> wrote: > > When I set log_min_messages to debug5 and looked into walreciver log, > I saw this: > > 3600 2015-03-08 09:47:38 JST DEBUG: sendtime 2015-03-08 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication apply delay -1945478837 ms tran sfer latency 0 ms > > The "replication apply delay -1945478837 ms" part looks strange > because the delay is below 0. The number is formatted as %d in elog > call, and I suspect this is kind of integer overflow. >
Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed that the integer overflow occurs because sometimes the return of the GetCurrentChunkReplayStartTime() is 0 (zero).
I added an elog into GetReplicationApplyDelay to check this info:
DEBUG: GetReplicationApplyDelay 479099832 seconds, 352 milliseconds, (0.000000, 479099832352083.000000) DEBUG: sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08 00:17:12.352043-03 replication apply delay -1936504800 ms transfer latency 0 ms DEBUG: GetReplicationApplyDelay 479099841 seconds, 450 milliseconds, (0.000000, 479099841450320.000000) DEBUG: sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08 00:17:21.450294-03 replication apply delay -1936495702 ms transfer latency 0 ms
Maybe we should check before and return zero from GetReplicationApplyDelay. The attached patch implement it.