I have observed that when there is a network break between master and standby, the walsender process terminates immediately, whereas
the walreceiver detects the break only after a long time.
I can see that there is a replication_timeout configuration parameter; walsender checks replication_timeout and exits once that timeout expires.
Shouldn't there be a similar mechanism for walreceiver, so that it can detect a network failure sooner?
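
To make the question concrete, here is a minimal sketch of the kind of receive-side timeout I mean. It is not PostgreSQL code: the function name wait_for_data, the 0.2 second timeout, and the use of a local socket pair in place of the master/standby connection are all assumptions made for illustration.

```python
import select
import socket


def wait_for_data(sock, idle_timeout):
    """Return received bytes, or None if nothing arrives within idle_timeout seconds.

    Hypothetical helper: a receiver could treat None as "peer unreachable"
    and drop the connection instead of waiting indefinitely.
    """
    readable, _, _ = select.select([sock], [], [], idle_timeout)
    if not readable:
        return None
    return sock.recv(4096)


# A local socket pair stands in for the master/standby connection (assumption).
sender, receiver = socket.socketpair()
sender.sendall(b"wal-data")

first = wait_for_data(receiver, 0.2)   # data is available, so bytes are returned
second = wait_for_data(receiver, 0.2)  # sender is silent, so the wait times out

sender.close()
receiver.close()
```

With such a check, the receiving side would give up after a bounded period of silence rather than relying on the operating system to report the failure.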
Basic steps to observe the above behavior:
1. Master and standby machines are connected normally.
2. Run the command "ifconfig ip down" to bring down the network cards of the master and standby.
Observation:
The master detects that the connection is broken, but the standby does not, and it continues to show the connection as established for a long time.
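
For illustration only: an idle TCP socket normally receives no error when the link drops silently, which would be consistent with the standby still showing the connection as established. TCP keepalive is one general OS-level way to make an idle socket notice a dead peer. The sketch below shows the socket options involved; the function name enable_keepalive and the values 5/2/3 are assumptions, and whether walreceiver's connection uses keepalive is not something this example establishes.

```python
import socket


def enable_keepalive(sock, idle=5, interval=2, count=3):
    """Ask the OS to probe an idle TCP connection (values are illustrative)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning constants below are platform-specific (present on Linux),
    # so each one is set only if this platform exposes it.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```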
Note - I sent this to the Hackers list earlier as well. I just want to know whether this is the behavior as defined by PostgreSQL, a bug, or a new feature in itself.
In case it is not clear, I will raise a bug.
With Regards,
Amit Kapila