On Thu, Jul 28, 2016 at 9:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Thu, Jul 28, 2016 at 6:52 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> Hello,
>>
>> While testing replication for 9.5, we found that repl-master can
>> ignore wal_sender_timeout and seemingly waits for TCP
>> retransmission timeout for the case of sudden power-off of a
>> standby.
>>
>> My investigation told me that the immediate cause could be that
>> secure_write() is called with *blocking mode* (that is,
>> port->noblock = false) under *pq_putmessage_noblock* macro called
>> from XLogSendPhysical().
>>
>> libpq.h of 9.5 and newer defines the macros as follows:
>>
>>> #define pq_putmessage(msgtype, s, len) \
>>> (PqCommMethods->putmessage(msgtype, s, len))
>>> #define pq_putmessage_noblock(msgtype, s, len) \
>>> (PqCommMethods->putmessage(msgtype, s, len))
>>
>> The second macro apparently should be the following:
>>
>>> #define pq_putmessage_noblock(msgtype, s, len) \
>>> (PqCommMethods->putmessage_noblock(msgtype, s, len))
>>
>> The attached patch fixes it.
>
> Good catch! Barring objection, I will push this to both master and 9.5.

Regarding this patch, while reading pqcomm.c, I found the following issues:

1. socket_comm_reset() calls pq_endcopyout().
   I think it should call socket_endcopyout() instead.
2. socket_putmessage_noblock() calls pq_putmessage().
   I think it should call socket_putmessage() instead.
3. Several source comments in pqcomm.c have not been updated;
   some still use old function names such as pq_putmessage().

The attached patch fixes the above issues.
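
To illustrate the macro fix and point 2, here is a minimal standalone sketch. It is not PostgreSQL code: CommMethods, Methods, the put_msg* macros, and the sock_*/other_* functions are hypothetical stand-ins for PqCommMethods and the pqcomm.c functions. It assumes that a method table other than the socket one can be active, in which case a socket-level function that goes back through the dispatch macro ends up in the other implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the PqCommMethods table; not the real struct. */
typedef struct CommMethods
{
    int  (*putmessage) (char msgtype, const char *s, size_t len);
    void (*putmessage_noblock) (char msgtype, const char *s, size_t len);
} CommMethods;

static const CommMethods *Methods = NULL;  /* currently active table */
static const char *last_call = "";         /* last function reached */

/* Dispatch macros: the pre-fix noblock macro and the corrected one. */
#define put_msg(t, s, n)               (Methods->putmessage(t, s, n))
#define put_msg_noblock_buggy(t, s, n) (Methods->putmessage(t, s, n))
#define put_msg_noblock_fixed(t, s, n) (Methods->putmessage_noblock(t, s, n))

/* Socket-level implementation (stand-in for socket_putmessage). */
static int
sock_putmessage(char t, const char *s, size_t n)
{
    (void) t; (void) s; (void) n;
    last_call = "sock_putmessage";
    return 0;
}

/* Point 2, before: goes through the macro, so the active table decides. */
static void
sock_putmessage_noblock_via_macro(char t, const char *s, size_t n)
{
    (void) put_msg(t, s, n);
}

/* Point 2, after: calls the socket implementation directly. */
static void
sock_putmessage_noblock_direct(char t, const char *s, size_t n)
{
    (void) sock_putmessage(t, s, n);
}

/* A second, non-socket implementation (purely illustrative). */
static int
other_putmessage(char t, const char *s, size_t n)
{
    (void) t; (void) s; (void) n;
    last_call = "other_putmessage";
    return 0;
}

static void
other_putmessage_noblock(char t, const char *s, size_t n)
{
    (void) t; (void) s; (void) n;
    last_call = "other_putmessage_noblock";
}

static const CommMethods other_methods = {
    other_putmessage,
    other_putmessage_noblock
};

/* Install the non-socket table and clear the trace. */
static void
use_other_table(void)
{
    Methods = &other_methods;
    last_call = "";
}

/* Returns 1 if the last function reached has the given name. */
static int
last_call_was(const char *name)
{
    assert(name != NULL);
    return strcmp(last_call, name) == 0;
}
```

In this sketch, after use_other_table(), put_msg_noblock_buggy() reaches other_putmessage while put_msg_noblock_fixed() reaches other_putmessage_noblock. Likewise, sock_putmessage_noblock_via_macro() ends up in other_putmessage, whereas sock_putmessage_noblock_direct() stays in sock_putmessage.
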
Regards,
--
Fujii Masao