In walsender, in the main loop that waits for backend requests to send
WAL, there's this comment:
> + /*
> + * Nap for the configured time or until a request arrives.
> + *
> + * On some platforms, signals won't interrupt the sleep. To ensure we
> + * respond reasonably promptly when someone signals us, break down the
> + * sleep into 1-second increments, and check for interrupts after each
> + * nap.
> + */
That's apparently copy-pasted from bgwriter. It's fine for bgwriter,
where a prompt response is not important, but it seems pretty awful for
synchronous replication. On such platforms, that would introduce a delay
of 500ms on average (half of each 1-second nap) at every commit. I'm not
sure if the comment is actually accurate, though. bgwriter uses
pg_usleep(), while this loop uses pq_wait(), which uses secure_poll(), so
the two may not behave the same way when a signal arrives.
There's also a small race condition in that loop:
> + while (remaining > 0)
> + {
> + int waitres;
> +
> + if (got_SIGHUP || shutdown_requested || replication_requested)
> + break;
> +
> + /*
> + * Check whether the data from standby can be read.
> + */
> + waitres = pq_wait(true, false,
> + remaining > 1000 ? 1000 : remaining);
> +
> ...
If a signal is received just before the pq_wait() call, after
replication_requested has been checked, pq_wait() won't be interrupted and
will wait up to a second before responding to it.
BTW, on which platforms does a signal not interrupt the sleep?
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com