Fujii Masao <masao.fujii@gmail.com> writes:
> In 9.0, walsender reads WAL always from the disk and sends it to the standby.
> That is, we cannot send WAL until it has been written (and flushed) to the disk.

I believe the above statement to be incorrect: walsender does *not* wait
for an fsync to occur.

I agree with the idea of trying to read from WAL buffers instead of the
file system, but the main reason is that the current behavior makes
POSIX_FADV_DONTNEED for WAL pretty dubious, since walsender reads back
the very data we just advised the kernel to drop. It'd still be a good
idea to (artificially) limit replication so that it does not read ahead
of the written-out data.
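
To make that limit concrete, here is a minimal sketch in Python. It is
not PostgreSQL code: the function and parameter names are hypothetical,
and WAL positions are modeled as plain integers. It only shows the
rule that a sender reading from buffers should still stop at the
written-out position.

```python
def sendable_upto(sent_upto: int, inserted_upto: int, written_upto: int) -> int:
    """Return the WAL position a sender may stream up to.

    All names are hypothetical; positions are plain byte offsets.
    sent_upto     -- position already sent to the standby
    inserted_upto -- position up to which WAL exists in the in-memory buffers
    written_upto  -- position up to which WAL has been written out locally
    """
    # Even when reading from buffers, never go past what has been written out.
    limit = min(inserted_upto, written_upto)
    # Never move backwards from what was already sent.
    return max(sent_upto, limit)
```

With buffers filled to 500 but only 300 written out, the sender would
stop at 300 and pick up the rest after the next write.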
> ... Since we can write and send WAL simultaneously, in synchronous
> replication, a transaction commit has only to wait for either of them. So the
> performance would significantly increase.
That performance claim, frankly, is ludicrous. There is no way that
round trip network delay plus write+fsync on the slave is faster than
local write+fsync. Furthermore, I would say that you are thinking
exactly backwards about the requirements for synchronous replication:
what that would mean is that transaction commit waits for *both*,
not whichever one finishes first.
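
A back-of-the-envelope model of the two readings, as a sketch only. The
function name, the parameters and the latency figures used with it are
made up for illustration, and it assumes the local flush and the
send-to-standby proceed in parallel.

```python
def commit_wait_ms(local_flush: float, round_trip: float,
                   standby_flush: float, wait_for: str) -> float:
    """Model how long a commit waits, in milliseconds (hypothetical model).

    local_flush   -- local write+fsync time
    round_trip    -- network round trip to the standby
    standby_flush -- write+fsync time on the standby
    wait_for      -- "either" (whichever finishes first) or "both"
    """
    # The remote path always pays the network round trip on top of its flush.
    remote = round_trip + standby_flush
    if wait_for == "either":
        return min(local_flush, remote)
    if wait_for == "both":
        return max(local_flush, remote)
    raise ValueError("wait_for must be 'either' or 'both'")
```

With comparable disks on both sides (say 5 ms per flush and a 1 ms
round trip), "either" just degenerates to the local flush time, so
nothing is gained, while "both" costs the full remote path.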
regards, tom lane