On Thu, May 13, 2010 at 6:47 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Rollbacks are always flushed to disk, so this explanation doesn't work.
> Even if it were it would take no longer than ~1 sec if everything were
> working correctly on the test system.
Yeah, rollbacks are always flushed sooner or later, but not *immediately*,
since RecordTransactionAbort() calls only XLogInsert(), not XLogFlush().
Until another process executes XLogFlush(), the rollback's WAL record
stays in wal_buffers.
On the other hand, RecordTransactionCommit() calls XLogFlush(),
so commits are always flushed to disk immediately.
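
To make the difference concrete, here is a minimal sketch as a toy model in
Python. It is not PostgreSQL code: the class ToyWal and its methods are
hypothetical and only borrow the names of the functions mentioned above. It
assumes nothing beyond what is described here, namely that an abort only
inserts its record while a commit inserts and then flushes.

```python
# Toy model only: none of this is PostgreSQL code. The method names
# mirror the functions named in the message purely for readability.
class ToyWal:
    def __init__(self):
        self.buffers = []  # stands in for wal_buffers (not yet on disk)
        self.disk = []     # stands in for WAL that has reached disk

    def xlog_insert(self, record):
        # Append a record to the in-memory buffer; nothing is written out.
        self.buffers.append(record)

    def xlog_flush(self):
        # Move everything buffered so far to "disk".
        self.disk.extend(self.buffers)
        self.buffers.clear()

    def record_transaction_abort(self, xid):
        # Abort: insert only, no flush.
        self.xlog_insert(("abort", xid))

    def record_transaction_commit(self, xid):
        # Commit: insert, then flush.
        self.xlog_insert(("commit", xid))
        self.xlog_flush()


wal = ToyWal()
wal.record_transaction_abort(100)
abort_on_disk_before = ("abort", 100) in wal.disk  # still only buffered

wal.record_transaction_commit(101)
abort_on_disk_after = ("abort", 100) in wal.disk   # flushed by the commit
```

In this model the abort record reaches disk only as a side effect of the
later commit's flush.
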
> The "weird hang" is a lock wait and is perfectly normal in database
> systems. Robert says he hasn't checked whether it is reproducible, so
> there is no evidence to show there is anything other than pilot error,
> at this point.
I was able to reproduce such a hang by not executing another transaction
after the rollback. In this case, walsender cannot replicate the rollback
since its WAL record is not on disk yet.
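
The reproduction can be sketched the same way. This is a toy simulation in
Python, not a test against a real server: simulate() and its variables are
hypothetical, and it assumes, as described above, that the sender ships only
records that have reached disk.

```python
# Toy simulation; hypothetical names, not PostgreSQL's walsender.
def simulate(later_commit):
    buffered, on_disk, standby = [], [], []

    # Rollback: the record is inserted into the buffer, with no flush.
    buffered.append("abort xid=100")

    if later_commit:
        # A later commit inserts its record and flushes everything buffered.
        buffered.append("commit xid=101")
        on_disk.extend(buffered)
        buffered.clear()

    # Toy sender: ships only what is on disk.
    standby.extend(on_disk)
    return standby


idle = simulate(later_commit=False)  # nothing runs after the rollback
busy = simulate(later_commit=True)   # another transaction commits afterwards
```

With later_commit=False the standby receives nothing, which corresponds to
the hang; with later_commit=True it receives both records.
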
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center