On 12/12/16 22:42, Andres Freund wrote:
> Hi,
>
> On 2016-12-10 23:10:19 +0100, Petr Jelinek wrote:
>> Hi,
>> First one is outright bug, which has to do with how we track running
>> transactions. What snapbuild basically does while doing initial snapshot
>> is read the xl_running_xacts record, store the list of running txes and
>> then wait until they all finish. The problem with this is that
>> xl_running_xacts does not ensure that it only logs transactions that are
>> actually still running (to avoid locking PGPROC) so there might be xids
>> in xl_running_xacts that already committed before it was logged.
>
> I don't think that's actually true? Notice how LogStandbySnapshot()
> only releases the lock *after* the LogCurrentRunningXacts() iff
> wal_level >= WAL_LEVEL_LOGICAL. So the explanation for the problem you
> observed must actually be a bit more complex :(
>
Hmm, interesting, I did see the transaction commit in the WAL before the
xl_running_xacts that contained the xid as running. I only seen it on
production system though, didn't really manage to easily reproduce it
locally.
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services