Re: sequences vs. synchronous replication

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: sequences vs. synchronous replication
Дата
Msg-id 38a6a779-ee8d-de3c-fbd7-709285be88fa@enterprisedb.com
обсуждение исходный текст
Ответ на Re: sequences vs. synchronous replication  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: sequences vs. synchronous replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers

On 12/18/21 05:52, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> The problem is exactly the same as in [1] - the aborted transaction
>> generated WAL, but RecordTransactionAbort() ignores that and does not
>> update LogwrtResult.Write, with the reasoning that aborted transactions
>> do not matter. But sequences violate that, because we only write WAL
>> once every 32 increments, so the following nextval() gets "committed"
>> without waiting for the replica (because it did not produce WAL).
> 
> Ugh.
> 
>> I'm not sure this is a clear data corruption bug, but it surely walks
>> and quacks like one. My proposal is to fix this by tracking the lsn of
>> the last LSN for a sequence increment, and then check that LSN in
>> RecordTransactionCommit() before calling XLogFlush().
> 
> (1) Does that work if the aborted increment was in a different
> session?  I think it is okay but I'm tired enough to not be sure.
> 

Good point - it doesn't :-( At least not by simply storing LSN in a 
global variable or something like that.

The second backend needs to know the LSN of the last WAL-logged sequence 
increment, but only the first backend knows that. So we'd need to share 
that between backends somehow. I doubt we want to track LSN for every 
individual sequence (because for clusters with many dbs / sequences that 
may be a lot).

Perhaps we could track just a fixed number o LSN values in shared memory 
(say, 1024), and update/read just the element determined by hash(oid). 
That is, the backend WAL-logging sequence with given oid would set the 
current LSN to array[hash(oid) % 1024], and backend doing nextval() 
would simply remember the LSN in that slot. Yes, if there are conflicts 
that'll flush more than needed.

Alternatively we could simply use the current insert LSN, but that's 
going to flush more stuff than needed all the time.


> (2) I'm starting to wonder if we should rethink the sequence logging
> mechanism altogether.  It was cool when designed, but it seems
> really problematic when you start thinking about replication
> behaviors.  Perhaps if wal_level > minimal, we don't do things
> the same way?

Maybe, but I have no idea how should the reworked WAL logging work. Any 
batching seems to have this issue, and loging individual increments is 
likely going to be slower.

Of course, reworking how sequences are WAL-logged may invalidate the 
"sequence decoding" patch I've been working on :-(


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: sequences vs. synchronous replication
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: parallel vacuum comments