sequences vs. synchronous replication
| From | Tomas Vondra |
|---|---|
| Subject | sequences vs. synchronous replication |
| Date | |
| Msg-id | 712cad46-a9c8-1389-aef8-faf0203c9be9@enterprisedb.com |
| Replies | Re: sequences vs. synchronous replication Re: sequences vs. synchronous replication |
| List | pgsql-hackers |
Hi,
while working on logical decoding of sequences, I ran into an issue with 
nextval() in a transaction that rolls back, described in [1]. But after 
thinking about it a bit more (and chatting with Petr Jelinek), I think 
this issue affects physical sync replication too.
Imagine you have a primary <-> sync_replica cluster, and you do this:
   CREATE SEQUENCE s;
   -- shutdown the sync replica
   BEGIN;
   SELECT nextval('s') FROM generate_series(1,50);
   ROLLBACK;
   BEGIN;
   SELECT nextval('s');
   COMMIT;
The natural expectation would be that the COMMIT gets stuck, waiting for 
the sync replica (which is not running), right? But it does not.
The problem is exactly the same as in [1] - the aborted transaction 
generated WAL, but RecordTransactionAbort() ignores that and does not 
update LogwrtResult.Write, with the reasoning that aborted transactions 
do not matter. But sequences violate that, because we only write WAL 
once every 32 increments, so the following nextval() gets "committed" 
without waiting for the replica (because it did not produce WAL).
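To make the mechanism concrete, here is a toy model of it in Python. This is only a sketch under simplifying assumptions, not PostgreSQL code: the class ToyBackend, its fields, and the integer "LSNs" are made up for illustration, and the real bookkeeping in nextval() and RecordTransactionCommit() is more involved. It only captures the two facts above: a sequence writes WAL once per batch of 32 values, and a commit waits only if the committing transaction itself wrote WAL.

```python
SEQ_LOG_VALS = 32  # number of values pre-logged by one sequence WAL record


class ToyBackend:
    """Hypothetical, heavily simplified model of one backend and one sequence."""

    def __init__(self):
        self.wal_end = 0            # stand-in for the end-of-WAL LSN
        self.last_value = 0         # last value handed out by the sequence
        self.logged_upto = 0        # highest value already covered by WAL
        self.xact_last_rec_end = 0  # last WAL written by the current xact (0 = none)

    def begin(self):
        self.xact_last_rec_end = 0

    def nextval(self):
        self.last_value += 1
        if self.last_value > self.logged_upto:
            # Pre-logged range exhausted: write one WAL record covering
            # this value plus the next SEQ_LOG_VALS values.
            self.wal_end += 1
            self.logged_upto = self.last_value + SEQ_LOG_VALS
            self.xact_last_rec_end = self.wal_end
        return self.last_value

    def rollback(self):
        # An aborted transaction never waits for the sync replica.
        self.xact_last_rec_end = 0

    def commit(self):
        """Return True if this commit would wait for the sync replica."""
        waits = self.xact_last_rec_end != 0
        self.xact_last_rec_end = 0
        return waits


def reproduce():
    """Replay the scenario from the SQL example above."""
    b = ToyBackend()
    b.begin()
    for _ in range(50):
        b.nextval()
    b.rollback()
    b.begin()
    value = b.nextval()
    return value, b.commit()
```

In this model reproduce() hands out a new value from the sequence, but the second transaction writes no WAL of its own, so its commit has nothing to wait for.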
I'm not sure this is a clear data corruption bug, but it surely walks 
and quacks like one. My proposal is to fix this by tracking the LSN of 
the last sequence increment, and then checking that LSN in 
RecordTransactionCommit() before calling XLogFlush().
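A rough sketch of that idea, again as a toy Python model rather than actual PostgreSQL code. Everything here is an assumption made for illustration: the class ToyBackendFixed, the field names last_seq_lsn and xact_wait_lsn, and the integer "LSNs". The point is only that nextval() remembers the LSN of the WAL record covering the value it returned, even when it wrote no WAL itself, and the commit then waits for that LSN.

```python
SEQ_LOG_VALS = 32  # number of values pre-logged by one sequence WAL record


class ToyBackendFixed:
    """Hypothetical model of the proposed fix (all names are illustrative)."""

    def __init__(self):
        self.wal_end = 0        # stand-in for the end-of-WAL LSN
        self.last_value = 0     # last value handed out by the sequence
        self.logged_upto = 0    # highest value already covered by WAL
        self.last_seq_lsn = 0   # LSN of the last sequence WAL record
        self.xact_wait_lsn = 0  # LSN the current xact must wait for (0 = none)

    def begin(self):
        self.xact_wait_lsn = 0

    def nextval(self):
        self.last_value += 1
        if self.last_value > self.logged_upto:
            self.wal_end += 1
            self.logged_upto = self.last_value + SEQ_LOG_VALS
            self.last_seq_lsn = self.wal_end
        # Remember the covering record even if this call wrote no WAL,
        # for example because an aborted transaction wrote it earlier.
        self.xact_wait_lsn = max(self.xact_wait_lsn, self.last_seq_lsn)
        return self.last_value

    def rollback(self):
        self.xact_wait_lsn = 0

    def commit(self):
        """Return the LSN this commit must wait for, or 0 for no wait."""
        wait_lsn = self.xact_wait_lsn
        self.xact_wait_lsn = 0
        return wait_lsn


def replay_with_fix():
    """Same scenario as the SQL example, but with the tracked LSN."""
    b = ToyBackendFixed()
    b.begin()
    for _ in range(50):
        b.nextval()
    b.rollback()
    b.begin()
    b.nextval()
    return b.commit()
```

With the tracked LSN the second transaction's commit returns a nonzero LSN to wait for, so in the scenario above it would block until the sync replica confirms that record.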
regards
[1] 
https://www.postgresql.org/message-id/ae3cab67-c31e-b527-dd73-08f196999ad4%40enterprisedb.com
-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
		