Re: logical decoding and replication of sequences, take 2
От | Tomas Vondra |
---|---|
Тема | Re: logical decoding and replication of sequences, take 2 |
Дата | |
Msg-id | 6e3a51b7-7a74-50ad-afeb-90f55a229284@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: logical decoding and replication of sequences, take 2 (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: logical decoding and replication of sequences, take 2
(Amit Kapila <amit.kapila16@gmail.com>)
|
Список | pgsql-hackers |
On 7/24/23 12:40, Amit Kapila wrote: > On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: >> >> 0005, 0006 and 0007 are all related to the initial sequence sync. [3] >> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to >> be reviewed in this response. >> >> I followed the discussion starting [1] till [2]. The second one >> mentions the interlock mechanism which has been implemented in 0005 >> and 0006. While I don't have an objection to allowing LOCKing a >> sequence using the LOCK command, I am not sure whether it will >> actually work or is even needed. >> >> The problem described in [1] seems to be the same as the problem >> described in [2]. In both cases we see the sequence moving backwards >> during CATCHUP. At the end of catchup the sequence is in the right >> state in both the cases. >> > > I think we could see backward sequence value even after the catchup > phase (after the sync worker is exited and or the state of rel is > marked as 'ready' in pg_subscription_rel). The point is that there is > no guarantee that we will process all the pending WAL before > considering the sequence state is 'SYNCDONE' and or 'READY'. For > example, after copy_sequence, I see values like: > > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 165 | 0 | t > (1 row) > postgres=# select nextval('s'); > nextval > --------- > 166 > (1 row) > postgres=# select nextval('s'); > nextval > --------- > 167 > (1 row) > postgres=# select currval('s'); > currval > --------- > 167 > (1 row) > > Then during the catchup phase: > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 33 | 0 | t > (1 row) > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 66 | 0 | t > (1 row) > > postgres=# select * from pg_subscription_rel; > srsubid | srrelid | srsubstate | srsublsn > ---------+---------+------------+----------- > 16394 | 16390 | r | 0/16374E8 > 16394 | 16393 | s | 0/1637700 > (2 rows) > > postgres=# select * from pg_subscription_rel; > srsubid | srrelid | srsubstate | srsublsn > ---------+---------+------------+----------- > 16394 | 16390 | r | 0/16374E8 > 16394 | 16393 | r | 0/1637700 > (2 rows) > > Here Sequence relid id 16393. You can see sequence state is marked as ready. > > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 66 | 0 | t > (1 row) > > Even after that, see below the value of the sequence is still not > caught up. Later, when the apply worker processes all the WAL, the > sequence state will be caught up. > > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 165 | 0 | t > (1 row) > > So, there will be a window where the sequence won't be caught up for a > certain period of time and any usage of it (even after the sync is > finished) during that time could result in inconsistent behaviour. > I'm rather confused about which node these queries are executed on. Presumably some of it is on publisher, some on subscriber? Can you create a reproducer (TAP test demonstrating this?) I guess it might require adding some sleeps to hit the right timing ... > The other question is whether it is okay to allow the sequence to go > backwards even during the initial sync phase? The reason I am asking > this question is that for the time sequence value moves backwards, one > is allowed to use it on the subscriber which will result in using > out-of-sequence values. For example, immediately, after copy_sequence > the values look like this: > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 133 | 32 | t > (1 row) > postgres=# select nextval('s'); > nextval > --------- > 134 > (1 row) > postgres=# select currval('s'); > currval > --------- > 134 > (1 row) > > But then during the sync phase, it can go backwards and one is allowed > to use it on the subscriber: > postgres=# select * from s; > last_value | log_cnt | is_called > ------------+---------+----------- > 66 | 0 | t > (1 row) > postgres=# select nextval('s'); > nextval > --------- > 67 > (1 row) > Well, as for going back during the sync phase, I think the agreement was that's acceptable, as we don't make guarantees about that. The question is what's the state at the end of the sync (which I think leads to the first part of your message). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Amit KapilaДата:
Сообщение: Re: logical decoding and replication of sequences, take 2
Следующее
От: Tomas VondraДата:
Сообщение: Re: logical decoding and replication of sequences, take 2