Re: logical decoding and replication of sequences, take 2

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: logical decoding and replication of sequences, take 2
Дата
Msg-id 6e3a51b7-7a74-50ad-afeb-90f55a229284@enterprisedb.com
обсуждение исходный текст
Ответ на Re: logical decoding and replication of sequences, take 2  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: logical decoding and replication of sequences, take 2  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers

On 7/24/23 12:40, Amit Kapila wrote:
> On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> 0005, 0006 and 0007 are all related to the initial sequence sync. [3]
>> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to
>> be reviewed in this response.
>>
>> I followed the discussion starting [1] till [2]. The second one
>> mentions the interlock mechanism which has been implemented in 0005
>> and 0006. While I don't have an objection to allowing LOCKing a
>> sequence using the LOCK command, I am not sure whether it will
>> actually work or is even needed.
>>
>> The problem described in [1] seems to be the same as the problem
>> described in [2]. In both cases we see the sequence moving backwards
>> during CATCHUP. At the end of catchup the sequence is in the right
>> state in both the cases.
>>
> 
> I think we could see backward sequence value even after the catchup
> phase (after the sync worker is exited and or the state of rel is
> marked as 'ready' in pg_subscription_rel). The point is that there is
> no guarantee that we will process all the pending WAL before
> considering the sequence state is 'SYNCDONE' and or 'READY'. For
> example, after copy_sequence, I see values like:
> 
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>         165 |       0 | t
> (1 row)
> postgres=# select nextval('s');
>  nextval
> ---------
>      166
> (1 row)
> postgres=# select nextval('s');
>  nextval
> ---------
>      167
> (1 row)
> postgres=# select currval('s');
>  currval
> ---------
>      167
> (1 row)
> 
> Then during the catchup phase:
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>          33 |       0 | t
> (1 row)
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>          66 |       0 | t
> (1 row)
> 
> postgres=# select * from pg_subscription_rel;
>  srsubid | srrelid | srsubstate | srsublsn
> ---------+---------+------------+-----------
>    16394 |   16390 | r          | 0/16374E8
>    16394 |   16393 | s          | 0/1637700
> (2 rows)
> 
> postgres=# select * from pg_subscription_rel;
>  srsubid | srrelid | srsubstate | srsublsn
> ---------+---------+------------+-----------
>    16394 |   16390 | r          | 0/16374E8
>    16394 |   16393 | r          | 0/1637700
> (2 rows)
> 
> Here Sequence relid id 16393. You can see sequence state is marked as ready.
> 
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>          66 |       0 | t
> (1 row)
> 
> Even after that, see below the value of the sequence is still not
> caught up. Later, when the apply worker processes all the WAL, the
> sequence state will be caught up.
> 
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>         165 |       0 | t
> (1 row)
> 
> So, there will be a window where the sequence won't be caught up for a
> certain period of time and any usage of it (even after the sync is
> finished) during that time could result in inconsistent behaviour.
> 

I'm rather confused about which node these queries are executed on.
Presumably some of it is on publisher, some on subscriber?

Can you create a reproducer (TAP test demonstrating this?) I guess it
might require adding some sleeps to hit the right timing ...

> The other question is whether it is okay to allow the sequence to go
> backwards even during the initial sync phase? The reason I am asking
> this question is that for the time sequence value moves backwards, one
> is allowed to use it on the subscriber which will result in using
> out-of-sequence values. For example, immediately, after copy_sequence
> the values look like this:
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>         133 |      32 | t
> (1 row)
> postgres=# select nextval('s');
>  nextval
> ---------
>      134
> (1 row)
> postgres=# select currval('s');
>  currval
> ---------
>      134
> (1 row)
> 
> But then during the sync phase, it can go backwards and one is allowed
> to use it on the subscriber:
> postgres=# select * from s;
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>          66 |       0 | t
> (1 row)
> postgres=# select nextval('s');
>  nextval
> ---------
>       67
> (1 row)
> 

Well, as for going back during the sync phase, I think the agreement was
that's acceptable, as we don't make guarantees about that. The question
is what's the state at the end of the sync (which I think leads to the
first part of your message).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2