Re: logical decoding and replication of sequences, take 2

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: logical decoding and replication of sequences, take 2
Дата
Msg-id 428af10b-64bc-5338-f9ec-b27054d70b40@enterprisedb.com
обсуждение исходный текст
Ответ на Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Ответы Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers

On 7/14/23 08:51, Ashutosh Bapat wrote:
> On Thu, Jul 13, 2023 at 8:29 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 6/23/23 15:18, Ashutosh Bapat wrote:
>>> ...
>>>
>>> I reviewed 0001 and related parts of 0004 and 0008 in detail.
>>>
>>> I have only one major change request, about
>>> typedef struct xl_seq_rec
>>> {
>>> RelFileLocator locator;
>>> + bool created; /* creates a new relfilenode (CREATE/ALTER) */
>>>
>>> I am not sure what are the repercussions of adding a member to an existing WAL
>>> record. I didn't see any code which handles the old WAL format which doesn't
>>> contain the "created" flag. IIUC, the logical decoding may come across
>>> a WAL record written in the old format after upgrade and restart. Is
>>> that not possible?
>>>
>>
>> I don't understand why would adding a new field to xl_seq_rec be an
>> issue, considering it's done in a new major version. Sure, if you
>> generate WAL with old build, and start with a patched version, that
>> would break things. But that's true for many other patches, and it's
>> irrelevant for releases.
> 
> There are two issues
> 1. the name of the field "created" - what does created mean in a
> "sequence status" WAL record? Consider following sequence of events
> Begin;
> Create sequence ('s');
> select nextval('s') from generate_series(1, 1000);
> 
> ...
> commit
> 
> This is going to create 1000/32 WAL records with "created" = true. But
> only the first one created the relfilenode. We might fix this little
> annoyance by changing the name to "transactional".
> 

I don't think that's true - this will create 1 record with
"created=true" (the one right after the CREATE SEQUENCE) and the rest
will have "created=false".

I realized I haven't modified seq_desc to show this flag, so I did that
in the updated patch version, which makes this easy to see.

And all of them need to be handled in a transactional way, because they
modify relfilenode visible only to that transaction. So calling the flag
"transactional" would be misleading, because the increments can be
transactional even with "created=false".


> 2. Consider following scenario
> v15 running logical decoding has restart_lsn before a "sequence
> change" WAL record written in old format
> stop the server
> upgrade to v16
> logical decoding will stat from restart_lsn pointing to a WAL record
> written by v15. When it tries to read "sequence change" WAL record it
> won't be able to get "created" flag.
> 
> Am I missing something here?
> 

You're missing the fact that pg_upgrade does not copy replication slots,
so the restart_lsn does not matter.

(Yes, this is pretty annoying consequence of using pg_upgrade. And maybe
we'll improve that in the future - but I'm pretty sure we won't allow
decoding old WAL.)

>>
>>> But I don't think it's necessary. We can add a
>>> decoding routine for RM_SMGR_ID. The decoding routine will add relfilelocator
>>> in XLOG_SMGR_CREATE record to txn->sequences hash. Rest of the logic will work
>>> as is. Of course we will add non-sequence relfilelocators as well but that
>>> should be fine. Creating a new relfilelocator shouldn't be a frequent
>>> operation. If at all we are worried about that, we can add only the
>>> relfilenodes associated with sequences to the hash table.
>>>
>>
>> Hmmmm, that might work. I feel a bit uneasy about having to keep all
>> relfilenodes, not just sequences ...
> 
> From relfilenode it should be easy to get to rel and then see if it's
> a sequence. Only add relfilenodes for the sequence.
> 

Will try.

Attached is an updated version with pg_waldump printing the "created"
flag in seq_desc, and removing the unnecessary interlock. I've kept the
protocol changes in a separate commit for now.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Support logical replication of DDLs
Следующее
От: Yugo NAGATA
Дата:
Сообщение: Re: pgbnech: allow to cancel queries during benchmark