Re: logical decoding and replication of sequences, take 2

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: logical decoding and replication of sequences, take 2
Дата
Msg-id CAExHW5tdDBKBjpBAbcVeNjTU=QqqOvZ==omR-SNov5KOuF2S5Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: logical decoding and replication of sequences, take 2  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Mon, Jun 26, 2023 at 8:35 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
>
> On 6/26/23 15:18, Ashutosh Bapat wrote:
> > This is review of 0003 patch. Overall the patch looks good and helps
> > understand the decoding logic better.
> >
> > +                                          data
> > +----------------------------------------------------------------------------------------
> > + BEGIN
> > + sequence public.test_sequence: transactional:1 last_value: 1
> > log_cnt: 0 is_called:0
> > + COMMIT
> >
> > Looking at this output, I am wondering how would this patch work with DDL
> > replication. I should have noticed this earlier, sorry. A sequence DDL has two
> > parts, changes to the catalogs and changes to the data file. Support for
> > replicating the data file changes is added by these patches. The catalog
> > changes will need to be supported by DDL replication patch. When applying the
> > DDL changes, there are two ways 1. just apply the catalog changes and let the
> > support added here apply the data changes. 2. Apply both the changes. If the
> > second route is chosen, all the "transactional" decoding and application
> > support added by this patch will need to be ripped out. That will make the
> > "transactional" field in the protocol will become useless. It has potential to
> > be waste bandwidth in future.
> >
>
> I don't understand why would it need to be ripped out. Why would it make
> the transactional behavior useless? Can you explain?
>
> IMHO we replicate either changes (and then DDL replication does not
> interfere with that), or DDL (and then this patch should not interfere).
>
> > OTOH, I feel that waiting for the DDL repliation patch set to be commtted will
> > cause this patchset to be delayed for an unknown duration. That's undesirable
> > too.
> >
> > One solution I see is to use Storage RMID WAL again. While decoding it we send
> > a message to the subscriber telling it that a new relfilenode is being
> > allocated to a sequence. The subscriber too then allocates new relfilenode to
> > the sequence. The sequence data changes are decoded without "transactional"
> > flag; but they are decoded as transactional or non-transactional using the same
> > logic as the current patch-set. The subscriber will always apply these changes
> > to the reflilenode associated with the sequence at that point in time. This
> > would have the same effect as the current patch-set. But then there is
> > potential that the DDL replication patchset will render the Storage decoding
> > useless. So not an option. But anyway, I will leave this as a comment as an
> > alternative thought and discarded. Also this might trigger a better idea.
> >
> > What do you think?
> >
>
>
> I don't understand what the problem with DDL is, so I can't judge how
> this is supposed to solve it.

I have not looked at the DDL replication patch in detail so I may be
missing something. IIUC, that patch replicates the DDL statement in
some form: parse tree or statement. But it doesn't replicate the some
or all WAL records that the DDL execution generates.

Consider DDL "ALTER SEQUENCE test_sequence RESTART WITH 4000;". It
updates the catalogs with a new relfilenode and also the START VALUE.
It also writes to the new relfilenode. When publisher replicates the
DDL and the subscriber applies it, it will do the same - update the
catalogs and write to new relfilenode. We don't want the sequence data
to be replicated again when it's changed by a DDL. All the
transactional changes are associated with a DDL. Other changes to the
data sequence are non-transactional. So when replicating the sequence
data changes, "transactional" field becomes useless. What I am
pointing to is: if we add "transactional" field in the protocol today
and in future DDL replication is implemented in a way that
"transactional" field becomes redundant, we have introduced a
redundant field which will eat a byte on wire.  Of course we can
remove it by bumping protocol version, but that's some work.

Please note we will still need the code to determine whether a change
in sequence data is transactional or not IOW whether it's associated
with DDL or not. So that code remains.

> >
> >          }
> > +        else if (strcmp(elem->defname, "include-sequences") == 0)
> > +        {
> > +
> > +            if (elem->arg == NULL)
> > +                data->include_sequences = false;
> >
> > By default inlclude_sequences = true. Shouldn't then it be set to true here?
> >
>
> I don't follow. Is this still related to the DDL replication, or are you
> describing some new issue with savepoints?

Not related to DDL replication. Not an issue with savepoints either.
Just a comment about that particular change. So for not being clear.

--
Best Wishes,
Ashutosh Bapat



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [PATCH] Honor PG_TEST_NOCLEAN for tempdirs
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: [PATCH] Honor PG_TEST_NOCLEAN for tempdirs