Re: Proposal: Conflict log history table for Logical Replication

From shveta malik
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAJpy0uDKbYWt+YPADj=4fHEvrGEWgnG1n_YsiGT_EZiZf0VSAw@mail.gmail.com
In reply to Re: Proposal: Conflict log history table for Logical Replication  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Proposal: Conflict log history table for Logical Replication
List pgsql-hackers
On Fri, Sep 26, 2025 at 4:42 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:19 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 11:53 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > > [1]
> > > > /*
> > > > * For logical decode we need combo CIDs to properly decode the
> > > > * catalog
> > > > */
> > > > if (RelationIsAccessibleInLogicalDecoding(relation))
> > > > log_heap_new_cid(relation, &tp);
> > > >
> > >
> > > Meanwhile I am also exploring the option where we can just CREATE TYPE
> > > in initialize_data_directory() during initdb, basically we will create
> > > this type in template1 so that it will be available in all the
> > > databases, and that would simplify the table creation whether we
> > > create internally or we allow user to create it.  And while checking
> > > is_publishable_class we can check the type and avoid publishing those
> > > tables.
> > >
> >
> > Based on my off list discussion with Amit, one option could be to set
> > HEAP_INSERT_NO_LOGICAL option while inserting tuple into conflict
> > history table, for that we can not use SPI interface to insert instead
> > we will have to directly call the heap_insert() to add this option.
> > Since we do not want to create any trigger etc on this table, direct
> > insert should be fine, but if we plan to create this table as
> > partitioned table in future then direct heap insert might not work.
>
> Upon further reflection, I realized that while this approach avoids
> streaming inserts to the conflict log history table, it still requires
> that table to exist on the subscriber node upon subscription creation,
> which isn't ideal.
>
> We have two main options to address this:
>
> Option1:
> When calling pg_get_publication_tables(), if the 'alltables' option is
> used, we can scan all subscriptions and explicitly ignore (filter out)
> all conflict history tables.  This will not be very costly as this
> will scan the subscriber when pg_get_publication_tables() is called,
> which is only called during create subscription/alter subscription on
> the remote node.
>
> Option2:
> Alternatively, we could introduce a table creation option, like a
> 'non-publishable' flag, to prevent a table from being streamed
> entirely. I believe this would be a valuable, independent feature for
> users who want to create certain tables without including them in
> logical replication.
>
> I prefer option2, as I feel this can add value independent of this patch.
>

I agree that marking tables with a flag, so that they can be easily
excluded during publishing, would be cleaner. In the current patch, for
an ALL-TABLES publication, we scan pg_subscription for each table in
pg_class to check whether it is some subscription's subconflicttable
and decide whether to ignore it. But since this only happens during
create/alter subscription and refresh publication, the overhead should
be acceptable.
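
To make the cost of that check concrete, here is a rough standalone
sketch of the idea. It is not the patch code: SubscriptionRow,
is_conflict_log_table() and filter_publishable() are hypothetical
names, and plain arrays stand in for the pg_class and pg_subscription
scans. It only shows that the work is O(tables * subscriptions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid */

/* Hypothetical, simplified view of one pg_subscription row. */
typedef struct SubscriptionRow
{
	Oid			subid;
	Oid			subconflicttable;	/* 0 means no conflict log table */
} SubscriptionRow;

/* True if relid is the conflict log table of any subscription. */
bool
is_conflict_log_table(Oid relid, const SubscriptionRow *subs, size_t nsubs)
{
	for (size_t i = 0; i < nsubs; i++)
	{
		if (subs[i].subconflicttable != 0 &&
			subs[i].subconflicttable == relid)
			return true;
	}
	return false;
}

/*
 * Copy every relid that is not a conflict log table into out[] and
 * return how many were kept.  out[] must have room for nrels entries.
 */
size_t
filter_publishable(const Oid *relids, size_t nrels,
				   const SubscriptionRow *subs, size_t nsubs,
				   Oid *out)
{
	size_t		nkept = 0;

	assert(out != NULL);

	for (size_t i = 0; i < nrels; i++)
	{
		if (!is_conflict_log_table(relids[i], subs, nsubs))
			out[nkept++] = relids[i];
	}
	return nkept;
}
```

With an internal flag on the table itself, the inner loop over
subscriptions would presumably reduce to a single per-relation check.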

Introducing a ‘NON_PUBLISHABLE_TABLE’ option would be a good
enhancement, but since we already have the EXCEPT list built in a
separate thread, that might be sufficient for now. IMO, such
conflict tables should be marked internally (for example, with a
‘non_publishable’ or ‘conflict_log_table’ flag) so they can be easily
identified within the system, without requiring users to explicitly
specify them in EXCEPT or as NON_PUBLISHABLE_TABLE. I would like to
see what others think about this.

For the time being, the current implementation looks fine, considering
it runs only during a few publication-related DDL operations.

thanks
Shveta


