Обсуждение: Re: Restrict publishing of partitioned table with a foreign table as partition

Поиск
Список
Период
Сортировка

Re: Restrict publishing of partitioned table with a foreign table as partition

От
Álvaro Herrera
Дата:
Hello,

I think reimplementing list_member_oid() under a different name
(is_ancestor_member_relids) is pointless and should not be done.

It also appears to me that we haven't nailed the error messages just
yet.  I tried to fix it upthread, but didn't really get it correct.  For
instance, consider this case taken from the new regression tests:

/* test setup */
CREATE SCHEMA sch3;
CREATE TABLE sch3.tmain(id int) PARTITION BY RANGE(id);
CREATE TABLE sch3.part1 PARTITION OF sch3.tmain FOR VALUES FROM (0) TO (5);
CREATE TABLE sch3.part2(id int) PARTITION BY RANGE(id);
CREATE FOREIGN TABLE sch3.part2_1 PARTITION OF sch3.part2 FOR VALUES FROM (5) TO (10) SERVER fdw_server;
ALTER TABLE sch3.tmain ATTACH PARTITION sch3.part2 FOR VALUES FROM (5) TO (10);

The issue occurs in the next command:

 CREATE PUBLICATION pub1 FOR TABLE sch3.tmain WITH (publish_via_partition_root);
 ERROR:  cannot add partitioned table "tmain" to publication "pub1", which has "publish_via_partition_root" set to
"true"
 DETAIL:  The table contains a partition that's a foreign table.

The user is trying to create a publication, so an error saying that it's
not possible to add a partition is a bit off-point.  (If you squint hard
enough it sort of makes sense, but it seems inconsistent enough.)
We could replace the word "add" with the word "include" there and this
particular problem would be fixed, perhaps.  However, the next one is
worse:

  CREATE PUBLICATION pub1 FOR ALL TABLES WITH (publish_via_partition_root);
  ERROR:  cannot set parameter "publish_via_partition_root" to "true" for publication "pub1"
  DETAIL:  Published partitioned table "tmain" contains a partition that is a foreign table.

Here we again try to create a publication, but we get an error that we
cannot set the parameter, which is even more detached from reality.
Maybe what we need is to add another parameter to that function to
indicate which operation is being executed (create publication, add
schema, change the parameter), which is used to indicate the error
message to throw.

With that thought in mind, I think it would make things simpler for us
to remove the duplicate copies of the same error messages by adding yet
another function, after publication_check_foreign_parts(), which receives
the operation being executed and whatever additional parameter it needs,
and throws the error.  If publication_check_foreign_parts() also
receives the operation being executed, it can simply pass it through to
that new function when an error needs to be thrown; other places that
hardcode error situations would pass a constant for the operation.


But the non-idiomatic locking of pg_partitioned_table appears to
continue to be the pain point of this patch.  My impression is that
using a lock is the wrong approach to solve the concurrency problem.
Maybe we can use a ConditionVariable instead somehow.  (The real trick
here is to figure out exactly _how_ to use the CV, I mean what exactly
is the condition that the CV sleeps on?)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees."                                      (E. Dijkstra)



Re: Restrict publishing of partitioned table with a foreign table as partition

От
Amit Kapila
Дата:
On Sun, May 11, 2025 at 6:53 AM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> But the non-idiomatic locking of pg_partitioned_table appears to
> continue to be the pain point of this patch.  My impression is that
> using a lock is the wrong approach to solve the concurrency problem.
> Maybe we can use a ConditionVariable instead somehow.  (The real trick
> here is to figure out exactly _how_ to use the CV, I mean what exactly
> is the condition that the CV sleeps on?)
>

Can we fix this problem by having a check at the time of
CREATESUBSCRIPTION such that, if copy-data is true, then we ensure
that the specified publishers don't have a foreign table? We have a
somewhat similar check for publications in the function
check_publications_origin(), though for a different purpose. Along
with it, we can still have a check during foreign table
creation/attach that it doesn't become part of some publication, as
the patch may have it now.

--
With Regards,
Amit Kapila.



Re: Restrict publishing of partitioned table with a foreign table as partition

От
Ajin Cherian
Дата:
On Tue, May 20, 2025 at 2:33 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> This approach seems better to me. I have created a patch with the
> above approach.
>
> Thanks and Regards,
> Shlok Kyal

Some quick comments on the patch:
1. In doc/src/sgml/ref/create_subscription.sgml:
+    has partitioned table with foreign table as partition. If this scenario is
+    detected we ERROR is logged to the user.
+   </para>
+

Should be: "If this scenario is detected an ERROR is logged to the
user." (remove "we").

In src/backend/commands/subscriptioncmds.c:
2. The comment header:
+ * This function is in charge of detecting if publisher with
+ * publish_via_partition_root=true publishes a partitioned table that has a
+ * foreign table as a partition.

Add "and throw an error if found" at the end of that sentence to
correctly describe what the function does.

3.
+    appendStringInfoString(&cmd,
+                           "SELECT DISTINCT P.pubname AS pubname "
+                           "from pg_catalog.pg_publication p, LATERAL "
+                           "pg_get_publication_tables(p.pubname) gpt, LATERAL "
+                           "pg_partition_tree(gpt.relid) gt JOIN
pg_catalog.pg_foreign_table ft ON "
+                           "ft.ftrelid = gt.relid WHERE p.pubviaroot
= true AND  p.pubname IN (");

use FROM rather than from to maintain SQL style consistency.

4.
+                errdetail_plural("The subscription being created on a
publication (%s) with publish_via_root_partition = true and contains
partitioned tables with foreign table as partition ",
+                                 "The subscription being created on
publications (%s) with publish_via_root_partition = true and contains
partitioned tables with foreign table as partition ",
+                                 list_length(publist), pubnames->data),

I think you meant "publish_via_partition_root" here and not
"publish_via_root_partition ".

regards,
Ajin Cherian
Fujitsu Australia



Re: Restrict publishing of partitioned table with a foreign table as partition

От
Shlok Kyal
Дата:
On Wed, 4 Jun 2025 at 16:12, Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Tue, May 20, 2025 at 2:33 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> > This approach seems better to me. I have created a patch with the
> > above approach.
> >
> > Thanks and Regards,
> > Shlok Kyal
>
> Some quick comments on the patch:
> 1. In doc/src/sgml/ref/create_subscription.sgml:
> +    has partitioned table with foreign table as partition. If this scenario is
> +    detected we ERROR is logged to the user.
> +   </para>
> +
>
> Should be: "If this scenario is detected an ERROR is logged to the
> user." (remove "we").
>
> In src/backend/commands/subscriptioncmds.c:
> 2. The comment header:
> + * This function is in charge of detecting if publisher with
> + * publish_via_partition_root=true publishes a partitioned table that has a
> + * foreign table as a partition.
>
> Add "and throw an error if found" at the end of that sentence to
> correctly describe what the function does.
>
> 3.
> +    appendStringInfoString(&cmd,
> +                           "SELECT DISTINCT P.pubname AS pubname "
> +                           "from pg_catalog.pg_publication p, LATERAL "
> +                           "pg_get_publication_tables(p.pubname) gpt, LATERAL "
> +                           "pg_partition_tree(gpt.relid) gt JOIN
> pg_catalog.pg_foreign_table ft ON "
> +                           "ft.ftrelid = gt.relid WHERE p.pubviaroot
> = true AND  p.pubname IN (");
>
> use FROM rather than from to maintain SQL style consistency.
>
> 4.
> +                errdetail_plural("The subscription being created on a
> publication (%s) with publish_via_root_partition = true and contains
> partitioned tables with foreign table as partition ",
> +                                 "The subscription being created on
> publications (%s) with publish_via_root_partition = true and contains
> partitioned tables with foreign table as partition ",
> +                                 list_length(publist), pubnames->data),
>
> I think you meant "publish_via_partition_root" here and not
> "publish_via_root_partition ".
>

I have addressed all the comments and attached the updated patch.

Thanks and Regards,
Shlok Kyal

Вложения

Re: Restrict publishing of partitioned table with a foreign table as partition

От
Ajin Cherian
Дата:
On Mon, Jun 9, 2025 at 3:58 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Wed, 4 Jun 2025 at 16:12, Ajin Cherian <itsajin@gmail.com> wrote:
> >
> > On Tue, May 20, 2025 at 2:33 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > > This approach seems better to me. I have created a patch with the
> > > above approach.
> > >
> > > Thanks and Regards,
> > > Shlok Kyal
> >
> > Some quick comments on the patch:
> > 1. In doc/src/sgml/ref/create_subscription.sgml:
> > +    has partitioned table with foreign table as partition. If this scenario is
> > +    detected we ERROR is logged to the user.
> > +   </para>
> > +
> >
> > Should be: "If this scenario is detected an ERROR is logged to the
> > user." (remove "we").
> >
> > In src/backend/commands/subscriptioncmds.c:
> > 2. The comment header:
> > + * This function is in charge of detecting if publisher with
> > + * publish_via_partition_root=true publishes a partitioned table that has a
> > + * foreign table as a partition.
> >
> > Add "and throw an error if found" at the end of that sentence to
> > correctly describe what the function does.
> >
> > 3.
> > +    appendStringInfoString(&cmd,
> > +                           "SELECT DISTINCT P.pubname AS pubname "
> > +                           "from pg_catalog.pg_publication p, LATERAL "
> > +                           "pg_get_publication_tables(p.pubname) gpt, LATERAL "
> > +                           "pg_partition_tree(gpt.relid) gt JOIN
> > pg_catalog.pg_foreign_table ft ON "
> > +                           "ft.ftrelid = gt.relid WHERE p.pubviaroot
> > = true AND  p.pubname IN (");
> >
> > use FROM rather than from to maintain SQL style consistency.
> >
> > 4.
> > +                errdetail_plural("The subscription being created on a
> > publication (%s) with publish_via_root_partition = true and contains
> > partitioned tables with foreign table as partition ",
> > +                                 "The subscription being created on
> > publications (%s) with publish_via_root_partition = true and contains
> > partitioned tables with foreign table as partition ",
> > +                                 list_length(publist), pubnames->data),
> >
> > I think you meant "publish_via_partition_root" here and not
> > "publish_via_root_partition ".
> >
>
> I have addressed all the comments and attached the updated patch.


Hi Shlok,

Some more comments:
1.
+     Replication is not supported for foreign tables.  When used as partitions
+     of partitioned tables, publishing of the partitioned table is only allowed
+     if the <literal>publish_via_partition_root</literal> is set to
+     <literal>false</literal>.

In patch: "When used as partitions of partitioned tables, publishing
of the partitioned table is only allowed if the
<literal>publish_via_partition_root</literal> is set to
<literal>false</literal>.

Change to: When foreign tables are used as partitions of partitioned
tables, publishing of the partitioned table is only allowed if the
<literal>publish_via_partition_root</literal> is set to
<literal>false</literal>.

2.
+   <para>
+    When using a subscription parameter copy_data = true, corresponding
+    publications are checked if it has publish_via_partition_root = true and
+    has partitioned table with foreign table as partition. If this scenario is
+    detected an ERROR is logged to the user.
+   </para>

In patch: "When using a subscription parameter copy_data = true, ..."
Change to: "When using the subscription parameter copy_data = true, ..."

In patch: "and has partitioned table with foreign table as partition."
Change to: "and has a partitioned table with a foreign table as its partition"

3.
+ * check_publications_foreign_parts
+ * Check if the publications, on which subscriber is subscribing, publishes any
+ * partitioned table that has an foreign table as its partition and has
+ * publish_via_partition_root set as true. The check is performed only if
+ * copy_data is set as true for the subscription.

In patch: "publishes any partitioned table that has an foreign table
as its partition"
Change to: "publishes any partitioned table that has a foreign table
as its partition"

4.
+ * DML data changes are not published for data in foreign tables,
+ * and yet the tablesync worker is not smart enough to omit data from
+ * foreign tables when they are partitions of partitioned tables.

change to:"Although DML changes to foreign tables are excluded from
publication, the tablesync worker will still attempt to copy data from
foreign table partitions during initial table synchronization."

5.
+ * When publish_via_partition_root is false, each partition published for
+ * replication is listed individually in pg_subscription_rel, and we
+ * don't add partitions that are foreign tables, so this check is not
+ * needed.

In patch: "so this check is not needed"
Change to: "so this function is not called for such tables"

6.
+                errdetail_plural("The subscription being created on a
publication (%s) with publish_via_partition_root = true and contains
partitioned tables with foreign table as partition ",
+                                 "The subscription being created on
publications (%s) with publish_via_partition_root = true and contains
partitioned tables with foreign table as partition ",
+                                 list_length(publist), pubnames->data),

Change to: "The subscription is for a publication (%s) with
publish_via_partition_root = true but one of the partitioned tables
has a foreign table as a partition"
"The subscription is for publications (%s) with
publish_via_partition_root = true but one of the partitioned tables
has a foreign table as a partition"

7.
+        /* capture all publications that include this relation directly */
+        puboids = list_concat(puboids, GetRelationPublications(rel->rd_id));
+        schemaid = RelationGetNamespace(rel);
+        puboids = list_concat(puboids, GetSchemaPublications(schemaid));
+
+        /* and do the same for its ancestors, if any */
+        ancestors = get_partition_ancestors(rel->rd_id);
+        foreach_oid(ancestor, ancestors)
+        {
+            puboids = list_concat(puboids, GetRelationPublications(ancestor));
+            schemaid = get_rel_namespace(ancestor);
+            puboids = list_concat(puboids, GetSchemaPublications(schemaid));
+        }
+
+        /* Now check the publish_via_partition_root bit for each of those */
+        list_sort(puboids, list_oid_cmp);
+        list_deduplicate_oid(puboids);
+        foreach_oid(puboid, puboids)

Why do we need to do all this logic for non partition foreign tables?
Just directly throw an error for those tables and do this extra logic
only for partitioned tables.


8.
+        pg_fatal("Your installation contains publications, which has
foreign table which are partitions of published table.\n"
+                 "A list of potentially-affected publications is in
the file:\n"

Change to: "Your installation contains publications, where one of the
partitioned tables has a foreign table as a partition.\n"

regards,
Ajin Cherian
Fujitsu Australia