Обсуждение: TRUNCATE on foreign tables

Поиск
Список
Период
Сортировка

TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
Hello,

We right now don't support TRUNCATE on foreign tables.
It may be a strange missing piece and restriction of operations.
For example, if a partitioned table contains some foreign tables in its leaf,
user cannot use TRUNCATE command to clean up the partitioned table.

Probably, API design is not complicated. We add a new callback for truncate
on the FdwRoutine, and ExecuteTruncateGuts() calls it if relation is foreign-
table. In case of postgres_fdw, it also issues "TRUNCATE" command on the
remote side in the transaction block [*1].

[*1] But I hope oracle_fdw does not follow this implementation as is. :-)

How about your thought?

I noticed this restriction when I'm working on Arrow_Fdw enhancement for
"writable" capability. Because Apache Arrow [*2] is a columnar file format,
it is not designed for UPDATE/DELETE, but capable to bulk-INSERT.
It is straightforward idea to support only INSERT, and clear data by TRUNCATE.

[*2] Apache Arrow - https://arrow.apache.org/docs/format/Columnar.html

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
Hello,

The attached patch adds TRUNCATE support on foreign table.

This patch adds an optional callback ExecForeignTruncate(Relation rel)
to FdwRoutine.
It is invoked during ExecuteTruncateGuts, then FDW driver hands over
the jobs related
to complete "truncate on the foreign table".
Of course, it is not clear to define the concept of "truncate" on some
FDW drivers.
In this case, TRUNCATE command prohibits to apply these foreign tables.

2019 is not finished at everywhere on the earth yet, so I believe it
is Ok to add this patch
to CF-2020:Jan.

Best regards,

2020年1月1日(水) 11:46 Kohei KaiGai <kaigai@heterodb.com>:
>
> Hello,
>
> We right now don't support TRUNCATE on foreign tables.
> It may be a strange missing piece and restriction of operations.
> For example, if a partitioned table contains some foreign tables in its leaf,
> user cannot use TRUNCATE command to clean up the partitioned table.
>
> Probably, API design is not complicated. We add a new callback for truncate
> on the FdwRoutine, and ExecuteTruncateGuts() calls it if relation is foreign-
> table. In case of postgres_fdw, it also issues "TRUNCATE" command on the
> remote side in the transaction block [*1].
>
> [*1] But I hope oracle_fdw does not follow this implementation as is. :-)
>
> How about your thought?
>
> I noticed this restriction when I'm working on Arrow_Fdw enhancement for
> "writable" capability. Because Apache Arrow [*2] is a columnar file format,
> it is not designed for UPDATE/DELETE, but capable to bulk-INSERT.
> It is straightforward idea to support only INSERT, and clear data by TRUNCATE.
>
> [*2] Apache Arrow - https://arrow.apache.org/docs/format/Columnar.html
>
> Best regards,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei <kaigai@heterodb.com>



--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>

Вложения

Re: TRUNCATE on foreign tables

От
Alvaro Herrera
Дата:
On 2020-Jan-01, Kohei KaiGai wrote:

> Hello,
> 
> The attached patch adds TRUNCATE support on foreign table.
> 
> This patch adds an optional callback ExecForeignTruncate(Relation rel)
> to FdwRoutine.
> It is invoked during ExecuteTruncateGuts, then FDW driver hands over
> the jobs related to complete "truncate on the foreign table".

I think this would need to preserve the notion of multi-table truncates.
Otherwise it won't be possible to truncate tables linked by FKs.  I
think this means the new entrypoint needs to receive a list of rels to
truncate, not just one.  (Maybe an alternative is to make it "please
truncate rel X, and be aware that relations Y,Z are also being
truncated at the same time".)

Looking at apache arrow documentation, it doesn't appear that it has
anything like FK constraints.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月2日(木) 12:16 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
> On 2020-Jan-01, Kohei KaiGai wrote:
>
> > Hello,
> >
> > The attached patch adds TRUNCATE support on foreign table.
> >
> > This patch adds an optional callback ExecForeignTruncate(Relation rel)
> > to FdwRoutine.
> > It is invoked during ExecuteTruncateGuts, then FDW driver hands over
> > the jobs related to complete "truncate on the foreign table".
>
> I think this would need to preserve the notion of multi-table truncates.
> Otherwise it won't be possible to truncate tables linked by FKs.  I
> think this means the new entrypoint needs to receive a list of rels to
> truncate, not just one.  (Maybe an alternative is to make it "please
> truncate rel X, and be aware that relations Y,Z are also being
> truncated at the same time".)
>
Please check at ExecuteTruncateGuts(). It makes a list of relations to be
truncated, including the relations that references the specified table by FK,
prior to invocation of the new FDW callback.
So, if multiple foreign tables are involved in a single TRUNCATE command,
this callback can be invoked multiple times.

> Looking at apache arrow documentation, it doesn't appear that it has
> anything like FK constraints.
>
Yes. It is just a bunch of columnar data.
In Apache Arrow, no constraint are defined except for "NOT NULL".
(In case when Field::nullable == false, all the values are considered
valid date.)

Thanks,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Alvaro Herrera
Дата:
On 2020-Jan-02, Kohei KaiGai wrote:

> 2020年1月2日(木) 12:16 Alvaro Herrera <alvherre@2ndquadrant.com>:
> >
> > I think this would need to preserve the notion of multi-table truncates.
> > Otherwise it won't be possible to truncate tables linked by FKs.  I
> > think this means the new entrypoint needs to receive a list of rels to
> > truncate, not just one.  (Maybe an alternative is to make it "please
> > truncate rel X, and be aware that relations Y,Z are also being
> > truncated at the same time".)
>
> Please check at ExecuteTruncateGuts(). It makes a list of relations to be
> truncated, including the relations that references the specified table by FK,
> prior to invocation of the new FDW callback.
> So, if multiple foreign tables are involved in a single TRUNCATE command,
> this callback can be invoked multiple times.

Yeah, that's my concern: if you have postgres_fdw tables linked by FKs
in the remote server, the truncate will fail because it'll try to
truncate them in separate commands instead of using a multi-table
truncate.

> > Looking at apache arrow documentation, it doesn't appear that it has
> > anything like FK constraints.
> >
> Yes. It is just a bunch of columnar data.
> In Apache Arrow, no constraint are defined except for "NOT NULL".
> (In case when Field::nullable == false, all the values are considered
> valid date.)

OK, I suppose that means there are no concerns such as what I mention
above.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月2日(木) 20:56 Alvaro Herrera <alvherre@2ndquadrant.com>:
>
> On 2020-Jan-02, Kohei KaiGai wrote:
>
> > 2020年1月2日(木) 12:16 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > >
> > > I think this would need to preserve the notion of multi-table truncates.
> > > Otherwise it won't be possible to truncate tables linked by FKs.  I
> > > think this means the new entrypoint needs to receive a list of rels to
> > > truncate, not just one.  (Maybe an alternative is to make it "please
> > > truncate rel X, and be aware that relations Y,Z are also being
> > > truncated at the same time".)
> >
> > Please check at ExecuteTruncateGuts(). It makes a list of relations to be
> > truncated, including the relations that references the specified table by FK,
> > prior to invocation of the new FDW callback.
> > So, if multiple foreign tables are involved in a single TRUNCATE command,
> > this callback can be invoked multiple times.
>
> Yeah, that's my concern: if you have postgres_fdw tables linked by FKs
> in the remote server, the truncate will fail because it'll try to
> truncate them in separate commands instead of using a multi-table
> truncate.
>
Ah, it makes sense.
Probably, backend can make sub-list of the foreign tables to be
truncated for each
pair of FDW and Server, then invoke the FDW callback only once with this list.
FDW driver can issue multi-tables truncate on all the foreign tables
supplied, with
nothing difficult to do.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Stephen Frost
Дата:
Greetings,

* Kohei KaiGai (kaigai@heterodb.com) wrote:
> 2020年1月2日(木) 20:56 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > On 2020-Jan-02, Kohei KaiGai wrote:
> > > 2020年1月2日(木) 12:16 Alvaro Herrera <alvherre@2ndquadrant.com>:
> > > > I think this would need to preserve the notion of multi-table truncates.
> > > > Otherwise it won't be possible to truncate tables linked by FKs.  I
> > > > think this means the new entrypoint needs to receive a list of rels to
> > > > truncate, not just one.  (Maybe an alternative is to make it "please
> > > > truncate rel X, and be aware that relations Y,Z are also being
> > > > truncated at the same time".)
> > >
> > > Please check at ExecuteTruncateGuts(). It makes a list of relations to be
> > > truncated, including the relations that references the specified table by FK,
> > > prior to invocation of the new FDW callback.

Erm, sure it does, but we don't support having FKs on foreign tables
today, so that doesn't really help with this issue, does it?

> > > So, if multiple foreign tables are involved in a single TRUNCATE command,
> > > this callback can be invoked multiple times.
> >
> > Yeah, that's my concern: if you have postgres_fdw tables linked by FKs
> > in the remote server, the truncate will fail because it'll try to
> > truncate them in separate commands instead of using a multi-table
> > truncate.

I agree that the FDW callback should support multiple tables in the
TRUNCATE, but I think it also should include CASCADE as an option and
have that be passed on to the FDW to handle.

> Ah, it makes sense.
> Probably, backend can make sub-list of the foreign tables to be
> truncated for each
> pair of FDW and Server, then invoke the FDW callback only once with this list.
> FDW driver can issue multi-tables truncate on all the foreign tables
> supplied, with
> nothing difficult to do.

This doesn't really make sense as we don't track FK relationships in the
local server for foreign tables today- now, perhaps we should (and things
like primary keys too..), but I don't think that needs to be the job of
this particular patch.  Instead, I'd suggest we have the core code build
up a list of tables to truncate, for each server, based just on the list
passed in by the user, and then also pass in if CASCADE was included or
not, and then let the FDW handle that in whatever way makes sense for
the foreign server (which, for a PG system, would probably be just
building up the TRUNCATE command and running it with or without the
CASCADE option, but it might be different on other systems).

Just to be clear- I don't mean to suggest that we should explicitly
avoid the logic in TruncateGuts that builds up the list when CASCADE is
used, just saying that it's not going to actually do anything when we're
talking about foreign tables- and that's *fine*.  I don't think we need
to do more here until we're actually tracking remote FKs locally.

So, I think the patch just needs a bit of minor adjustment for that to
make it work for the case that Alvaro is concerned about.  One thing
that isn't really clear to me is if we should also support the 'ONLY'
option to TRUNCATE when it comes to FDWs; a table can't be both foreign
and partitioned, so it's not an issue there, but a foreign table CAN
be a child table of another foreign table.

Of course, if that's the case, things get pretty odd looking pretty
quickly if both sides see the table as a child table because we
actually end up scanning the foreign parent (which will include rows
from the child on the remote side) and then scanning the foreign child
*again*, resulting in duplicate rows coming back, so I'm not really sure
how much effort we should be thinking about putting into dealing with
child foreign tables..

Thanks,

Stephen

Вложения

Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Thu, Jan 02, 2020 at 09:46:44AM -0500, Stephen Frost wrote:
> I agree that the FDW callback should support multiple tables in the
> TRUNCATE, but I think it also should include CASCADE as an option and
> have that be passed on to the FDW to handle.

As much as RESTRICT, ONLY and the INDENTITY clauses, no?  Just think
about postgres_fdw.
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Stephen Frost
Дата:
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Thu, Jan 02, 2020 at 09:46:44AM -0500, Stephen Frost wrote:
> > I agree that the FDW callback should support multiple tables in the
> > TRUNCATE, but I think it also should include CASCADE as an option and
> > have that be passed on to the FDW to handle.
>
> As much as RESTRICT, ONLY and the INDENTITY clauses, no?  Just think
> about postgres_fdw.

RESTRICT, yes.  I don't know about ONLY being sensible as we don't
really deal with inheritance and foreign tables very cleanly today, as I
said up-thread, so I'm not sure if we really want to put in the effort
that it'd require to figure out how to make ONLY make sense.  The
question about how to handle IDENTITY is a good one.  I suppose we could
just pass that down and let the FDW sort it out..?

Thanks,

Stephen

Вложения

Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Mon, Jan 06, 2020 at 04:32:39PM -0500, Stephen Frost wrote:
> RESTRICT, yes.  I don't know about ONLY being sensible as we don't
> really deal with inheritance and foreign tables very cleanly today, as I
> said up-thread, so I'm not sure if we really want to put in the effort
> that it'd require to figure out how to make ONLY make sense.

True enough.

> The question about how to handle IDENTITY is a good one.  I suppose
> we could just pass that down and let the FDW sort it out..?

Looking at the code, ExecuteTruncateGuts() passes down restart_seqs,
so it sounds sensible to me to just pass down that to the FDW
callback and the callback decide what to do.
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月7日(火) 16:03 Michael Paquier <michael@paquier.xyz>:
>
> On Mon, Jan 06, 2020 at 04:32:39PM -0500, Stephen Frost wrote:
> > RESTRICT, yes.  I don't know about ONLY being sensible as we don't
> > really deal with inheritance and foreign tables very cleanly today, as I
> > said up-thread, so I'm not sure if we really want to put in the effort
> > that it'd require to figure out how to make ONLY make sense.
>
> True enough.
>
> > The question about how to handle IDENTITY is a good one.  I suppose
> > we could just pass that down and let the FDW sort it out..?
>
> Looking at the code, ExecuteTruncateGuts() passes down restart_seqs,
> so it sounds sensible to me to just pass down that to the FDW
> callback and the callback decide what to do.
>
It looks to me the local sequences owned by a foreign table shall be restarted
by the core, regardless of relkind of the owner relation. So, even if FDW driver
is buggy, consistency of the local database is kept, right?
Indeed, "restart_seqs" flag is needed to propagate the behavior, however,
it shall be processed on the remote side via the secondary "TRUNCATE" command.
Is it so sensitive?

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
Hello,

The attached patch is the revised version of TRUNCATE on foreign tables.

Definition of the callback is revised as follows:

typedef void (*ExecForeignTruncate_function) (List *frels_list,
                                             bool is_cascade,
                                             bool restart_seqs);

The "frels_list" is a list of foreign tables that are connected to a particular
foreign server, thus, the server-id pulled out by foreign tables id should be
identical for all the relations in the list.
Due to the API design, this callback shall be invoked for each foreign server
involved in the TRUNCATE command, not per table basis.

The 2nd and 3rd arguments also informs FDW driver other options of the
command. If FDW has a concept of "cascaded truncate" or "restart sequence",
it can adjust its remote query. In postgres_fdw, it follows the manner of
usual TRUNCATE command.

Best regards,

2020年1月8日(水) 1:08 Kohei KaiGai <kaigai@heterodb.com>:
>
> 2020年1月7日(火) 16:03 Michael Paquier <michael@paquier.xyz>:
> >
> > On Mon, Jan 06, 2020 at 04:32:39PM -0500, Stephen Frost wrote:
> > > RESTRICT, yes.  I don't know about ONLY being sensible as we don't
> > > really deal with inheritance and foreign tables very cleanly today, as I
> > > said up-thread, so I'm not sure if we really want to put in the effort
> > > that it'd require to figure out how to make ONLY make sense.
> >
> > True enough.
> >
> > > The question about how to handle IDENTITY is a good one.  I suppose
> > > we could just pass that down and let the FDW sort it out..?
> >
> > Looking at the code, ExecuteTruncateGuts() passes down restart_seqs,
> > so it sounds sensible to me to just pass down that to the FDW
> > callback and the callback decide what to do.
> >
> It looks to me the local sequences owned by a foreign table shall be restarted
> by the core, regardless of relkind of the owner relation. So, even if FDW driver
> is buggy, consistency of the local database is kept, right?
> Indeed, "restart_seqs" flag is needed to propagate the behavior, however,
> it shall be processed on the remote side via the secondary "TRUNCATE" command.
> Is it so sensitive?
>
> Best regards,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei <kaigai@heterodb.com>



--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>

Вложения

Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Tue, Jan 14, 2020 at 06:16:17PM +0900, Kohei KaiGai wrote:
> The "frels_list" is a list of foreign tables that are connected to a particular
> foreign server, thus, the server-id pulled out by foreign tables id should be
> identical for all the relations in the list.
> Due to the API design, this callback shall be invoked for each foreign server
> involved in the TRUNCATE command, not per table basis.
>
> The 2nd and 3rd arguments also informs FDW driver other options of the
> command. If FDW has a concept of "cascaded truncate" or "restart sequence",
> it can adjust its remote query. In postgres_fdw, it follows the manner of
> usual TRUNCATE command.

I have done a quick read through the patch.  You have modified the
patch to pass down to the callback a list of relation OIDs to execute
one command for all, and there are tests for FKs so that coverage
looks fine.

Regression tests are failing with this patch:
 -- TRUNCATE doesn't work on foreign tables, either directly or
 recursively
 TRUNCATE ft2;  -- ERROR
-ERROR:  "ft2" is not a table
+ERROR:  foreign-data wrapper "dummy" has no handler
You visibly just need to update the output because no handlers are
available for truncate in this case.

+void
+deparseTruncateSql(StringInfo buf, Relation rel)
+{
+   deparseRelation(buf, rel);
+}
Don't see much point in having this routine.

+     If FDW does not provide this callback, PostgreSQL considers
+     <command>TRUNCATE</command> is not supported on the foreign table.
+    </para>
This sentence is weird.  Perhaps you meant "as not supported"?

+     <literal>frels_list</literal> is a list of foreign tables that are
+     connected to a particular foreign server; thus, these foreign tables
+     should have identical foreign server ID
The list is built by the backend code, so that has to be true.

+       foreach (lc, frels_list)
+       {
+           Relation    frel = lfirst(lc);
+           Oid         frel_oid = RelationGetRelid(frel);
+
+           if (server_id == GetForeignServerIdByRelId(frel_oid))
+           {
+               frels_list = foreach_delete_current(frels_list, lc);
+               curr_frels = lappend(curr_frels, frel);
+           }
+       }
Wouldn't it be better to fill in a hash table for each server with a
list of relations?

+typedef void (*ExecForeignTruncate_function) (List *frels_list,
+                                             bool is_cascade,
+                                             bool restart_seqs);
I would recommend to pass down directly DropBehavior instead of a
boolean to the callback.  That's more extensible.
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月15日(水) 17:11 Michael Paquier <michael@paquier.xyz>:
>
> On Tue, Jan 14, 2020 at 06:16:17PM +0900, Kohei KaiGai wrote:
> > The "frels_list" is a list of foreign tables that are connected to a particular
> > foreign server, thus, the server-id pulled out by foreign tables id should be
> > identical for all the relations in the list.
> > Due to the API design, this callback shall be invoked for each foreign server
> > involved in the TRUNCATE command, not per table basis.
> >
> > The 2nd and 3rd arguments also informs FDW driver other options of the
> > command. If FDW has a concept of "cascaded truncate" or "restart sequence",
> > it can adjust its remote query. In postgres_fdw, it follows the manner of
> > usual TRUNCATE command.
>
> I have done a quick read through the patch.  You have modified the
> patch to pass down to the callback a list of relation OIDs to execute
> one command for all, and there are tests for FKs so that coverage
> looks fine.
>
> Regression tests are failing with this patch:
>  -- TRUNCATE doesn't work on foreign tables, either directly or
>  recursively
>  TRUNCATE ft2;  -- ERROR
> -ERROR:  "ft2" is not a table
> +ERROR:  foreign-data wrapper "dummy" has no handler
> You visibly just need to update the output because no handlers are
> available for truncate in this case.
>
What error message is better in this case? It does not print "ft2" anywhere,
so user may not notice that "ft2" is the source of the error.
How about 'foreign table "ft2" does not support truncate' ?

> +void
> +deparseTruncateSql(StringInfo buf, Relation rel)
> +{
> +   deparseRelation(buf, rel);
> +}
> Don't see much point in having this routine.
>
deparseRelation() is a static function in postgres_fdw/deparse.c
On the other hand, it may be better to move entire logic to construct
remote TRUNCATE command in the deparse.c side like other commands.

> +     If FDW does not provide this callback, PostgreSQL considers
> +     <command>TRUNCATE</command> is not supported on the foreign table.
> +    </para>
> This sentence is weird.  Perhaps you meant "as not supported"?
>
Yes.
If FDW does not provide this callback, PostgreSQL performs as if TRUNCATE
is not supported on the foreign table.

> +     <literal>frels_list</literal> is a list of foreign tables that are
> +     connected to a particular foreign server; thus, these foreign tables
> +     should have identical foreign server ID
> The list is built by the backend code, so that has to be true.
>
> +       foreach (lc, frels_list)
> +       {
> +           Relation    frel = lfirst(lc);
> +           Oid         frel_oid = RelationGetRelid(frel);
> +
> +           if (server_id == GetForeignServerIdByRelId(frel_oid))
> +           {
> +               frels_list = foreach_delete_current(frels_list, lc);
> +               curr_frels = lappend(curr_frels, frel);
> +           }
> +       }
> Wouldn't it be better to fill in a hash table for each server with a
> list of relations?
>
It's just a matter of preference. A temporary hash-table with server-id
and list of foreign-tables is an idea. Let me try to revise.

> +typedef void (*ExecForeignTruncate_function) (List *frels_list,
> +                                             bool is_cascade,
> +                                             bool restart_seqs);
> I would recommend to pass down directly DropBehavior instead of a
> boolean to the callback.  That's more extensible.
>
Ok,

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Wed, Jan 15, 2020 at 11:33:07PM +0900, Kohei KaiGai wrote:
> 2020年1月15日(水) 17:11 Michael Paquier <michael@paquier.xyz>:
>> I have done a quick read through the patch.  You have modified the
>> patch to pass down to the callback a list of relation OIDs to execute
>> one command for all, and there are tests for FKs so that coverage
>> looks fine.
>>
>> Regression tests are failing with this patch:
>>  -- TRUNCATE doesn't work on foreign tables, either directly or
>>  recursively
>>  TRUNCATE ft2;  -- ERROR
>> -ERROR:  "ft2" is not a table
>> +ERROR:  foreign-data wrapper "dummy" has no handler
>> You visibly just need to update the output because no handlers are
>> available for truncate in this case.
>>
> What error message is better in this case? It does not print "ft2" anywhere,
> so user may not notice that "ft2" is the source of the error.
> How about 'foreign table "ft2" does not support truncate' ?

It sounds to me that this message is kind of right.  This FDW "dummy"
does not have any FDW handler at all, so it complains about it.
Having no support for TRUNCATE is something that may happen after
that.  Actually, this error message from your patch used for a FDW
which has a  handler but no TRUNCATE support could be reworked:
+       if (!fdwroutine->ExecForeignTruncate)
+           ereport(ERROR,
+                   (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                    errmsg("\"%s\" is not a supported foreign table",
+                           relname)));
Something like "Foreign-data wrapper \"%s\" does not support
TRUNCATE"?

>> +     <literal>frels_list</literal> is a list of foreign tables that are
>> +     connected to a particular foreign server; thus, these foreign tables
>> +     should have identical foreign server ID
>> The list is built by the backend code, so that has to be true.
>>
>> +       foreach (lc, frels_list)
>> +       {
>> +           Relation    frel = lfirst(lc);
>> +           Oid         frel_oid = RelationGetRelid(frel);
>> +
>> +           if (server_id == GetForeignServerIdByRelId(frel_oid))
>> +           {
>> +               frels_list = foreach_delete_current(frels_list, lc);
>> +               curr_frels = lappend(curr_frels, frel);
>> +           }
>> +       }
>> Wouldn't it be better to fill in a hash table for each server with a
>> list of relations?
>
> It's just a matter of preference. A temporary hash-table with server-id
> and list of foreign-tables is an idea. Let me try to revise.

Thanks.  It would not matter much for relations without inheritance
children, but if truncating a partition tree with many foreign tables
using various FDWs that could matter performance-wise.
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
Hi,

The v3 patch updated the points below:
- 2nd arg of ExecForeignTruncate was changed to DropBehavior, not bool
- ExecuteTruncateGuts() uses a local hash table to track a pair of server-id
  and list of the foreign tables managed by the server.
- Error message on truncate_check_rel() was revised as follows:
   "foreign data wrapper \"%s\" on behalf of the foreign table \"%s\"
does not support TRUNCATE"
- deparseTruncateSql() of postgres_fdw generates entire remote SQL as
like other commands.
- Document SGML was updated.

Best regards,

2020年1月16日(木) 14:40 Michael Paquier <michael@paquier.xyz>:
>
> On Wed, Jan 15, 2020 at 11:33:07PM +0900, Kohei KaiGai wrote:
> > 2020年1月15日(水) 17:11 Michael Paquier <michael@paquier.xyz>:
> >> I have done a quick read through the patch.  You have modified the
> >> patch to pass down to the callback a list of relation OIDs to execute
> >> one command for all, and there are tests for FKs so that coverage
> >> looks fine.
> >>
> >> Regression tests are failing with this patch:
> >>  -- TRUNCATE doesn't work on foreign tables, either directly or
> >>  recursively
> >>  TRUNCATE ft2;  -- ERROR
> >> -ERROR:  "ft2" is not a table
> >> +ERROR:  foreign-data wrapper "dummy" has no handler
> >> You visibly just need to update the output because no handlers are
> >> available for truncate in this case.
> >>
> > What error message is better in this case? It does not print "ft2" anywhere,
> > so user may not notice that "ft2" is the source of the error.
> > How about 'foreign table "ft2" does not support truncate' ?
>
> It sounds to me that this message is kind of right.  This FDW "dummy"
> does not have any FDW handler at all, so it complains about it.
> Having no support for TRUNCATE is something that may happen after
> that.  Actually, this error message from your patch used for a FDW
> which has a  handler but no TRUNCATE support could be reworked:
> +       if (!fdwroutine->ExecForeignTruncate)
> +           ereport(ERROR,
> +                   (errcode(ERRCODE_WRONG_OBJECT_TYPE),
> +                    errmsg("\"%s\" is not a supported foreign table",
> +                           relname)));
> Something like "Foreign-data wrapper \"%s\" does not support
> TRUNCATE"?
>
> >> +     <literal>frels_list</literal> is a list of foreign tables that are
> >> +     connected to a particular foreign server; thus, these foreign tables
> >> +     should have identical foreign server ID
> >> The list is built by the backend code, so that has to be true.
> >>
> >> +       foreach (lc, frels_list)
> >> +       {
> >> +           Relation    frel = lfirst(lc);
> >> +           Oid         frel_oid = RelationGetRelid(frel);
> >> +
> >> +           if (server_id == GetForeignServerIdByRelId(frel_oid))
> >> +           {
> >> +               frels_list = foreach_delete_current(frels_list, lc);
> >> +               curr_frels = lappend(curr_frels, frel);
> >> +           }
> >> +       }
> >> Wouldn't it be better to fill in a hash table for each server with a
> >> list of relations?
> >
> > It's just a matter of preference. A temporary hash-table with server-id
> > and list of foreign-tables is an idea. Let me try to revise.
>
> Thanks.  It would not matter much for relations without inheritance
> children, but if truncating a partition tree with many foreign tables
> using various FDWs that could matter performance-wise.
> --
> Michael



--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Kyotaro Horiguchi
Дата:
Hello.

At Fri, 17 Jan 2020 22:49:41 +0900, Kohei KaiGai <kaigai@heterodb.com> wrote in 
> The v3 patch updated the points below:

Did you attached it?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月20日(月) 11:09 Kyotaro Horiguchi <horikyota.ntt@gmail.com>:
>
> Hello.
>
> At Fri, 17 Jan 2020 22:49:41 +0900, Kohei KaiGai <kaigai@heterodb.com> wrote in
> > The v3 patch updated the points below:
>
> Did you attached it?
>
Sorry, it was a midnight job on Friday.
Please check the attached patch.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>

Вложения

Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Mon, Jan 20, 2020 at 11:30:34AM +0900, Kohei KaiGai wrote:
> Sorry, it was a midnight job on Friday.

Should I be, err, worried about that?  ;)

> Please check the attached patch.

+   switch (behavior)
+   {
+       case DROP_RESTRICT:
+           appendStringInfoString(buf, " RESTRICT");
+           break;
+       case DROP_CASCADE:
+           appendStringInfoString(buf, " CASCADE");
+           break;
+       default:
+           elog(ERROR, "Bug? unexpected DropBehavior (%d)",
(int)behavior);
+           break;
+   }
Here, you can actually remove the default clause.  By doing so,
compilation would generate a warning if a new value is added to
DropBehavior if it is not listed.  So anybody adding a new value to
the enum will need to think about this code path.

+   ereport(ERROR,
+           (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+            errmsg("foreign data wrapper \"%s\" on behalf of the
foreign table \"%s\" does not support TRUNCATE",
+                   fdw->fdwname, relname)));
I see two problems here:
- The error message is too complicated.  I would just use "cannot
truncate foreign table \"%s\"".
- The error code should be ERRCODE_FEATURE_NOT_SUPPORTED.

The docs for the FDW description can be improved.  I found that a
large portion of it used rather unclear English, and that things were
not clear enough regarding the use of a list of relations, when an
error is raised because ExecForeignTruncate is NULL, etc.  I have also
cut the last paragraph which is actually implementation-specific
(think for example about callbacks at xact commit/abort time).

Documentation needs to be added to postgres_fdw about the truncation
support.  Particularly, providing details about the possibility to do
truncates in our shot for a set of relations so as dependencies are
automatically handled is an advantage to mention.

There is no need to include the truncate routine in
ForeignTruncateInfo, as the server OID can be used to find it.

Another thing is that I would prefer splitting the patch into two
separate commits, so attached are two patches:
- 0001 for the addition of the in-core API
- 0002 for the addition of support in postgres_fdw.

I have spent a good amount of time polishing 0001, tweaking the docs,
comments, error messages and a bit its logic.  I am getting
comfortable with it, but it still needs an extra lookup, an indent run
which has some noise and I lacked of time today.  0002 has some of its
issues fixed and I have not reviewed it fully yet.  There are still
some places not adapted in it (why do you use "Bug?" in all your
elog() messages by the way?), so the postgres_fdw part needs more
attention.  Could you think about some docs for it by the way?
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Mon, Jan 20, 2020 at 10:50:21PM +0900, Michael Paquier wrote:
> I have spent a good amount of time polishing 0001, tweaking the docs,
> comments, error messages and a bit its logic.  I am getting
> comfortable with it, but it still needs an extra lookup, an indent run
> which has some noise and I lacked of time today.  0002 has some of its
> issues fixed and I have not reviewed it fully yet.  There are still
> some places not adapted in it (why do you use "Bug?" in all your
> elog() messages by the way?), so the postgres_fdw part needs more
> attention.  Could you think about some docs for it by the way?

I have more comments about the postgres_fdw that need to be
addressed.

+       if (!OidIsValid(server_id))
+       {
+           server_id = GetForeignServerIdByRelId(frel_oid);
+           user = GetUserMapping(GetUserId(), server_id);
+           conn = GetConnection(user, false);
+       }
+       else if (server_id != GetForeignServerIdByRelId(frel_oid))
+           elog(ERROR, "Bug? inconsistent Server-IDs were supplied");
I agree here that an elog() looks more adapted than an assert.
However I would change the error message to be more like "incorrect
server OID supplied by the TRUNCATE callback" or something similar.
The server OID has to be valid anyway, so don't you just bypass any
errors if it is not set?

+-- truncate two tables at a command
What does this sentence mean?  Isn't that "truncate two tables in one
single command"?

The table names in the tests are rather hard to parse.  I think that
it would be better to avoid underscores at the beginning of the
relation names.

It would be nice to have some coverage with inheritance, and also
track down in the tests what we expect when ONLY is specified in that
case (with and without ONLY, both parent and child relations).

Anyway, attached is the polished version for 0001 that I would be fine
to commit, except for one point: are there objections if we do not
have extra handling for ONLY when it comes to foreign tables with
inheritance?  As the patch stands, the list of relations is first
built, with an inheritance recursive lookup done depending on if ONLY
is used or not.  Hence, if using "TRUNCATE ONLY foreign_tab, ONLY
foreign_tab2", then only those two tables would be passed down to the
FDW.  If ONLY is removed, both tables as well as their children are
added to the lists of relations split by server OID.  One problem is
that this could be confusing for some users I guess?  For example,
with a 1:1 mapping in the schema of the local and remote servers, a
user asking for TRUNCATE ONLY foreign_tab would pass down to the
remote just the equivalent of "TRUNCATE foreign_tab" using
postgres_fdw, causing the full inheritance tree to be truncated on the
remote side.  The concept of ONLY mixed with inherited foreign tables
is rather blurry (point raised by Stephen upthread).
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
2020年1月21日(火) 15:38 Michael Paquier <michael@paquier.xyz>:
>
> On Mon, Jan 20, 2020 at 10:50:21PM +0900, Michael Paquier wrote:
> > I have spent a good amount of time polishing 0001, tweaking the docs,
> > comments, error messages and a bit its logic.  I am getting
> > comfortable with it, but it still needs an extra lookup, an indent run
> > which has some noise and I lacked of time today.  0002 has some of its
> > issues fixed and I have not reviewed it fully yet.  There are still
> > some places not adapted in it (why do you use "Bug?" in all your
> > elog() messages by the way?), so the postgres_fdw part needs more
> > attention.  Could you think about some docs for it by the way?
>
> I have more comments about the postgres_fdw that need to be
> addressed.
>
> +       if (!OidIsValid(server_id))
> +       {
> +           server_id = GetForeignServerIdByRelId(frel_oid);
> +           user = GetUserMapping(GetUserId(), server_id);
> +           conn = GetConnection(user, false);
> +       }
> +       else if (server_id != GetForeignServerIdByRelId(frel_oid))
> +           elog(ERROR, "Bug? inconsistent Server-IDs were supplied");
> I agree here that an elog() looks more adapted than an assert.
> However I would change the error message to be more like "incorrect
> server OID supplied by the TRUNCATE callback" or something similar.
> The server OID has to be valid anyway, so don't you just bypass any
> errors if it is not set?
>
> +-- truncate two tables at a command
> What does this sentence mean?  Isn't that "truncate two tables in one
> single command"?
>
> The table names in the tests are rather hard to parse.  I think that
> it would be better to avoid underscores at the beginning of the
> relation names.
>
> It would be nice to have some coverage with inheritance, and also
> track down in the tests what we expect when ONLY is specified in that
> case (with and without ONLY, both parent and child relations).
>
> Anyway, attached is the polished version for 0001 that I would be fine
> to commit, except for one point: are there objections if we do not
> have extra handling for ONLY when it comes to foreign tables with
> inheritance?  As the patch stands, the list of relations is first
> built, with an inheritance recursive lookup done depending on if ONLY
> is used or not.  Hence, if using "TRUNCATE ONLY foreign_tab, ONLY
> foreign_tab2", then only those two tables would be passed down to the
> FDW.  If ONLY is removed, both tables as well as their children are
> added to the lists of relations split by server OID.  One problem is
> that this could be confusing for some users I guess?  For example,
> with a 1:1 mapping in the schema of the local and remote servers, a
> user asking for TRUNCATE ONLY foreign_tab would pass down to the
> remote just the equivalent of "TRUNCATE foreign_tab" using
> postgres_fdw, causing the full inheritance tree to be truncated on the
> remote side.  The concept of ONLY mixed with inherited foreign tables
> is rather blurry (point raised by Stephen upthread).
>
Hmm. Do we need to deliver another list to inform why these relations are
trundated?
If callback is invoked with a foreign-relation that is specified by TRUNCATE
command with ONLY, it seems to me reasonable that remote TRUNCATE
command specifies the relation on behalf of the foreign table with ONLY.

Foreign-tables can be truncated because ...
1. it is specified by user with ONLY-clause.
2. it is specified by user without ONLY-clause.
3. it is inherited child of the relations specified at 2.
4. it depends on the relations picked up at 1-3.

So, if ExecForeignTruncate() has another list to inform the context for each
relation, postgres_fdw can build proper remote query that may specify the
remote tables with ONLY-clause.

Regarding to the other comments, it's all Ok for me. I'll update the patch.
And, I forgot "updatable" option at postgres_fdw. It should be checked on
the truncate also, right?

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>



Re: TRUNCATE on foreign tables

От
Michael Paquier
Дата:
On Mon, Jan 27, 2020 at 11:08:36PM +0900, Kohei KaiGai wrote:
> Hmm. Do we need to deliver another list to inform why these relations are
> trundated?

I got to think more about this one, and being able to control ONLY on
a per-relation basis would be the least surprising approach for the
commands generated.  But at least this avoids truncating a full
inheritance tree on a remote cluster even if ONLY is specified
locally.  Note that I'd like to assume that most applications have a
1:1 mapping in their schemas between a local and remote cluster, but
that's most likely not always the case ;)

> If callback is invoked with a foreign-relation that is specified by TRUNCATE
> command with ONLY, it seems to me reasonable that remote TRUNCATE
> command specifies the relation on behalf of the foreign table with ONLY.
>
> So, if ExecForeignTruncate() has another list to inform the context for each
> relation, postgres_fdw can build proper remote query that may specify the
> remote tables with ONLY-clause.

Yeah, TRUNCATE can specify ONLY on a per-table basis, so having a
second list makes sense.  Then in the FDW, just make sure to
elog(ERROR) if the lengths do no match, and then use forboth() to loop
over them.  One thing that you need to be careful about is that tables
which are added to the list because of inheritance should not be
marked with ONLY when generating the command to the remote.

> Regarding to the other comments, it's all Ok for me. I'll update the patch.
> And, I forgot "updatable" option at postgres_fdw. It should be checked on
> the truncate also, right?

Hmm.  Good point.  Being able to filter that silently through a
configuration parameter is kind of interesting.  Now I think that this
should be a separate option because updatable applies to DMLs.  Like,
truncatable?

For now, as the patch needs more work for its implementation, docs and
tests, I am marking it as returned with feedback.
--
Michael

Вложения

Re: TRUNCATE on foreign tables

От
Kohei KaiGai
Дата:
Hello,

The attached is revised version.

> > If callback is invoked with a foreign-relation that is specified by TRUNCATE
> > command with ONLY, it seems to me reasonable that remote TRUNCATE
> > command specifies the relation on behalf of the foreign table with ONLY.
> >
> > So, if ExecForeignTruncate() has another list to inform the context for each
> > relation, postgres_fdw can build proper remote query that may specify the
> > remote tables with ONLY-clause.
>
> Yeah, TRUNCATE can specify ONLY on a per-table basis, so having a
> second list makes sense.  Then in the FDW, just make sure to
> elog(ERROR) if the lengths do no match, and then use forboth() to loop
> over them.  One thing that you need to be careful about is that tables
> which are added to the list because of inheritance should not be
> marked with ONLY when generating the command to the remote.
>
The v5 patch added separated list for the FDW callback, to inform the context
when relations are specified by TRUNCATE command. The frels_extra
argument is a list of integers. 0 means that relevant foreign-table is specified
without "ONLY" clause. and positive means specified with "ONLY" clause.
Negative value means that foreign-tables are not specified in the TRUNCATE
command, but truncated due to dependency (like partition's child leaf).

The remote SQL generates TRUNCATE command according to the above
"extra" information. So, "TRUNCATE ONLY ftable" generate a remote query
with "TRUNCATE ONLY mapped_remote_table".
On the other hand, it can make strange results, although it is a corner case.
The example below shows the result of TRUNCATE ONLY on a foreign-table
that mapps a remote table with an inherited children.
The rows id < 10 belongs to the parent table, thus TRUNCATE ONLY tru_ftable
eliminated the remote parent, however, it looks the tru_ftable still
contains rows
after TRUNCATE command.

I wonder whether it is tangible behavior for users. Of course, "ONLY" clause
controls local hierarchy of partitioned / inherited tables, however, I'm not
certain whether the concept shall be expanded to the structure of remote tables.

+SELECT * FROM tru_ftable;
+ id |                x
+----+----------------------------------
+  5 | e4da3b7fbbce2345d7772b0674a318d5
+  6 | 1679091c5a880faf6fb5e6087eb1b2dc
+  7 | 8f14e45fceea167a5a36dedd4bea2543
+  8 | c9f0f895fb98ab9159f51fd0297e236d
+  9 | 45c48cce2e2d7fbdea1afc51c7c6ad26
+ 10 | d3d9446802a44259755d38e6d163e820
+ 11 | 6512bd43d9caa6e02c990b0a82652dca
+ 12 | c20ad4d76fe97759aa27a0c99bff6710
+ 13 | c51ce410c124a10e0db5e4b97fc2af39
+ 14 | aab3238922bcc25a6f606eb525ffdc56
+(10 rows)
+
+TRUNCATE ONLY tru_ftable;      -- truncate only parent portion
+SELECT * FROM tru_ftable;
+ id |                x
+----+----------------------------------
+ 10 | d3d9446802a44259755d38e6d163e820
+ 11 | 6512bd43d9caa6e02c990b0a82652dca
+ 12 | c20ad4d76fe97759aa27a0c99bff6710
+ 13 | c51ce410c124a10e0db5e4b97fc2af39
+ 14 | aab3238922bcc25a6f606eb525ffdc56
+(5 rows)

> > Regarding to the other comments, it's all Ok for me. I'll update the patch.
> > And, I forgot "updatable" option at postgres_fdw. It should be checked on
> > the truncate also, right?
>
> Hmm.  Good point.  Being able to filter that silently through a
> configuration parameter is kind of interesting.  Now I think that this
> should be a separate option because updatable applies to DMLs.  Like,
> truncatable?
>
Ok, "truncatable" option was added.
Please check the regression test and documentation updates.

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>

Вложения

Re: TRUNCATE on foreign tables

От
Daniel Gustafsson
Дата:
> On 1 Mar 2020, at 03:24, Kohei KaiGai <kaigai@heterodb.com> wrote:

> The attached is revised version.

This version fails to apply to HEAD due to conflicts in postgres_fdw expected
test output.  Can you please submit a rebased version.  Marking the entry
Waiting on Author in the meantime.

cheers ./daniel



Re: TRUNCATE on foreign tables

От
Daniel Gustafsson
Дата:
> On 2 Jul 2020, at 16:40, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 1 Mar 2020, at 03:24, Kohei KaiGai <kaigai@heterodb.com> wrote:
>
>> The attached is revised version.
>
> This version fails to apply to HEAD due to conflicts in postgres_fdw expected
> test output.  Can you please submit a rebased version.  Marking the entry
> Waiting on Author in the meantime.

As this has stalled, I've marked this patch Returned with Feedback.  Feel free
to open a new entry if you return to this patch.

cheers ./daniel