Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Дата
Msg-id CAPmGK15ohpwdAPYA=Fw+C+DTdoPMWrpr6_q_XR6u3eJH7MSkVg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Thu, Sep 2, 2021 at 5:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The real reason that this hasn't gotten committed is that I remain
> pretty uncomfortable about whether it's an acceptable solution to
> the problem.  Suddenly asking people to plaster COLLATE clauses
> on all their textual remote columns seems like a big compatibility
> gotcha.

I think so too.  I reviewed the patch:

                /*
                 * If the Var is from the foreign table, we consider its
-                * collation (if any) safe to use.  If it is from another
+                * collation (if any) safe to use, *unless* it's
+                * DEFAULT_COLLATION_OID.  We treat that as meaning "we don't
+                * know which collation this is".  If it is from another
                 * table, we treat its collation the same way as we would a
                 * Param's collation, ie it's not safe for it to have a
                 * non-default collation.
@@ -350,7 +352,12 @@ foreign_expr_walker(Node *node,

                    /* Else check the collation */
                    collation = var->varcollid;
-                   state = OidIsValid(collation) ? FDW_COLLATE_SAFE :
FDW_COLLATE_NONE;
+                   if (collation == InvalidOid)
+                       state = FDW_COLLATE_NONE;
+                   else if (collation == DEFAULT_COLLATION_OID)
+                       state = FDW_COLLATE_UNSAFE;
+                   else
+                       state = FDW_COLLATE_SAFE;

One thing I noticed about this change is:

explain (verbose, costs off) select * from ft3 order by f2;
                       QUERY PLAN
---------------------------------------------------------
 Sort
   Output: f1, f2, f3
   Sort Key: ft3.f2
   ->  Foreign Scan on public.ft3
         Output: f1, f2, f3
         Remote SQL: SELECT f1, f2, f3 FROM public.loct3
(6 rows)

where ft3 is defined as in the postgres_fdw regression test (see the
section “test handling of collations”).  For this query, the sort is
done locally, but I think it should be done remotely, or an error
should be raised, as we don’t know the collation assigned to the
column “f2”.  So I think we need to do something about this.

Having said that, I think another option for this would be to left the
code as-is; assume that 1) the foreign var has "COLLATE default”, not
an unknown collation, when labeled with "COLLATE default”, and 2)
"COLLATE default” on the local database matches "COLLATE default” on
the remote database.  This would be the same as before, so we could
avoid the concern mentioned above.  I agree with the
postgresImportForeignSchema() change, except creating a local column
with "COLLATE default" silently if that function can’t find a remote
collation matching the database's datcollate/datctype when seeing
"COLLATE default”, in which case I think an error should be raised to
prompt the user to check the settings for the remote server and/or
define foreign tables manually with collations that match the remote
side.  Maybe I’m missing something, though.

Anyway, here is a patch created on top of your patch to modify
async-related test cases to work as intended.  I’m also attaching your
patch to make the cfbot quiet.

Sorry for the delay.

Best regards,
Etsuro Fujita

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jaime Casanova
Дата:
Сообщение: Re: Asymmetric partition-wise JOIN
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: Why does bootstrap and later initdb stages happen via client?