Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Дата
Msg-id 1072831.1597678008@sss.pgh.pa.us
обсуждение исходный текст
Ответ на BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (PG Bug reporting form <noreply@postgresql.org>)
Ответы Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails  (Jiří Fejfar <jurafejfar@gmail.com>)
Список pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> Joining two identical tables placed on separate DBs with different collation
> accessed through postgres_fdw failed when joined with merge join. Some
> records are missing (7 vs. 16 rows in example) in output. See this snippet
> https://gitlab.com/-/snippets/2004522 (or code pasted below) for psql script
> reproducing error also with expected output (working fine on alpine linux).

So I think what is happening here is that postgres_fdw's version of
IMPORT FOREIGN SCHEMA translates "COLLATE default" on the remote
server to "COLLATE default" on the local one, which of course is
a big fail if the defaults don't match.  That allows the local
planner to believe that remote ORDER BYs on the two foreign tables
will give compatible results, causing the merge join to not work
very well at all.

We probably need to figure out some way of substituting the remote
database's actual lc_collate setting when we see "COLLATE default".

I'm also thinking that the documentation is way too cavalier about
dismissing non-matching collation names by just saying that you
can turn off import_collate.  The fact is that doing so is likely
to be disastrous, the more so the more optimization intelligence
we add to postgres_fdw.

I wonder if we could do something like this:

* Change postgresImportForeignSchema() as above, so that it will never
apply "COLLATE default" to an imported column, except in the case
where you turn off import_collate.

* In postgres_fdw planning, treat "COLLATE default" on a foreign table
column as meaning "we don't know the collation"; never believe that
that column can be ordered in a way that matches any local collation.
(It'd be better perhaps if there were an explicit way to say "COLLATE
unknown", but I hesitate to invent such a concept in general.)

* Document that in manual creation of a postgres_fdw foreign table
with a text column, you need to explicitly write the correct collation
if you want the best query plans to be generated.

This seems like too big a behavioral change to consider back-patching,
unfortunately.

            regards, tom lane



В списке pgsql-bugs по дате отправления:

Предыдущее
От: PG Bug reporting form
Дата:
Сообщение: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Следующее
От: Andrii Palko
Дата:
Сообщение: Weird behaviour after update from 12.2 to 12.3 version