Обсуждение: How to schema-qualify "IS DISTINCT FROM" on trigger definition? (I created a db that cannot be dump/restored)

Поиск
Список
Период
Сортировка
I've created a database which my vendor (Supabase) cannot dump/restore/upgrade. Ultimately, it comes down to this trigger statement, and the fact that the underlying operations needed to perform the `IS DISTINCT FROM` comparison in the WHEN clause need to be found in the `public` schema. During the restore, the search path is empty, so it fails.

Full example file is below.

The trigger:

CREATE TRIGGER record_content_update BEFORE UPDATE OF content, embedding ON t1
  FOR EACH ROW
  WHEN (((new.content <> old.content) OR (new.embedding IS DISTINCT FROM old.embedding)))
  EXECUTE FUNCTION t1_content_update_handler();

The content field is a JSONB, and the embedding field is a vector from the pg_vector extension.

I make a backup using pg_dump, and upon restore it errors out with this:

psql:dump1.sql:122: ERROR:  operator does not exist: public.vector = public.vector
LINE 1: ... (((new.content <> old.content) OR (new.embedding IS DISTINC...
                                                             ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

The ^ is under the "IS DISTINCT" in case the formatting makes it unclear.

If I make the operator just `<>` the pg_dump properly schema-qualifies it in the dump as

new.embedding OPERATOR(public.<>) old.embedding

but I need to account for the NULLs. I cannot find a way to schema-quailify the `IS DISTINCT FROM` comparison.

How do I make this trigger definition survive pg_dump/pg_restore? I cannot alter the dump file between the steps.

I'm running version: psql (PostgreSQL) 15.5. For my tests I'm on FreeBSD 14, but Supabase runs whatever version of linux they do and Pg version 15.1.

Full reproduction steps:

Save the file below as create.sql then run these commands:

createdb -U postgres t1
psql -U postgres -f create.sql t1
pg_dump -U postgres t1 > dump.sql
createdb -U postgres t2
psql -U postgres -f dump.sql t2

On the last step, the above referenced error will occur.

Is there a way to fix this, or is it a "don't do that" situation?

The only workaround I can think of is to move the IS DISTINCT FROM test to be inside my trigger function.

--- create.sql file ---
CREATE EXTENSION IF NOT EXISTS "vector";

CREATE TABLE t1 (
  id SERIAL PRIMARY KEY,
  content JSONB DEFAULT '{}'::JSONB NOT NULL,
  embedding vector
);

CREATE FUNCTION t1_content_update_handler() RETURNS TRIGGER
   LANGUAGE plpgsql
   AS $$
 BEGIN
   RAISE INFO '% trigger called for id=%', TG_TABLE_NAME, OLD.id;
   RETURN NEW;
 END;
$$;

CREATE TRIGGER record_content_update BEFORE UPDATE OF content, embedding ON t1
  FOR EACH ROW
  WHEN (((new.content <> old.content) OR (new.embedding IS DISTINCT FROM old.embedding)))
  EXECUTE FUNCTION t1_content_update_handler();
--- end ---
Vick Khera <vivek@khera.org> writes:
> I've created a database which my vendor (Supabase) cannot
> dump/restore/upgrade. Ultimately, it comes down to this trigger statement,
> and the fact that the underlying operations needed to perform the `IS
> DISTINCT FROM` comparison in the WHEN clause need to be found in the
> `public` schema. During the restore, the search path is empty, so it fails.

Yeah.  We've had some discussions about inventing a version of IS
DISTINCT FROM (and some other SQL-spec syntaxes with the same problem)
that supports schema-qualification of the underlying operator.  But
it hasn't gotten further than preliminary discussion.

For the moment, I think the only feasible solution is for your trigger
function to set the search path it needs by adding a "SET search_path
= whatever" clause to the function's CREATE command.

            regards, tom lane



On Wed, Feb 21, 2024 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
For the moment, I think the only feasible solution is for your trigger
function to set the search path it needs by adding a "SET search_path
= whatever" clause to the function's CREATE command.

The error is not in the function, it is the WHEN clause of the trigger. There's no way to set a search path on the trigger as far as I see.

The only option I see is to remove the WHEN clause on the trigger and wrap my function with an IF with those same conditions. I hope this will not result in any noticeable difference in speed.

It is rather unfortunate that one can end up with a schema that a backup with pg_dump cannot be restored. Feel free to keep my example for regression testing when postgres does grow the ability to schema-qualify such operators.
On 2024-02-22 22:14 +0100, Vick Khera wrote:
> On Wed, Feb 21, 2024 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > For the moment, I think the only feasible solution is for your trigger
> > function to set the search path it needs by adding a "SET search_path
> > = whatever" clause to the function's CREATE command.
> 
> 
> The error is not in the function, it is the WHEN clause of the trigger.
> There's no way to set a search path on the trigger as far as I see.
> 
> The only option I see is to remove the WHEN clause on the trigger and wrap
> my function with an IF with those same conditions. I hope this will not
> result in any noticeable difference in speed.

You may also try the equivalent CASE expression in the WHEN clause.

https://wiki.postgresql.org/wiki/Is_distinct_from#Writing_with_CASE_statements

-- 
Erik



On Thu, Feb 22, 2024 at 5:06 PM Erik Wienhold <ewie@ewie.name> wrote:
On 2024-02-22 22:14 +0100, Vick Khera wrote:
> On Wed, Feb 21, 2024 at 4:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > For the moment, I think the only feasible solution is for your trigger
> > function to set the search path it needs by adding a "SET search_path
> > = whatever" clause to the function's CREATE command.
>
>
> The error is not in the function, it is the WHEN clause of the trigger.
> There's no way to set a search path on the trigger as far as I see.
>
> The only option I see is to remove the WHEN clause on the trigger and wrap
> my function with an IF with those same conditions. I hope this will not
> result in any noticeable difference in speed.

You may also try the equivalent CASE expression in the WHEN clause.

https://wiki.postgresql.org/wiki/Is_distinct_from#Writing_with_CASE_statements

Nice. It makes for a big ugly trigger statement, but probably my better choice.  I was considering doing this but didn't want to risk making my own interpretation.