Re: User's responsibility when using a chain of "immutable" functions?

Поиск
Список
Период
Сортировка
От Bryn Llewellyn
Тема Re: User's responsibility when using a chain of "immutable" functions?
Дата
Msg-id 1DFED436-377A-4601-9AC3-13422DD60638@yugabyte.com
обсуждение исходный текст
Ответ на Re: User's responsibility when using a chain of "immutable" functions?  (Christophe Pettus <xof@thebuild.com>)
Ответы Re: User's responsibility when using a chain of "immutable" functions?  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
xof@thebuild.com wrote:

I think "not allowed" is putting it too strongly. It would be a bit much to ask that every single user-written immutable function be 100% perfect when it is rolled out, and never have to fix any bugs in them. However, you definitely *do* have to understand that there are administrative consequences for doing so, like rebuilding indexes and invalidating session caches. I think that the OP's statement that you can't ever use user-defined functions from an immutable function is too strong, too; you need to be aware of the consequences if you change an immutable function in a way that alters the return result for a previously-valid set of arguments.

My email that started this discussion has prompted a lot of answers in a few branches. This means that it's too hard for me to respond carefully to everything that's been said. But it does seem that there are a few different opinions about how safety might be ensured in the face of wrong results risks and what, therefore, might define proper practice.

I realize, now, that I didn't distinguish  between:

(1) What you might do (with some caution and attention to detail) in the development shop; and

(2) What you might do when patching the database backend of a deployed production system.

Case #2 is arguably clear cut—as long as you accept that there's no such thing as safe hot patching (unless it's Oracle Database and you have edition-based redefinition). So, accepting this, you have to quiesce the system and make all your changes in a self-imposed single-session fashion. Of course, you have to pay attention to expression-based indexes. But anyone who adopts my maximally cautious approach of allowing only hermetic "immutable" functions and who never uses "create or replace" will find that the index safety risk looks after itself.

And case #1 is arguably less of a concern—unless it compromises your regression testing.

Anyway...

PG has a very lightweight scheme for dependencies that tracks just a few cases—like the dependence of an expression-based index that references a user-defined function upon that function. But, significantly, function-upon-function dependence (like my testcase showed) is not tracked. This is a non-negotiable fundamental property of PG.

It's worth noting that (at least as I have understood it) the "immutable" notion is simply a hint that gives PG permission to cache results rather than to calculate them afresh on every access. And that this is a device that seeks to improve performance. But significantly, there's no robust cache invalidation scheme (and nor could there be) so this leaves it to the user to promise safety.

There's no possible discussion about the trade-off between performance and correctness. So this argues for, at least, huge caution when you think that you might mark a function "immutable". Having said this, I noted this from pavel.stehule@gmail.com:

I know so many hard performance issues are related to missing STABLE or IMMUTABLE flags of some functions.

A caveat is needed because you're not allowed to reference a user-defined function in an expression-based index unless it's marked "immutable". But this ultimately is no more than a formal prick to the user's conscience. Nothing stops you from lying through your teeth in this scenario.

It was all these considerations that led me to my proposal for *my own* practice:

(1) Never use "create or replace" to change an "immutable" function—but, rather always use "drop" and a fresh bare "create".

(2) Never mark a function "immutable" unless its behavior is determined entirely by its own source text. (I'll say this as "unless the function is hermetic".) This notion accommodates use of some built-in functions (but even there, caution is needed because of how session settings can affect the behavior of some built-ins) but it most certainly prohibits any reference to user-defined artifacts (including to other "immutable" functions.) It also prohibits catalog queries.

Both david.g.johnston@gmail.com and xof@thebuild.com have argued that my stance is overly cautious. Fair enough. They can make their own minds up. But what convinces me is the complete lack of static dependencies and the infeasibility of discovering all dynamic dependencies by exhaustive human analysis of source text when the only help you have is a catalog query to identify all "immutable" functions in a database and another to identify all indexes that reference a function. (But here, as far as I can tell, you need human inspection to spot the subset that are user-defined functions.)

One more thing...

I see now that I didn't clearly describe an outcome shown by the testcase that I copied in my email that started this thread. Here's a sufficient, shortened, version. The setup is what I showed you before.

When I drop the first link, f1() in the chain of "immutable" functions, I see that I cannot invoke f(2) because it now complains that f1() doesn't exist. This surprises me because, after all, the result of f2() is now cached (at least as I suppose) and its body isn't executed to produce the result. This outcome almost suggests that there is, after all, a dependency tracking scheme at work.

Yet I can still invoke the third link, f(3), and it still does produce the value that it had cached!

This just tells me that it's stupid to try to deduce the intended behavior of a software system by empirical testing. (It's also hard to deduce what's intended by reading the source code.)

--------------------------------------------------------------------------------

create function f1()
  returns text
  immutable
  language plpgsql
as $body$
begin
  return 'dog';
end;
$body$;

create function f2()
  returns text
  immutable
  language plpgsql
as $body$
begin
  return f1();
end;
$body$;

create function f3()
  returns text
  immutable
  language plpgsql
as $body$
begin
  return f2();
end;
$body$;

select f1(), f2(), f3();

drop function f1() cascade;

select f2(); -- errors with "function f1() does not exist"

select f3(); -- happily returns 'dog'

В списке pgsql-general по дате отправления:

Предыдущее
От: DAN LU
Дата:
Сообщение: Question on pg_auto_failover extension
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: User's responsibility when using a chain of "immutable" functions?