Tightening behaviour for non-immutable behaviour in immutable functions
От | Greg Stark |
---|---|
Тема | Tightening behaviour for non-immutable behaviour in immutable functions |
Дата | |
Msg-id | CAM-w4HO72ec5PwabFaAGE3cvm=ujcCsJJeCTV2uwZzf7KyY90w@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: Tightening behaviour for non-immutable behaviour in immutable functions
(Mark Dilger <mark.dilger@enterprisedb.com>)
|
Список | pgsql-hackers |
Recently I had someone complaining about a pg_restore failure and I believe we semi-regularly get complaints that are similar -- though I'm having trouble searching for them because the keywords "dump restore failure" are pretty generic. The root cause here -- and I believe for a lot of users -- are functions that are declared IMMUTABLE but are not for reasons that aren't obvious to the user. Indeed poking at this more carefully I think it's downright challenging to write an IMMUTABLE function at all. I suspect *most* users, perhaps even nearly all users, who write functions intending them to be immutable are actually not really as successful as they believe. The biggest culprit is of course search_path. Afaics it's nigh impossible to write any non-trivial immutable function without just setting the search_path GUC on the function. And there's nothing Postgres that requires that. I don't even see anything in the docs recommending it. Many users probably always run with the same search_path so in practice they're probably mostly safe. But one day they could insert data with a different search path with a different function definition in their path and corrupt their index which would be.... poor... Or as in my user they could discover the problem only in the middle of an upgrade which is a terrible time to discover it. I would suggest we should probably at the very least warn users if they create an immutable function that doesn't have search_path set for the function. I would actually prefer to make it an error but that brings in compatibility issues. Perhaps it could be optional. But putting a GUC on a function imposes a pretty heavy performance cost. I'm not sure how bad it is compared to running plpgsql code let alone other languages but IIUC it invalidates some catalog caches which for something happening repeatedly in, e.g. a data load would be pretty bad. It would be nice to have a way to avoid the performance cost and I see two options. Thinking of plpgsql here, we already run the raw parser on all sql when the function is defined. We could somehow check whether the raw_parser found any non-schema-qualified references. This looks like it would be awkward but doable. That would allow users to write non-search_path-dependent code and if postgres doesn't warn they would know they've done it properly. It would still put quite a burden on users, especially when it comes to operators... Or alternatively we could offer lexical scoping so that all objects are looked up at parse time and the fully qualified reference is stored instead of the non-qualified reference. That would be more similar to how views and other object references are handled. I suppose there's a third option that we could provide something which instead of *setting* the guc when a function is entered just verifies that guc is set as expected. That way the function would simply throw an error if search_path is "incorrect" and not have to invalidate any caches. That would at least avoid index corruption but not guarantee dump/reload would work. -- greg
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Roberto C. SánchezДата:
Сообщение: Re: Request for assistance to backport CVE-2022-1552 fixes to 9.6 and 9.4