Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7
From | DIPESH DHAMELIYA |
---|---|
Subject | Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 |
Date | |
Msg-id | CABgZEgf0=VkmHjQwxVWzT4ecdRB48KBgNZwKPd9bKjUALKe6EQ@mail.gmail.com |
In response to | Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses | Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 |
List | pgsql-hackers |
On Tue, May 20, 2025 at 11:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, May 5, 2025 at 11:19 AM DIPESH DHAMELIYA
> <dipeshdhameliya125@gmail.com> wrote:
> >
> > Hello everyone,
> >
> > With the commit 556f7b7bc18d34ddec45392965c3b3038206bb62, any plpgsql
> > function that returns a scalar value would not be able to use
> > parallelism to evaluate a return statement. It will not be considered
> > for parallel execution because we are passing maxtuples = 2 to
> > exec_run_select from exec_eval_expr to evaluate the return expression
> > of the function.
> >
>
> I could not find commit '556f7b7bc18d34ddec45392965c3b3038206bb62' in
> git log on the master branch, but here is my analysis after looking at
> your patch.

Here is the GitHub link to the commit:
https://github.com/postgres/postgres/commit/556f7b7bc18d34ddec45392965c3b3038206bb62
and the discussion:
https://www.postgresql.org/message-id/flat/20241206062549.710dc01cf91224809dd6c0e1%40sraoss.co.jp

> I don't think we can remove the 'maxtuples' parameter from
> exec_run_select(). In this particular case, the query itself is
> returning a single tuple, so we are good. Still, in other cases where
> the query returns more tuples, it makes sense to stop the execution as
> soon as we have got enough tuples; otherwise, it will keep executing
> until we produce all the tuples. Consider the example below, where we
> just need to use the first tuple, but if we apply your patch, the
> executor will end up processing all the tuples, and it will impact the
> performance. So IMHO, the benefit you get by enabling parallelism
> in some cases may hurt badly in other cases, as you will end up
> processing more tuples than required.
>
> CREATE OR REPLACE FUNCTION get_first_user_email()
> RETURNS TEXT AS $$
> DECLARE
>     user_email TEXT;
> BEGIN
>     user_email = (SELECT email FROM users);
>     RETURN user_email;
> END;
> $$ LANGUAGE plpgsql;
>

I understand, but aren't we blocking parallelism for genuine cases with a very complex query, where parallelism can help to some extent to improve execution time? Users can always rewrite a query (for example, using a LIMIT clause) if they expect only one tuple to be returned.
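
A minimal sketch of that kind of rewrite, assuming a hypothetical users table with id and email columns (the thread does not show the schema). PostgreSQL spells the row-limiting clause LIMIT (or the SQL-standard FETCH FIRST), so the single-row intent can be stated in the query itself rather than relying on the caller to stop the executor early. Whether a parallel plan is then chosen still depends on the query and the server settings; this only illustrates the rewrite.

```sql
-- Hypothetical schema, assumed only for illustration.
CREATE TABLE IF NOT EXISTS users (
    id    bigint PRIMARY KEY,
    email text
);

CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS TEXT AS $$
DECLARE
    user_email TEXT;
BEGIN
    -- LIMIT 1 asks the executor for a single row explicitly.
    -- ORDER BY id is an assumption about what "first" means here;
    -- without it, the row returned is unspecified.
    user_email := (SELECT email FROM users ORDER BY id LIMIT 1);
    RETURN user_email;
END;
$$ LANGUAGE plpgsql;

-- Returns NULL when the table is empty.
SELECT get_first_user_email();
```

With the limit in the query text, the function returns one value no matter how many rows the table holds.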