Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7
From | Dilip Kumar |
---|---|
Subject | Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 |
Date | |
Msg-id | CAFiTN-scg06ccJfS=BPGq=1xfAw40tA53NVPjpjKN-ZSybE0mQ@mail.gmail.com |
In response to | Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7 (DIPESH DHAMELIYA <dipeshdhameliya125@gmail.com>) |
List | pgsql-hackers |
On Tue, May 20, 2025 at 1:45 PM DIPESH DHAMELIYA
<dipeshdhameliya125@gmail.com> wrote:
>
> On Tue, May 20, 2025 at 11:57 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > I don't think we can remove the 'maxtuples' parameter from
> > exec_run_select(). In this particular case, the query itself is
> > returning a single tuple, so we are good. Still, in other cases where
> > the query returns more tuples, it makes sense to stop the execution as
> > soon as we have got enough tuples otherwise, it will do the execution
> > until we produce all the tuples. Consider the below example where we
> > just need to use the first tuple, but if we apply your patch, the
> > executor will end up processing all the tuples, and it will impact the
> > performance. So IMHO, the benefit you get by enabling a parallelism
> > in some cases may hurt badly in other cases, as you will end up
> > processing more tuples than required.
> >
> > CREATE OR REPLACE FUNCTION get_first_user_email()
> > RETURNS TEXT AS $$
> > DECLARE
> >     user_email TEXT;
> > BEGIN
> >     user_email = (SELECT email FROM users);
> >     RETURN user_email;
> > END;
> > $$ LANGUAGE plpgsql;
> >
>
> I understand but aren't we blocking parallelism for genuine cases with
> a very complex query where parallelism can help to some extent to
> improve execution time? Users can always rewrite a query (for example
> using TOP clause) if they are expecting one tuple to be returned.

IMHO, you are targeting the fix at the wrong place. If we accept this
fix, users' existing functions will start performing worse, just so that
parallelism can be enabled in some other cases that would benefit from
it. Many users may not find it acceptable to change their applications
and rewrite all their procedures to get back the performance they had
earlier.

I would not say that your concern is wrong: for the internal aggregate
initplan we process all the tuples anyway, so logically it should be
able to use a parallel plan. So IMHO we need to target the fix at
enabling parallelism for the initplan in cases where the outer query has
been given a maximum number of tuples, because that limit applies to the
outer plan, not to the initplan.

--
Regards,
Dilip Kumar
Google
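
For readers following the thread, the two cases being weighed can be sketched as below. This is only an illustration under stated assumptions: the `users` table definition and the `count_users()` function are hypothetical and do not appear in the thread, and the second function is a guess at the kind of "aggregate initplan" return expression being discussed. Note that PostgreSQL has no `TOP` clause; the row cap is written with `LIMIT` (or `FETCH FIRST`).

```sql
-- Hypothetical table, assumed only for this illustration.
CREATE TABLE users (
    id    integer PRIMARY KEY,
    email text NOT NULL
);

-- Case 1: the caller wants a single row and says so in the query itself,
-- rather than relying on the executor stopping early.
CREATE OR REPLACE FUNCTION get_first_user_email()
RETURNS text AS $$
DECLARE
    user_email text;
BEGIN
    user_email := (SELECT email FROM users ORDER BY id LIMIT 1);
    RETURN user_email;
END;
$$ LANGUAGE plpgsql;

-- Case 2 (assumed shape of the return expression under discussion):
-- the sub-select must read every row to compute the aggregate, so a
-- tuple limit on the outer expression saves no work inside it.
CREATE OR REPLACE FUNCTION count_users()
RETURNS bigint AS $$
BEGIN
    RETURN (SELECT count(*) FROM users);
END;
$$ LANGUAGE plpgsql;
```

In the first function the single-row intent is part of the query, so it does not depend on how many tuples the caller asks the executor for. In the second, the outer expression yields one row either way, but the aggregate underneath still has to scan the whole table, which is the situation where a parallel plan could plausibly help.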