Обсуждение: A tidyup of pathkeys.c
When working on a5a68dd6d I noticed that truncate_useless_pathkeys() uses a bunch of different static helper functions that are mostly the same as each other. Most of them differ only in which field of PlannerInfo they look at. The attached aims to clean that up by making 2 reusable functions. I've also fixed a few other things I noticed along the way. Here's a list of what I've changed: 1. Add count_common_leading_pathkeys_ordered() function to check for leading common pathkeys and use that for sort_pathkeys, window_pathkeys and window_pathkeys. 2. Add count_common_leading_pathkeys_unordered() to check for leading common pathkeys that exist in any portion of the other list of pathkeys. Use this for group_pathkeys and distinct_pathkeys. 3. Add some short-circuiting to truncate_useless_pathkeys() as there's no point in trying to trim down the list when some other operation has already figured out that it needs all of the pathkeys. 4. Remove the stray " if (root->group_pathkeys != NIL) return true" from has_useful_pathkeys(). I thought #3 might be useful for queries such as: SELECT DISTINCT <dozens of columns> FROM ... ORDER BY <dozens of columns> as truncate_useless_pathkeys() may no longer have to call the more costly count_common_leading_pathkeys_unordered() function if all the pathkeys were deemed useful to the ORDER BY clause. pathkeys_useful_for_merging() looked the most expensive, so I put that last. I don't have any proof in the form of benchmarks that #3 is worthwhile, but I felt compelled to do it as it's simple and there are certainly cases where it could save some amount of effort. #4 isn't required as standard_qp_callback will set query_pathkeys if there's a GROUP BY. I guess the person who wrote that code might have been confused with the "might be able to use them for ordering" comment. I changed that comment in the hopes nobody else falls for the same mistake. The patch saves about 4 dozen lines of code. David
Вложения
On Tue, Oct 14, 2025 at 3:03 PM David Rowley <dgrowleyml@gmail.com> wrote: > Here's a list of what I've changed: > > 1. Add count_common_leading_pathkeys_ordered() function to check for > leading common pathkeys and use that for sort_pathkeys, > window_pathkeys and window_pathkeys. > 2. Add count_common_leading_pathkeys_unordered() to check for leading > common pathkeys that exist in any portion of the other list of > pathkeys. Use this for group_pathkeys and distinct_pathkeys. > 3. Add some short-circuiting to truncate_useless_pathkeys() as there's > no point in trying to trim down the list when some other operation has > already figured out that it needs all of the pathkeys. > 4. Remove the stray " if (root->group_pathkeys != NIL) return true" > from has_useful_pathkeys(). +1. I think this is a nice tidy-up. FWIW, I complained about the stray check in has_useful_pathkeys() in [1] last week, but you were quicker than me in making the code change to remove it. [1] https://postgr.es/m/CAMbWs4_zW5QU=Zk32s17p8qWY+ga-3ZUTons+y+Wopguiopm4A@mail.gmail.com - Richard
On Oct 14, 2025, at 14:02, David Rowley <dgrowleyml@gmail.com> wrote:When working on a5a68dd6d I noticed that truncate_useless_pathkeys()
uses a bunch of different static helper functions that are mostly the
same as each other. Most of them differ only in which field of
PlannerInfo they look at.
The attached aims to clean that up by making 2 reusable functions.
I've also fixed a few other things I noticed along the way.
Here's a list of what I've changed:
1. Add count_common_leading_pathkeys_ordered() function to check for
leading common pathkeys and use that for sort_pathkeys,
window_pathkeys and window_pathkeys.
2. Add count_common_leading_pathkeys_unordered() to check for leading
common pathkeys that exist in any portion of the other list of
pathkeys. Use this for group_pathkeys and distinct_pathkeys.
3. Add some short-circuiting to truncate_useless_pathkeys() as there's
no point in trying to trim down the list when some other operation has
already figured out that it needs all of the pathkeys.
4. Remove the stray " if (root->group_pathkeys != NIL) return true"
from has_useful_pathkeys().
<v1-0001-Tidyup-truncate_useless_pathkeys-function.patch>
I like 1 and 2 that make truncate_useless_pathkeys() easier to read. And I agree 3 is an optimization though I am not sure how much it will help.
For 4, I traced the function has_useful_pathkeys(), and it showed me that root->group_pathkeys and root->query_pathkeys actually point to the same list. So deletion of the check root->group_pathkeys is reasonable.
I have only a trivial comment. As you pull out the shared code into count_common_leading_pathkeys_ordered()/unordered(), it’s reasonable to make them inline, which ensures the new code has the same performance as before. However, I realized that truncate_useless_pathkeys() is not on a critical execution path, not making them inline won’t hurt very much. So, it’s up to you whether or not add “inline” to the two new functions.
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
HighGo Software Co., Ltd.
https://www.highgo.com/
Thanks for having a look. On Tue, 14 Oct 2025 at 21:15, Chao Li <li.evan.chao@gmail.com> wrote: > I have only a trivial comment. As you pull out the shared code into count_common_leading_pathkeys_ordered()/unordered(),it’s reasonable to make them inline, which ensures the new code has thesame performance as before. However, I realized that truncate_useless_pathkeys() is not on a critical execution path,not making them inline won’t hurt very much. So, it’s up to you whether or not add “inline” to the two new functions. What makes you think making them inline would make the performance the same as before? The previous functions were not inlined, and I've not made any changes that should affect the compiler's ability to choose to inline these functions or not. These are static functions, so I'd prefer to leave it up to the compiler to choose what it thinks is best here. I expect the compiler probably inlines count_common_leading_pathkeys_ordered() since it basically amounts to just calling another function. FWIW, I did try the performance using the schema generated with: select 'create table t1 (' || string_agg('c'||n||' int',',') || ');' from generate_Series(1,32)n; select 'create index on t1 (' || string_agg('c'|| ((x+n)%32+1), ',') || ');' from generate_series(0,31)n, generate_Series(0,31) x group by n; and the query generated with: select 'explain (costs off) select distinct * from t1 order by ' || string_agg(n::text,',') || ';' from generate_Series(1,32)n; I ran it under pgbench -T 10 -c 20 -j 20 for 50 times, both patched and unpatched. Patched *maybe* comes out a tiny bit faster, about 0.31% on average. The attached graph shows the results sorted by fastest time first. So, at least not slower. David
Вложения
On Tue, 14 Oct 2025 at 21:05, Richard Guo <guofenglinux@gmail.com> wrote: > FWIW, I complained about the stray check in has_useful_pathkeys() in > [1] last week, but you were quicker than me in making the code change > to remove it. I missed that. I'm confident that line does nothing but waste cycles. A quick look at how those pathkeys are set in standard_qp_callback() should remove any uncertainty. David > [1] https://postgr.es/m/CAMbWs4_zW5QU=Zk32s17p8qWY+ga-3ZUTons+y+Wopguiopm4A@mail.gmail.com
On Oct 14, 2025, at 19:22, David Rowley <dgrowleyml@gmail.com> wrote:What makes you think making them inline would make the performance the
same as before? The previous functions were not inlined, and I've not
made any changes that should affect the compiler's ability to choose
to inline these functions or not.
Ah… You are right. The old code:
static int
pathkeys_useful_for_ordering(PlannerInfo *root, List *pathkeys)
{
int n_common_pathkeys;
(void) pathkeys_count_contained_in(root->sort_pathkeys, pathkeys,
&n_common_pathkeys);
return n_common_pathkeys;
}
static int
count_common_leading_pathkeys_ordered(List *keys1, List *keys2)
{
int ncommon;
(void) pathkeys_count_contained_in(keys1, keys2, &ncommon);
return ncommon;
}
They both call pathkeys_count_contained_in(), you are NOT adding an extra wrapper. So, I withdraw the “inline” comment.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
HighGo Software Co., Ltd.
https://www.highgo.com/