Re: [PoC] Reducing planning time when tables have many partitions
От | Yuya Watari |
---|---|
Тема | Re: [PoC] Reducing planning time when tables have many partitions |
Дата | |
Msg-id | CAJ2pMkaLi_2vCSrxwWjXvpGd_EExy4sQFL5-3BAwALRKhsviGA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PoC] Reducing planning time when tables have many partitions (David Rowley <dgrowleyml@gmail.com>) |
Ответы |
Re: [PoC] Reducing planning time when tables have many partitions
|
Список | pgsql-hackers |
Hello David, Thank you very much for your thorough review and valuable comments. I have refactored the patches based on your feedback and attached the updated versions (v34). Additionally, I have included a diff between v33 and v34 for your quick reference. On Thu, Mar 13, 2025 at 1:53 PM David Rowley <dgrowleyml@gmail.com> wrote: > > 1) Can you describe the difference between > PlannerInfo.top_parent_relid_array and RelOptInfo.top_parent_relids? > If you've added the PlannerInfo field for performance reasons, then > that needs to be documented. I think the bar for adding another field > to do the same thing should be quite high. The > RelOptInfo.top_parent_relids field already is commented with > "redundant, but handy", so having another field in another struct > that's also redundant leads me to think that some design needs more > thought. > > If you need a cheap way to take the same shortcut as you're doing in > setup_eclass_child_member_iterator() with "if > (root->top_parent_relid_array == NULL)", then maybe PlannerInfo should > have a boolean field to record if there are any other member rels Thank you for highlighting this. I initially introduced PlannerInfo.top_parent_relid_array primarily for performance reasons to quickly determine whether a relation is a parent or child, particularly in setup_eclass_child_member_iterator(). As you mentioned, earlier versions utilized the check "if (root->top_parent_relid_array == NULL)" to skip unnecessary operations when no child relations exist. However, I have realized that the same behavior can be achieved by using root->append_rel_array. Specifically, if a relation is a parent, the corresponding AppendRelInfo is NULL, and if there are no child relations at all, the entire array itself is NULL. So, PlannerInfo.top_parent_relid_array is no longer necessary. In v34-0001, I removed root->top_parent_relid_array and instead utilized root->append_rel_array. However, this caused issues in add_setop_child_rel_equivalences(), since this function adds a new child EquivalenceMember without building a parent-child relationship in root->append_rel_array. To address this, I have created a dummy AppendRelInfo object in v34-0002. This is just a workaround, and there may be a more elegant solution. I'd greatly appreciate any suggestions or alternative approaches you might have. > 2) I think the naming of setup_eclass_child_member_iterator() and > dispose_eclass_child_member_iterator() is confusing. From the names, > I'd expect these to only be returning em_is_child == true members, but > that's not the case. I agree the original naming was misleading. In v34-0001, I have renamed these functions to setup_eclass_all_member_iterator_for_relids() and dispose_eclass_all_member_iterator_for_relids(). To align with this change, I have also renamed EquivalenceChildMemberIterator to EquivalenceAllMemberIterator. Does this new naming better address your concern? > 3) The header comment for setup_eclass_child_member_iterator() does > not seem concise enough. It claims "so that it can iterate over > EquivalenceMembers in 'ec'.", but what does that mean? The definition > of "EquivalenceMembers in 'ec'" isn't clear. Is that just the class's > ec_members, or also the child members that are stored somewhere else. > Users of this function need to know what they'll get so they know > which members they need to ignore or which they can assume won't be > returned. If you don't document that, then it's quite hard to > determine where the faulty code is when we get bugs. The "relids" > parameter needs to be documented too. I have clarified the header comment in v34-0001. It now explicitly states that the iterator iterates over all parent members and child members whose em_relids are subsets of the given 'relids'. I have also clearly documented the parameters, including 'relids'. > 4) add_transformed_child_version sounds like it does some > transformation, but all it does is add the EMs for the given > RelOptInfo to the iterator's list. I don't quite follow what's being > "transformed". Maybe there's a better name? Thank you for highlighting this. The original name was indeed misleading. I have renamed this function to add_eclass_child_members_to_iterator(). -- Best regards, Yuya Watari
Вложения
В списке pgsql-hackers по дате отправления: