Re: Removing unneeded self joins
От | David Rowley |
---|---|
Тема | Re: Removing unneeded self joins |
Дата | |
Msg-id | CAKJS1f8ySSsBfooH3bJK7OD3LBEbDb99d8J_FtqDd6w50p-eAQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Removing unneeded self joins (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>) |
Ответы |
Re: Removing unneeded self joins
(Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
Re: Removing unneeded self joins (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>) |
Список | pgsql-hackers |
On 17 May 2018 at 03:43, Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> wrote: > I'd be glad to hear your thoughts on this. (I only glanced at the patch) I've thought and discussed this before on this list. I think the arguments for and against it were much the same as you've received already. If you trawl through the archives you'll see my argument for matches quite closely to Robert regarding the nested-views. I personally experienced this issue in my previous job, although it was not with PostgreSQL. I think it's worth doing this providing that we can fast-path out quickly enough in cases where we can't possibly remove anything. Likely the success of this patch depends on how quick that fast-path is. From my experience on join removals, I imagine all this can be done just after the left join removal code has completed. I see your patch does it much later, which I don't think is particularly great since Paths have already been generated by that time. I think it makes sense to do this as early as possible to save wasting planning work for relations that will be removed. I think all this can be done just after left joins are removed by remove_useless_joins. You may like to move the code that exists in that function today into a new static function named remove_useless_left_joins, and put this new code in new static function named remove_useless_self_joins: 1. Allocate an array root->simple_rel_array_size in size. Populate it with a struct which is defined as struct { Index relid; Oid oid; } 2. Populate that array by looping over the simple_rel_array. Ignore anything that's not a baserel with relkind = 'r' 3. qsort the array on Oid. 4. Make a pass over the array (up to its size - 1) looking for elements where the current oid is the same as the next. Build a List of RelIds containing all relids of Oids which are duplicated. 5. If no pairs. Abort. 6. Process each combination of pairs found in each Relids in the list made in step 1. Probably start at the lowest relid. 7. For each pair: a. If there's a join condition, ensure all join OpExprs are equality exprs with a mergejoinable opno (copy what left join removal check with the opno used). Ensure Vars used in the OpExpr have the same attrno on each side. b. For bonus points don't reject non-Vars used in the join condition, but ensure they're equal and there are no non-immutable functions inside. c. Ensure relation_has_unique_index_for returns true for the Vars (and Exprs if doing b) used in the join condition. d. Choose the relation with the highest relid and rewrite the parse changing the varno of all Vars to use the one of the relation with the lowest relid. e. list_concat baserestictinfos from removed relation onto other relation. f. Check for Vars in the join condition that can contain NULLs and lappend IS NOT NULLs into the baserestrictinfo. (Removing the join could have removed NULL filtering) g. Mark highest relid relation as DEAD. (See what the left join removal code does (you may need to do some extra work around equivalence classes)) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
В списке pgsql-hackers по дате отправления: