Re: Index Skip Scan
От | Jesper Pedersen |
---|---|
Тема | Re: Index Skip Scan |
Дата | |
Msg-id | 994cbf82-333c-56d7-2860-9c8c85aba790@redhat.com обсуждение исходный текст |
Ответ на | Re: Index Skip Scan (David Rowley <david.rowley@2ndquadrant.com>) |
Список | pgsql-hackers |
Hi David, On 7/11/19 7:38 AM, David Rowley wrote: > The UniqueKeys idea is quite separate from pathkeys. Currently, a > Path can have a List of PathKeys which define the order that the > tuples will be read from the Plan node that's created from that Path. > The idea with UniqueKeys is that a Path can also have a non-empty List > of UniqueKeys to define that there will be no more than 1 row with the > same value for the Paths UniqueKey column/exprs. > > As of now, if you look at standard_qp_callback() sets > root->query_pathkeys, the idea here would be that the callback would > also set a new List field named "query_uniquekeys" based on the > group_pathkeys when non-empty and !root->query->hasAggs, or by using > the distinct clause if it's non-empty. Then in build_index_paths() > around the call to match_pathkeys_to_index() we'll probably also want > to check if the index can support UniqueKeys that would suit the > query_uniquekeys that were set during standard_qp_callback(). > > As for setting query_uniquekeys in standard_qp_callback(), this should > be simply a matter of looping over either group_pathkeys or > distinct_pathkeys and grabbing the pk_eclass from each key and making > a canonical UniqueKey from that. To have these canonical you'll need > to have a new field in PlannerInfo named canon_uniquekeys which will > do for UniqueKeys what canon_pathkeys does for PathKeys. So you'll > need an equivalent of make_canonical_pathkey() in uniquekey.c > > With this implementation, the code that the patch adds in > create_distinct_paths() can mostly disappear. You'd only need to look > for a path that uniquekeys_contained_in() matches the > root->query_uniquekeys and then just leave it to > set_cheapest(distinct_rel); to pick the cheapest path. > > It would be wasted effort to create paths with UniqueKeys if there's > multiple non-dead base rels in the query as the final rel in > create_distinct_paths would be a join rel, so it might be worth > checking that before creating such index paths in build_index_paths(). > However, down the line, we could carry the uniquekeys forward into the > join paths if the join does not duplicate rows from the other side of > the join... That's future stuff though, not for this patch, I don't > think. > > I think a UniqueKey can just be a struct similar to PathKey, e.g be > located in pathnodes.h around where PathKey is defined. Likely we'll > need a uniquekeys.c file that has the equivalent to > pathkeys_contained_in() ... uniquekeys_contained_in(), which return > true if arg1 is a superset of arg2 rather than check for one set being > a prefix of another. As you mention above: UniqueKeys { x, y } == > UniqueKeys { y, x }. That superset check could perhaps be optimized > by sorting UniqueKey lists in memory address order, that'll save > having a nested loop, but likely that's not going to be required for a > first cut version. This would work since you'd want UniqueKeys to be > canonical the same as PathKeys are (Notice that compare_pathkeys() > only checks memory addresses of pathkeys and not equals()). > > I think the UniqueKey struct would only need to contain an > EquivalenceClass field. I think all the other stuff that's in PathKey > is irrelevant to UniqueKey. > Thanks for the feedback ! I'll work on these changes for the next uniquekey patch. Best regards, Jesper
В списке pgsql-hackers по дате отправления:
Следующее
От: Robert HaasДата:
Сообщение: Re: accounting for memory used for BufFile during hash joins