Re: upper planner path-ification
From | Simon Riggs |
---|---|
Subject | Re: upper planner path-ification |
Date | |
Msg-id | CANP8+jKeGV0oF2SaOR3HyiE_KjcwF6GWT3N4nDVZNuUyG6BKbQ@mail.gmail.com |
In reply to | Re: upper planner path-ification (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: upper planner path-ification (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On 18 May 2015 at 14:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, May 17, 2015 at 12:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Rather than adding tlists per se to Paths, I've been vaguely toying with
>> a notion of identifying all the "interesting" subexpressions in a query
>> (expensive functions, aggregates, etc), giving them indexes 1..n, and then
>> marking Paths with bitmapsets showing which interesting subexpressions
>> they can produce values for. This would make things like "does this Path
>> compute all the needed aggregates" much cheaper to deal with than a raw
>> tlist representation would do. But maybe that's still not the best way.
> I don't know, but it seems like this might be pulling in the opposite
> direction from your previously-stated desire to get subquery_planner
> to output Paths rather than Plans as soon as possible.
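The bitmapset idea quoted above can be illustrated with a toy model. This is only a sketch of the general technique (number the interesting subexpressions, then answer "does this Path compute everything needed" with a subset test on bitmasks); the function names and the example expressions are made up for illustration and are not PostgreSQL planner APIs.

```python
# Toy model only: names here are illustrative, not PostgreSQL planner APIs.

def index_interesting(exprs):
    """Assign each distinct "interesting" subexpression an index 1..n."""
    return {expr: i for i, expr in enumerate(dict.fromkeys(exprs), start=1)}

def to_bitmap(index, exprs):
    """Build an integer bitmask with bit i set for each listed expression."""
    mask = 0
    for expr in exprs:
        mask |= 1 << index[expr]
    return mask

def produces_all(path_mask, needed_mask):
    """True if the path computes every needed subexpression (subset test)."""
    return needed_mask & ~path_mask == 0

index = index_interesting(["sum(x)", "count(*)", "expensive_fn(y)"])
needed = to_bitmap(index, ["sum(x)", "count(*)"])
scan_path = to_bitmap(index, ["expensive_fn(y)"])
agg_path = to_bitmap(index, ["sum(x)", "count(*)", "expensive_fn(y)"])
```

Here `produces_all(agg_path, needed)` is true and `produces_all(scan_path, needed)` is false, and each check is a couple of bit operations rather than a walk over a tlist.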
Sorry, I didn't mean to suggest that that necessarily had to happen right
away.
What we do need right away, though, is *some* design for distinguishing
Paths for the different possible upper-level steps. I won't cry if we
change it around later, but we have to have something to start with.
So for the moment, let's assume that we still rigidly follow the sequence
of upper-level steps currently embodied in grouping_planner. (I'm not
sure if it even makes sense to consider other orderings of those
processing steps, but in any case we don't need to allow it on day zero.)
Then, make a dummy RelOptInfo corresponding to the result of each step,
and insert links to those in new fields in PlannerInfo. (We create these
*before* starting scan/join planning, so that FDWs, custom scans, etc, can
inject paths into these RelOptInfos if they want, so as to represent cases
like remote aggregation.) Then just use add_path with the appropriate
target RelOptInfo when producing different ways to do grouping etc.
This is a bit ad-hoc but it would be a place to start.
Comments?
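A rough sketch of the proposal above, as a toy model rather than planner code: one dummy RelOptInfo per upper-level step, created before scan/join planning and reachable from PlannerInfo, with every alternative for a step added to that step's rel. The step names, the `upper_rels` field, and the cheapest-only `add_path` are assumptions for illustration; the real add_path also weighs sort order and other properties.

```python
from dataclasses import dataclass, field

# Hypothetical step names, loosely following what grouping_planner does today.
UPPER_STEPS = ["group_agg", "window", "distinct", "order_by", "limit"]

@dataclass
class Path:
    description: str
    total_cost: float

@dataclass
class RelOptInfo:
    name: str
    pathlist: list = field(default_factory=list)

@dataclass
class PlannerInfo:
    upper_rels: dict = field(default_factory=dict)

def make_upper_rels(root):
    """Create one dummy rel per upper step, before scan/join planning."""
    for step in UPPER_STEPS:
        root.upper_rels[step] = RelOptInfo(step)

def add_path(rel, new_path):
    """Simplified stand-in for add_path: keep only the cheapest path."""
    if any(p.total_cost <= new_path.total_cost for p in rel.pathlist):
        return
    rel.pathlist = [new_path]

root = PlannerInfo()
make_upper_rels(root)
# An FDW could inject a remote-aggregation path early ...
add_path(root.upper_rels["group_agg"], Path("remote aggregate via FDW", 120.0))
# ... and the core planner later adds its own alternatives to the same rel.
add_path(root.upper_rels["group_agg"], Path("HashAggregate over join", 300.0))
add_path(root.upper_rels["group_agg"], Path("GroupAggregate over sorted join", 90.0))
best = root.upper_rels["group_agg"].pathlist[0]
```

Because the rels exist before scan/join planning starts, the FDW's path and the core planner's paths compete in the same place, whichever of them is generated first.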
My thinking was to push aggregation down to the lowest level possible in the plan, hopefully a single relation. That way we can generate paths for the current grouping_planner options as well as others, such as:
* Push down an aggregate prior to a join (which might then affect join planning)
* Allow parallel queries to follow a scan-aggregate-collectfromslaves-aggregate strategy (hence the need for double aggregation semantics)
* Allow a lookaside to a Mat View rather than doing a scan-aggregate (assume for now these are maintained correctly)
* Allow a lookaside to an alternate datastore/mechanism via CustomScan (assume these are maintained correctly)
all of which need to be costed against each other and against the current strategies (aggregate last).
The above proposal sounds like it will do that, but I'm not completely sure.
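To make the "double aggregation semantics" in the parallel bullet concrete, here is a minimal sketch using avg as the example. It assumes each worker reduces its rows to a transition state and a collector merges those states before finalizing; the function names are invented for this illustration and do not correspond to existing PostgreSQL functions.

```python
def partial_avg(rows):
    """Per-worker phase: reduce rows to a transition state (sum, count)."""
    return (sum(rows), len(rows))

def combine_avg(states):
    """Collector phase: merge the workers' states into one state."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return (total, count)

def final_avg(state):
    """Final phase: turn the merged state into the aggregate's result."""
    total, count = state
    return total / count if count else None

# Three workers, each scanning a different chunk of the same relation.
chunks = [[1, 2, 3], [4, 5], [6]]
two_phase = final_avg(combine_avg([partial_avg(c) for c in chunks]))

# Reference result: a single aggregate over all rows.
all_rows = [v for c in chunks for v in c]
one_phase = final_avg(partial_avg(all_rows))
```

Both routes give the same answer only because the workers ship (sum, count) states; averaging the per-worker averages would not, which is why the second aggregation step needs its own semantics.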
I'm assuming the O(N^2) Mat View planning problem can be solved in part by recognizing that many MVs are just single-table plus aggregates, and that we'd have a small enough number of MVs in play that search would not be a problem in practice.
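One way to picture why the search could stay small, under the stated assumption that most MVs are single-table plus aggregates: bucket the MVs by base table, then compare only grouping keys and aggregate sets within the bucket. The `AggShape` type and both functions are hypothetical, and the match is deliberately strict (same grouping keys, no WHERE clauses, no roll-up from finer-grained MVs).

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class AggShape:
    """Hypothetical summary of a single-table aggregate (query or MV)."""
    table: str
    group_keys: frozenset
    aggregates: frozenset

def build_mv_index(mvs):
    """Bucket single-table MVs by base table so lookup skips unrelated ones."""
    index = defaultdict(list)
    for name, shape in mvs.items():
        index[shape.table].append((name, shape))
    return index

def candidate_mvs(index, query):
    """Return names of MVs that could answer the query's aggregation step."""
    return [name for name, mv in index.get(query.table, [])
            if mv.group_keys == query.group_keys
            and query.aggregates <= mv.aggregates]

mvs = {
    "sales_by_region": AggShape("sales", frozenset({"region"}),
                                frozenset({"sum(amount)", "count(*)"})),
    "sales_by_day": AggShape("sales", frozenset({"day"}),
                             frozenset({"sum(amount)"})),
    "orders_by_cust": AggShape("orders", frozenset({"customer_id"}),
                               frozenset({"count(*)"})),
}
query = AggShape("sales", frozenset({"region"}), frozenset({"sum(amount)"}))
matches = candidate_mvs(build_mv_index(mvs), query)
```

Each surviving candidate would then become one more path to cost against the ordinary scan-aggregate, rather than something chosen by rule.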
I'm also aware that LIMIT is still very badly optimized, so I'm hoping it helps there also.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services