On Thu, Aug 10, 2017 at 1:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Aug 8, 2017 at 3:50 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Right.
>>
>> I see two ways to include the cost of the target list for parallel
>> paths before rejecting them (a) Don't reject parallel paths
>> (Gather/GatherMerge) during add_path. This has the danger of path
>> explosion. (b) In the case of parallel paths, somehow try to identify
>> that path has a costly target list (maybe just check if the target
>> list has anything other than vars) and use it as a heuristic to decide
>> that whether a parallel path can be retained.
>
> I think the right approach to this problem is to get the cost of the
> GatherPath correct when it's initially created. The proposed patch
> changes the cost after-the-fact, but that (1) doesn't prevent a
> promising path from being rejected before we reach this point and (2)
> is probably unsafe, because it might confuse code that reaches the
> modified-in-place path through some other pointer (e.g. code which
> expects the RelOptInfo's paths to still be sorted by cost). Perhaps
> the way to do that is to skip generate_gather_paths() for the toplevel
> scan/join node and do something similar later, after we know what
> target list we want.
>
I think skipping a generation of gather paths for scan node or top
level join node generated via standard_join_search seems straight
forward, but skipping for paths generated via geqo seems to be tricky
(See use of generate_gather_paths in merge_clump). Assuming, we find
some way to skip it for top level scan/join node, I don't think that
will be sufficient, we have some special way to push target list below
Gather node in apply_projection_to_path, we need to move that part as
well in generate_gather_paths.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com