Re: Terrible plan for join to nested union

Поиск
Список
Период
Сортировка
От Nate Allan
Тема Re: Terrible plan for join to nested union
Дата
Msg-id 9B2D6747F4AB8A47BE45216B06DEDAF92ABE6739@PREXMB01.myfamily.int
обсуждение исходный текст
Ответ на Re: Terrible plan for join to nested union  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Terrible plan for join to nested union
Список pgsql-performance
>>Nate Allan <nallan@ancestry.com> writes:
>> It seems to me that the join condition (and hence the restriction) should be pushed down into both sides of the
unionto bring the cardinality limit from millions to 1.  I'm imagining a rewrite like this:   
>>     R(a) J (b U c)  ->  (b J R(a)) U (c J R(a)) ...where R = Restrict, J
>> = Join, U = Union

>[ eyes that suspiciously ... ]  I'm not convinced that such a transformation is either correct in general (you seem to
beassuming at least that A's join column is unique, and >what is the UNION operator supposed to do with A's other
columns?)or likely to lead to a performance improvement in general. 

If there are more columns, you are correct that you might have to project off any additional columns within the union,
andleave the join outside of the union intact to bring in the extra columns.  Those are essentially the same
considerationsas when making other rewrites though.  As for this optimization making unions faster in general, I would
arguethat it is rather easy to produce a plan superior to complete materialization of the union. 

>We possibly could push down a join condition on the inner side of a nestloop, similarly to what's done in the UNION
ALLcase ... but that would require a complete >refactoring of what the planner does with UNIONs.  By and large, very
littleoptimization effort has been put into non-ALL UNION (or INTERSECT or EXCEPT).  You should >not expect that to
changeon a time scale of less than years. 

I hate to come across as contrary, but I'm pretty shocked by this answer for a couple reasons:
1) This is a clear-cut case of an untenable execution plan, essentially a bug in the planner.  This response
contradictsthe widely broadcast assertion that the PG community fixes planner bugs quickly and will not introduce hints
becausethey would rather address these kinds of issues "correctly". 
2) Why would more effort go into Union All rather than Union?  Are people using Union All more than Union, and if so is
thisbecause they actually want duplicates or is it because they've been trained to due to the performance problems with
Union? Union All, in many people's opinions, shouldn't even exist in a true relational sense. 

Again, sorry if I'm coming off as abrasive, I've spent political capital pushing to get PG in on this project, and now
I'ma little worried about whether it is going to work for this kind of scale and complexity, so I'm a little stressed.
Ido appreciate your responses. 

Best,

-Nate



В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Terrible plan for join to nested union
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Terrible plan for join to nested union