Re: planner missing a trick for foreign tables w/OR conditions
From | Tom Lane |
---|---|
Subject | Re: planner missing a trick for foreign tables w/OR conditions |
Date | |
Msg-id | 12558.1387301313@sss.pgh.pa.us |
In response to | Re: planner missing a trick for foreign tables w/OR conditions (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: planner missing a trick for foreign tables w/OR conditions
Re: planner missing a trick for foreign tables w/OR conditions |
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 16, 2013 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The hard part is not extracting the partial qual. The hard part is
>> trying to make sure that adding this entirely-redundant scan qual doesn't
>> catastrophically degrade join size estimates.

> OK, I had a feeling that's where the problem was likely to be. Do you
> have any thoughts about a more principled way of solving this problem?
> I mean, off-hand, it's not clear to me that the comments about this
> being a MAJOR HACK aren't overstated.

Well, the business about injecting the correction by adjusting a cached selectivity is certainly a hack, but it's not one that I think is urgent to get rid of; I don't foresee anything that's likely to break it soon.

> I might be missing something, but I suspect it works fine if every
> path for the relation is generating the same rows.

I had been thinking it would fall down if there are several OR conditions affecting different collections of rels, but after going through the math again, I'm now thinking I was wrong and it does in fact work out. As you say, we do depend on all paths generating the same rows, but since the extracted single-rel quals are inserted as plain baserestrictinfo quals, that'll be true.

A bigger potential objection is that we're opening ourselves to larger problems with estimation failures due to correlated qual conditions, but again I'm finding that the math doesn't bear that out. It's reasonable to assume that our estimate for the extracted qual will be better than our estimate for the OR as a whole, so our adjusted size estimates for the filtered base relations are probably OK. And the adjustment to the OR clause selectivity means that the size estimate for the join comes out exactly the same. We'll actually be better off than with what is likely to happen now, which is that people manually extract the simplified condition and insert it into the query explicitly. PG doesn't realize that that's redundant and so will underestimate the join size.

So at this point I'm pretty much talked into it. We could eliminate the dependence on indexes entirely, and replace this code with a step that simply tries to pull single-base-relation quals out of ORs wherever it can find one. You could argue that the produced quals would sometimes not be worth testing for, but we could apply a heuristic that says to forget it unless the estimated selectivity of the extracted qual is less than, I dunno, 0.5 maybe. (I wonder if it'd be worth inserting a check that there's not already a manually-generated equivalent clause, too ...)

A very nice thing about this is we could do this step ahead of relation size estimate setting and thus remove the redundant work that currently happens in set_plain_rel_size when the optimization fires. Which is another aspect of the current code that's a hack, so getting rid of it would be a net reduction in hackiness.

regards, tom lane
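The estimation argument in the message can be checked with a little arithmetic. The sketch below is a toy model, not PostgreSQL code: the `Qual` class, the function names, the selectivity values, and the row counts are all made up for illustration. It assumes independent conditions and assumes the compensation is done by dividing the OR clause's selectivity by the extracted qual's selectivity. It pulls a single-relation qual out of an OR of AND-lists, applies the 0.5 cutoff floated above, and compares three join size estimates: the original one, one where the redundant qual was added by hand with no compensation, and one with the compensation applied.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Qual:
    rel: str            # the single base relation this condition references
    text: str           # human-readable form of the condition
    selectivity: float  # assumed fraction of rel's rows that satisfy it


def or_selectivity(sels):
    """Combine selectivities of OR'd conditions, assuming independence."""
    result = 0.0
    for s in sels:
        result = result + s - result * s
    return result


def extract_for_rel(or_arms, rel, max_selectivity=0.5):
    """Try to pull a qual on `rel` alone out of an OR of AND-lists.

    Returns (text, selectivity) for the extracted qual, or None when some
    arm places no condition on `rel` (the extracted qual would then not be
    implied by the OR) or when the result is not selective enough.
    """
    arm_texts, arm_sels = [], []
    for arm in or_arms:
        on_rel = [q for q in arm if q.rel == rel]
        if not on_rel:
            return None
        arm_texts.append(" AND ".join(q.text for q in on_rel))
        arm_sels.append(math.prod(q.selectivity for q in on_rel))
    sel = or_selectivity(arm_sels)
    if sel >= max_selectivity:
        return None
    return " OR ".join(f"({t})" for t in arm_texts), sel


# (a.x = 1 AND b.y = 2) OR (a.x = 3 AND b.y = 4), with made-up selectivities.
or_arms = [
    [Qual("a", "a.x = 1", 0.01), Qual("b", "b.y = 2", 0.1)],
    [Qual("a", "a.x = 3", 0.02), Qual("b", "b.y = 4", 0.2)],
]
rows_a, rows_b = 1000.0, 1000.0

or_sel = or_selectivity(
    [math.prod(q.selectivity for q in arm) for arm in or_arms]
)
baseline = rows_a * rows_b * or_sel

text, extracted_sel = extract_for_rel(or_arms, "a")

# Redundant qual added by hand: the filter on "a" is counted once in the
# base relation size and again inside the unadjusted OR selectivity.
manual = (rows_a * extracted_sel) * rows_b * or_sel

# Compensated: divide the OR's selectivity by the extracted qual's.
adjusted = (rows_a * extracted_sel) * rows_b * (or_sel / extracted_sel)

print(text)
print(baseline, manual, adjusted)
```

In this model the compensated estimate matches the original join estimate up to floating-point rounding, while the hand-written variant comes out lower, which is the underestimate described above.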