Re: LATERAL quals revisited

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: LATERAL quals revisited
Дата
Msg-id 22570.1372876359@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: LATERAL quals revisited  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: LATERAL quals revisited  (Antonin Houska <antonin.houska@gmail.com>)
Список pgsql-hackers
I wrote:
> So attached is a draft patch for this.  It's not complete yet because
> there are various comments that are now wrong and need to be updated;
> but I think the code is functioning correctly.

Hm, spoke too soon :-(.  This query causes an assertion failure, with or
without my draft patch:

select c.*,a.*,ss1.q1,ss2.q1,ss3.* from int8_tbl c left join (   int8_tbl a left join     (select q1, coalesce(q2,f1)
asx from int8_tbl b, int4_tbl b2) ss1     on a.q2 = ss1.q1   cross join   lateral (select q1, coalesce(ss1.x,q2) as y
fromint8_tbl d) ss2 ) on c.q2 = ss2.q1, lateral (select * from int4_tbl i where ss2.y > f1) ss3;
 

TRAP: FailedAssertion("!(bms_is_subset(phinfo->ph_needed, phinfo->ph_may_need))", File: "initsplan.c", Line: 213)

What's happening is that distribute_qual_to_rels concludes (correctly)
that the "ss2.y > f1" clause must be postponed until after the nest of
left joins, since those could null ss2.y.  So the PlaceHolderVar for
ss2.y is marked as being needed at the topmost join level.  However,
find_placeholders_in_jointree had only marked the PHV as being "maybe
needed" to scan the "i" relation, since that's what the syntactic
location of the reference implies.  Since we depend on the assumption
that ph_needed is always a subset of ph_may_need, there's an assertion
that fires if that stops being true, and that's what's crashing.

After some thought about this, I'm coming to the conclusion that lateral
references destroy the ph_maybe_needed optimization altogether: we
cannot derive an accurate estimate of where a placeholder will end up in
the final qual distribution, short of essentially doing all the work in
deconstruct_jointree over again.  I guess in principle we could repeat
deconstruct_jointree until we had stable estimates of the ph_needed
locations, but that would be expensive and probably would induce a lot
of new planner bugs (since the data structure changes performed during
deconstruct_jointree aren't designed to be backed out easily).

The only place where ph_may_need is actually used is in this bit in
make_outerjoininfo():
   /*    * Examine PlaceHolderVars.  If a PHV is supposed to be evaluated within    * this join's nullable side, and it
mayget used above this join, then    * ensure that min_righthand contains the full eval_at set of the PHV.    * This
ensuresthat the PHV actually can be evaluated within the RHS.    * Note that this works only because we should already
havedetermined the    * final eval_at level for any PHV syntactically within this join.    */   foreach(l,
root->placeholder_list)  {       PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);       Relids
ph_syn_level= phinfo->ph_var->phrels;
 
       /* Ignore placeholder if it didn't syntactically come from RHS */       if (!bms_is_subset(ph_syn_level,
right_rels))          continue;
 
       /* We can also ignore it if it's certainly not used above this join */       /* XXX this test is probably overly
conservative*/       if (bms_is_subset(phinfo->ph_may_need, min_righthand))           continue;
 
       /* Else, prevent join from being formed before we eval the PHV */       min_righthand =
bms_add_members(min_righthand,phinfo->ph_eval_at);   }
 

Looking at it again, it's not really clear that skipping placeholders in
this way results in very much optimization --- sometimes we can avoid
constraining join order, but how often?  I tried diking out the check
on ph_may_need from this loop, and saw no changes in the regression test
results (not that that proves a whole lot about optimization of complex
queries).  So I'm pretty tempted to just remove ph_may_need, along with
the machinery that computes it.

Another possibility would be to keep the optimization, but disable it in
queries that use LATERAL.  I don't much care for that though --- seems
too Rube Goldbergish, and in any case I have a lot less faith in the
whole concept now than I had before I started digging into this issue.

Thoughts?
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Redesigning checkpoint_segments
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Support for REINDEX CONCURRENTLY