Re: Reference to parent query from ANY sublink

Поиск
Список
Период
Сортировка
От Antonin Houska
Тема Re: Reference to parent query from ANY sublink
Дата
Msg-id 52A8F5D9.4090801@gmail.com
обсуждение исходный текст
Ответ на Re: Reference to parent query from ANY sublink  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Reference to parent query from ANY sublink
Список pgsql-hackers
On 12/11/2013 10:15 PM, Tom Lane wrote:
> 
> FWIW, that plan isn't obviously wrong; if it is broken, most likely the
> reason is that the HashAggregate is incorrectly unique-ifying the lower
> table.  (Unfortunately, EXPLAIN doesn't show enough about the HashAgg
> to know what it's doing exactly.)  The given query is, I think, in
> principle equivalent to
> 
>  SELECT ...
>   FROM SUBSELECT_TBL upper
>   WHERE (f1, f2::float) IN
>     (SELECT f2, f3 FROM SUBSELECT_TBL);
> 
> and if you ask unmodified HEAD to plan that you get
> 
>  Hash Join  (cost=41.55..84.83 rows=442 width=16)
>    Hash Cond: ((upper.f1 = subselect_tbl.f2) AND ((upper.f2)::double precision = subselect_tbl.f3))
>    ->  Seq Scan on subselect_tbl upper  (cost=0.00..27.70 rows=1770 width=16)
>    ->  Hash  (cost=38.55..38.55 rows=200 width=12)
>          ->  HashAggregate  (cost=36.55..38.55 rows=200 width=12)
>                ->  Seq Scan on subselect_tbl  (cost=0.00..27.70 rows=1770 width=12)

Before I opened your mail, I also recalled the technique that I noticed
in the planner code, to evaluate SEMI JOIN as INNER JOIN with the RHS
uniquified, so also thought it could be about the uniquification.

> which is the same thing at the visible level of detail ... but this
> version computes the correct result.  The cost of the HashAggregate is
> estimated higher, though, which suggests that maybe it's distinct'ing on
> two columns where the bogus plan only does one.

debug_print_plan output contains
:grpColIdx 2
in the AGG node. I think this corresponds to the join condition, which
IMO should be(upper.f1 = subselect_tbl.f2)
while the other condition was not in the list of join clauses and
therefore ignored for the uniquification's purpose.

And gdb tells me that create_unique_path() never gets more than 1
clause. I can't tell whether it should do just for this special purpose.

> Not sure about where Antonin's patch is going off the rails.  I suspect
> it's too simple somehow, but it's also possible that it's OK and the
> real issue is some previously undetected bug in LATERAL processing.

So far I have no idea how to achieve such conditions without this patch.
Thanks for your comments.

// Antonin Houska (Tony)





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Reference to parent query from ANY sublink
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Reference to parent query from ANY sublink