Re: Nested loop does not preserve order. Why?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Nested loop does not preserve order. Why?
Дата
Msg-id 19836.1392132971@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Nested loop does not preserve order. Why?  (Alexey Bashtanov <bashtanov@imap.cc>)
Список pgsql-bugs
Alexey Bashtanov <bashtanov@imap.cc> writes:
> Let us have a table T(A, B), an index on T(B, A) and a much smaller
> table V(B).
> And we are looking for all rows of T that have B present in V and sorted
> by B, A.
> The most natural way to obtain them would be to take all records from V
> ordered by B, and to perform an index-only scan on T for each B.
> But not for postgresql :(

AFAICS, what you're wishing is that given a plan like

    nestloop
      Join condition: x = y
      -> path ordered by x
      -> path ordered by y, z

the planner would realize that the output of the nestloop is ordered
by y,z.  Right now it only realizes that the output is ordered by x
(and hence also by y); this means it thinks it needs an additional
sort step.

As stated, though, this deduction about ordering is false.  To make it
true, you need the additional assumption that the outer relation is
unique in x (plus assumptions about the equality operators in question
all being compatible with the sort ordering).

Right now, the planner doesn't really track uniqueness so there's
no practical way to do this.  There's a patch in the current commitfest
that wants to add tracking of unique sort orders, so if that gets in
we might have appropriate infrastructure.  But even then, this seems
like a mighty narrow and unusual corner case; I can't recall ever
seeing a request for such behavior before.  So I doubt we'd do it
unless the deduction turns out to be basically free, which seems
unlikely.

Having said that, there does seem to be something fishy going on
here.  If I add an index on v.b to your example, and drop the
explicit request to sort on a, I can get this:

EXPLAIN select b, a from u where
b in (select b from v) order by b;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Sort  (cost=13802.17..14052.17 rows=100000 width=8)
   Sort Key: u.b
   ->  Nested Loop  (cost=8.89..4128.85 rows=100000 width=8)
         ->  Unique  (cost=8.45..8.50 rows=10 width=4)
               ->  Sort  (cost=8.45..8.48 rows=10 width=4)
                     Sort Key: v.b
                     ->  Index Only Scan using v_b_idx on v  (cost=0.14..8.29 rows=10 width=4)
         ->  Index Only Scan using u_i on u  (cost=0.43..312.04 rows=10000 width=8)
               Index Cond: (b = v.b)
 Planning time: 0.481 ms
(10 rows)

I'd have expected the planner to realize that neither of
those sort steps are necessary.  Will investigate.

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Предыдущее
От: RogerBerkelmans@arlcollect.com.au
Дата:
Сообщение: BUG #9182: Data type (text) issue with MS SQL 2012
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd