Re: sqlsmith crash incremental sort

Поиск
Список
Период
Сортировка
От Richard Guo
Тема Re: sqlsmith crash incremental sort
Дата
Msg-id CAMbWs4-Nq1rp_v5SonSykWOcVCK5XfUJ5F9oWa+U8X=7ronK1g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: sqlsmith crash incremental sort  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers

On Fri, Apr 17, 2020 at 2:44 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote:
>On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>
>> Attached is what I'm thinking about this optimization. Does it make any
>> sense?
>
>Shouldn't this go one either a new thread or on the thread for the
>patch Tomas was referencing (by Teodor I believe)?
>

FWIW the optimization I had in mind is this:

   https://commitfest.postgresql.org/21/1651/

I now realize that was about GROUP BY, but it's not all that different
and the concerns will / should be fairly similar, I think.

Thanks for pointing out this thread. Very helpful.
 

IMO simply tweaking the sort keys to match the upper parts of the plan
is probably way too simplistic, I'm afraid. For example, if the Unique
significantly reduces cardinality, then the cost of the additional sort
is much less important. It may be much better to optimize the "large"
sort of the whole data set, either by reordering the columns as proposed
by Teodor in his patch (by number of distinct values and/or cost of the
comparison function function).

Since we don't have Teodor's patch for now, I think it is a clear win if
we can reorder the sort keys in 'Unique/SetOp->Sort->Append' path to
avoid a final Sort/Incremental Sort node, because currently the 'Sort
and Unique/SetOp' above Append simply performs sorting in the order of
columns it sees. I think this is the same logic we do for mergejoin
paths that we try to match the requested query_pathkeys to avoid a
second sort.
 

Furthermore, this is one of the places that is not using incremental
sort yet - I can easily imagine doing something like this:


    Sort
      -> Unique
         -> Incremenal Sort
           -> ...

could be a massive win. So I think we can't just rejigger the sort keys
abitrarily, we should / need to consider those alternatives.

This optimization would only apply to 'Unique/SetOp->Sort->Append' path.
I don't think it will affect our choise of incremental sort in other
cases. For example, with this optimization, we still can choose
incremental sort:

# explain (costs off)
select * from
    (select distinct a, c from (select * from foo order by 1,3) as sub) as sub1
order by 2,1;
                      QUERY PLAN
-------------------------------------------------------
 Sort
   Sort Key: sub.c, sub.a
   ->  Unique
         ->  Subquery Scan on sub
               ->  Incremental Sort
                     Sort Key: foo.a, foo.c
                     Presorted Key: foo.a
                     ->  Index Scan using foo_a on foo
(8 rows)
 

>Or are you saying you believe this patch guarantees we never see this
>problem in incremental sort costing?
>

Yeah, that's not entirely close to me. But maybe it shows us where we to
get the unprocessed target list?


I'm not sure if there are other cases that we would build
targetlit/pathkeys out of 'varno 0' Vars. But for this case here, the
'Unique/SetOp->Sort' above Append would sort the result of Append on all
columns, in the arbitrary order as it sees, (not based on any statistics
as Teodor's patch does), we can always reorder the sort keys trying to
match with result ordering requirements, thus to avoid the final
Sort/Incremental Sort node. So that we can prevent this problem in
incremental sort costing for this case.

Am I missing something? Please correct me.

Thanks
Richard 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: Race condition in SyncRepGetSyncStandbysPriority
Следующее
От: Jeremy Morton
Дата:
Сообщение: Re: Support for DATETIMEOFFSET