Re: Clarify the ordering guarantees in combining queries (or lack thereof)

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Clarify the ordering guarantees in combining queries (or lack thereof)
Дата	14 июля 2022 г. 12:01:48
Msg-id	2742511.1657800108@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: Clarify the ordering guarantees in combining queries (or lack thereof) (Pantelis Theodosiou <ypercube@gmail.com>)
Ответы	Re: Clarify the ordering guarantees in combining queries (or lack thereof)
Список	pgsql-docs

Дерево обсуждения

Pantelis Theodosiou <ypercube@gmail.com> writes:
> On Thu, Jul 14, 2022 at 9:16 AM Shay Rojansky <roji@roji.org> wrote:
>> I was trying to understand what - if any - are the guarantees with
>> regards to ordering for combining queries (UNION/UNION ALL/...).

> No, there is no guarantee. It's just that UNION ALL works this way today
> (preserving the order of the subselects) - and I'm not even sure about
> that, it may not preserve the order in all cases, with different indexes or
> partitioning or a parallel plan, etc.

Yeah, that.  You can get a parallelized plan today for UNION ALL:

=# explain analyze select * from foo union all select * from foo;
                                                                 QUERY PLAN
                    

--------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.00..208552.05 rows=5120008 width=244) (actual time=0.652..390.135 rows=5120000 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Append  (cost=0.00..208552.05 rows=2133336 width=244) (actual time=0.021..228.848 rows=1706667 loops=3)
         ->  Parallel Seq Scan on foo  (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.453..78.084
rows=853333loops=3) 
         ->  Parallel Seq Scan on foo foo_1  (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.024..125.299
rows=1280000loops=2) 
 Planning Time: 0.094 ms
 Execution Time: 488.352 ms

It's true that in simple non-parallelized cases we'll do the first query
then the second, but SQL doesn't promise that to be true and neither does
Postgres.

>>> If you want ordered output use ORDER BY.

>> I don't see how that could be done. Consider the following:
>> (SELECT id FROM data ORDER BY id)
>> UNION ALL
>> (SELECT id FROM data ORDER BY id DESC);

You do it like this:

=# explain analyze (select * from foo union all select * from foo) order by unique1;
                                                                    QUERY PLAN
                          

--------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=839258.04..889070.63 rows=4266672 width=244) (actual time=931.054..1707.780 rows=5120000 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=839258.01..844591.35 rows=2133336 width=244) (actual time=924.821..1132.096 rows=1706667 loops=3)
         Sort Key: foo.unique1
         Sort Method: external merge  Disk: 433552kB
         Worker 0:  Sort Method: external merge  Disk: 423400kB
         Worker 1:  Sort Method: external merge  Disk: 415592kB
         ->  Parallel Append  (cost=0.00..208552.05 rows=2133336 width=244) (actual time=0.051..218.476 rows=1706667
loops=3)
               ->  Parallel Seq Scan on foo  (cost=0.00..98942.68 rows=1066668 width=244) (actual time=0.030..79.118
rows=853333loops=3) 
               ->  Parallel Seq Scan on foo foo_1  (cost=0.00..98942.68 rows=1066668 width=244) (actual
time=0.073..118.566rows=1280000 loops=2) 
 Planning Time: 0.109 ms
 Execution Time: 1830.390 ms
(13 rows)

The parentheses are actually optional here, if memory serves --- to get
the ORDER BY to be applied inside the second sub-select, you'd have to
use parens as Shay had it.

            regards, tom lane

В списке pgsql-docs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Clarify the ordering guarantees in combining queries (or lack thereof)