Re: Preserving order through an inner join

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Preserving order through an inner join
Дата
Msg-id 4CA16EE8.5040406@postnewspapers.com.au
обсуждение исходный текст
Ответ на Re: Preserving order through an inner join  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
On 09/27/2010 03:37 AM, Tom Lane wrote:
> Kevin Jardine<kevinjardine@yahoo.com>  writes:
>> I have a query structured like this:
>> SELECT stuff FROM
>> (SELECT more stuff FROM
>> table1
>> ORDER BY field1) AS q1
>> INNER JOIN table2 ON ( ... )
>
>> and have found that the INNER JOIN is ignoring the order set for q1.
>
>> The final results are not ordered by field1.
>
> Indeed.  Many of the possible join techniques won't preserve that ordering.

Just to elaborate on this for the OP:

When joining two data sets, PostgreSQL has a number of choices about how
to do it. Some of these are:

- Merge join (http://en.wikipedia.org/wiki/Sort-merge_join)
- Hash join (http://en.wikipedia.org/wiki/Hash_join)
- Nested loop + index/sequential table scans
   (http://en.wikipedia.org/wiki/Nested_loop_join)

As you will see from the descriptions and how they work, many join
algorithms do not preserve the order of the input relations. Different
join types are optimal for different absolute and relative sizes of
input relations and different join conditions; all of them are useful in
one situation or another. Only using joins that preserved the natural
order of the relations (which is undefined as per the SQL spec and 99%
of the time people don't care about anyway) would be pretty nasty for
performance.

You could potentially force preservation of order in your particular
query by telling PostgreSQL to use a nested loop with your subquery as
the outer loop, but it'd require overriding the query planner's join
method and join order optimization in ugly ways that aren't well supported.

Not only that, but you'll have to do that kind of hacking-around if you
target any other database that supports at least hash joins, and you'll
have to do it using database-specific query hints, configuration
options, session variables, etc.

I hope this helps to explain in a bit more detail why it's worth putting
in the effort to update your code.

--
Craig ringer

В списке pgsql-general по дате отправления:

Предыдущее
От: Michael Glaesemann
Дата:
Сообщение: Re: Deleting orphaned records (not exists is very slow)
Следующее
От: John R Pierce
Дата:
Сообщение: Re: Merge replication with Postgresql on Windows?