Re: How to read query plan

Поиск
Список
Период
Сортировка
От John Arbash Meinel
Тема Re: How to read query plan
Дата
Msg-id 4235BD81.1060606@arbash-meinel.com
обсуждение исходный текст
Ответ на Re: How to read query plan  (Miroslav Šulc <miroslav.sulc@startnet.cz>)
Список pgsql-performance
Miroslav Šulc wrote:

> PFC wrote:
>
>>     Your query seems of the form :
>>
>>     SELECT FROM main_table LEFT JOIN a lot of tables ORDER BY
>> sort_key LIMIT  N OFFSET M;
>>
>>     I would suggest to rewrite it in a simpler way : instead of
>> generating  the whole result set, sorting it, and then grabbing a
>> slice, generate only  the ror id's, grab a slice, and then generate
>> the full rows from that.
>>
>>     - If you order by a field which is in main_table :
>>     SELECT FROM main_table LEFT JOIN a lot of tables WHERE
>> main_table.id IN  (SELECT id FROM main_table ORDER BY sort_key LIMIT
>> N OFFSET M
>> ) ORDER BY sort_key LIMIT N OFFSET M;
>>
>>     - If you order by a field in one of the child tables, I guess you
>> only  want to display the rows in the main table which have this
>> field, ie.  not-null in the LEFT JOIN. You can also use the principle
>> above.
>>
>>     - You can use a straight join instead of an IN.
>
>
> Do you mean something like this?
>
> SELECT Table.IDPK, Table2.varchar1, Table2.varchar2, ...
> FROM Table
> LEFT JOIN many tables
> INNER JOIN Table AS Table2
>
> Miroslav

I would also recommend using the subselect format. Where any columns
that you are going to need to sort on show up in the subselect.

So you would have:

SELECT ...
    FROM main_table
    LEFT JOIN tablea ON ...
    LEFT JOIN tableb ON ...
    ...
    JOIN other_table ON ...
    WHERE main_table.idpk IN
       (SELECT idpk
            FROM main_table JOIN other_table ON main_table.idpk =
other_table.<main_idpk>
            WHERE ...
            ORDER BY other_table.abcd LIMIT n OFFSET m)
;

I think the final LIMIT + OFFSET would give you the wrong results, since
you have already filtered out the important rows.
I also think you don't need the final order by, since the results should
already be in sorted order.

Now this also assumes that if someone is sorting on a row, then they
don't want null entries. If they do, then you can change the subselect
into a left join. But with appropriate selectivity and indexes, an inner
join can filter out a lot of rows, and give you better performance.

The inner subselect gives you selectivity on the main table, so that you
don't have to deal with all the columns in the search, and then you
don't have to deal with all the rows later on.

I think you can also do this:

SELECT ...
    FROM (SELECT main_table.idpk, other_table.<columns> FROM main_table
JOIN other_table ....) as p
    LEFT JOIN ...
    JOIN main_table ON main_table.idpk = p.idpk;

In that case instead of selecting out the id and putting that into the
where, you put it in the from, and then join against it.
I don't really know which is better.

John
=:->


Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Miroslav Šulc
Дата:
Сообщение: Re: How to read query plan
Следующее
От: Gaetano Mendola
Дата:
Сообщение: Bad Performance[2]