Re: Slow join over three tables

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: Slow join over three tables
Дата
Msg-id 792DDB09-CCAC-41EB-9127-0425E84F06B6@gmail.com
обсуждение исходный текст
Ответ на Re: Slow join over three tables  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: Slow join over three tables  (Tim van der Linden <tim@shisaa.jp>)
Re: Slow join over three tables  (Tim van der Linden <tim@shisaa.jp>)
Список pgsql-general
> On 27 Apr 2016, at 4:09, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> On 27 April 2016 at 11:27, Tim van der Linden <tim@shisaa.jp> wrote:

>> The query:
>>
>> SELECT r.id, r.age, r.gender, r.created, a.adverse, d.drug
>> FROM reports r
>> JOIN report_drugs d ON d.rid = r.id
>> JOIN report_adverses a ON a.rid = r.id
>> WHERE a.adverse = ANY (ARRAY['back pain - nonspecific', 'nonspecific back pain', 'back pain'])
>> AND d.drug = ANY (ARRAY[359, 360, 361, 362, 363]) ORDER BY r.created;
>>

> Likely the only way to speed this up would be to create indexes;
>
> create index on reports (id, age, gender, created);

Since you're not using age and gender in this (particular) query until the rows are combined into a result set already,
itdoesn't make a whole lot of sense to add them to the index. Moreover, since your'e ordering by created, I'd at least
putthat directly after id: 

    create index on reports (id, created);

> create index on report_adverses (rid, adverse);

In this case, you're using the values in adverse to filter relevant rid's for the FK join, so you might be better off
withthe inverse of above index: 
    create index on report_adverses (adverse, rid);
    create index on report_drugs (drug, rid);

Do these tables have a primary key and can that be used here?

I hope I'm not being superfluous here, but don't forget to vacuum analyze after creating those indexes. Eventually
autovacuumwill do that for you, but you could otherwise be running tests to verify the impact of adding those indexes
beforeautovacuum gets around to it. 

Finally, perhaps it's more efficient to weed out all unrelated drugs and adverses before relating them to reports,
somethinglike: 

SELECT r.id, r.age, r.gender, r.created, x.adverse, x.drug
FROM reports r
JOIN (
    SELECT d.rid, a.adverse, d.drug
    FROM report_drugs d
    JOIN report_adverses a ON a.rid = d.rid
    WHERE a.adverse = ANY (ARRAY['back pain - nonspecific', 'nonspecific back pain', 'back pain'])
    AND d.drug = ANY (ARRAY[359, 360, 361, 362, 363])
) x ON x.rid = r.id
ORDER BY r.created;

Looking at the cardinality of your tables that does seem a bit unlikely though. Still, worth a shot...

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



В списке pgsql-general по дате отправления:

Предыдущее
От: Tim van der Linden
Дата:
Сообщение: Re: Slow join over three tables
Следующее
От: Tim van der Linden
Дата:
Сообщение: Re: Slow join over three tables