Re: SQL statement over 500% slower with 9.2 compared with 9.1

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: SQL statement over 500% slower with 9.2 compared with 9.1
Дата
Msg-id 521D19BD.70208@fuzzy.cz
обсуждение исходный текст
Ответ на Re: SQL statement over 500% slower with 9.2 compared with 9.1  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
Ответы Re: SQL statement over 500% slower with 9.2 compared with 9.1  (Rafael Martinez <r.m.guerrero@usit.uio.no>)
Список pgsql-performance
On 27.8.2013 11:19, Rafael Martinez wrote:
> On 08/26/2013 02:33 PM, Rafael Martinez wrote:
> [............]
>> The SQL statement is:
>
>> SELECT  firstname || ' ' || lastname AS Name FROM    Person R WHERE
>> R.gender like 'F' AND  19 < (SELECT COUNT(DISTINCT filmId) FROM
>> FilmParticipation F WHERE  F.partType = 'director' AND F.personId =
>> R.personId    ) AND NOT EXISTS ( SELECT  * FROM
>> FilmParticipation D WHERE   D.partType = 'director' AND D.personId
>> = R.personId AND NOT EXISTS ( SELECT  * FROM    FilmParticipation
>> C WHERE   C.partType = 'cast' AND C.filmId = D.filmId AND
>> C.personId = D.personId ) ) ;
>
>
> [.............]
>
>> We can see that the query plan is very different between versions
>> and that 9.2 is really wrong with the number of rows involved. Why
>> is 9.2 taking so wrong about the number of rows involved in some
>> parts of the plan?
>
>
> Hei
>
> More information:
>
> If we turn off enable_indexscan the runtime gets more similar to the
> one we get with 9.1, we are down to 4200ms.
>
> The query plan with this configuration is here:
> http://explain.depesz.com/s/jVR
>
> The question remains the same, why is 9.2 using such a different and
> bad plan compared to 9.1, when the data and the configuration are the
> same?

Hi,

seems the problem is mostly about the inner-most query, i.e. this:

    SELECT  *
    FROM    FilmParticipation C
    WHERE   C.partType = 'cast'
            AND C.filmId = D.filmId
            AND C.personId = D.personId
        )

In 9.2 it's estimated to return 1 row, but it returns 595612 of them (or
97780 after materialization). I believe this is the culprit that causes
cost estimates that are way off, and that in turn leads to choice of
"cheaper" plan that actually takes much longer to evaluate.

Because the slow plan is estimated to "cost" 122367017.97 while the fast
one 335084834.95 (i.e. 3x more).

I don't immediately see where's the problem - maybe some other hacker on
this list can. Can you prepare a testcase for this? I.e. a structure of
the tables + data so that we can reproduce it?

regards
Tomas



В списке pgsql-performance по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: Cpu usage 100% on slave. s_lock problem.
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: Poor performance on simple queries compared to sql server express