Re: Query-Planer from 6seconds TO DAYS

Поиск
Список
Период
Сортировка
От Böckler Andreas
Тема Re: Query-Planer from 6seconds TO DAYS
Дата
Msg-id 4FCA6A79-9D9B-4576-A661-4C290C1B6EF7@boeckler.org
обсуждение исходный текст
Ответ на Query-Planer from 6seconds TO DAYS  (Böckler Andreas <andy@boeckler.org>)
Ответы Re: Query-Planer from 6seconds TO DAYS
Список pgsql-performance
Hi Jeff,

thanks for your answer!

Am 24.10.2012 um 19:00 schrieb Jeff Janes:

> On Wed, Oct 24, 2012 at 8:41 AM, Böckler Andreas <andy@boeckler.org> wrote:
>
>> SELECT  m.machine_id, s.timestamp, s.errorcode
>> FROM events m INNER JOIN spsdata as s ON (m.machine_id= s.machine_id
>
> m.machine_id is equal to itself?  you must be retyping the query by hand…
Yes I did …  i changed the vars from german to english ..
That should be m.machine_id=s.machine_id
>
> You should report the results of "EXPLAIN ANALYZE" rather than merely
> EXPLAIN, as that would make it much easier to verify where the
> selectivity estimates are off.
>
OK ..
i can do that for the FAST query.
But the other one would take days. (see below )

>
>> FAST:
>>                                                                            QUERY PLAN
>>
------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Nested Loop  (cost=0.00..144979241.24 rows=42662 width=14)
>>   Join Filter: ((s."timestamp" <= m."timestamp") AND (m.machine_id = s.machine_id) AND (s."timestamp" >=
(m."timestamp"- '00:00:30'::interval))) 
>>   ->  Index Scan using events_code on events m  (cost=0.00..4911.18 rows=25 width=12)
>>         Index Cond: (code = 2024)
>>         Filter: (("timestamp" >= '2012-08-14 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2012-08-26
00:00:00'::timestampwithout time zone)) 
>>   ->  Append  (cost=0.00..5770958.44 rows=1400738 width=14)
>>         ->  Index Scan using spsdata_machine_id on spsdata s  (cost=0.00..4.11 rows=1 width=14)
>>               Index Cond: (s.machine_id = m.machine_id)
>
> Was there more to the plan that you snipped?  If not, why isn't it
> checking all the other partitions?

Your right. It's checking all partitions!. So the constraint exclusion doesn't kick in.
This can be fixed with
SELECT
    m.machine_id, s.timestamp, s.errorcode
FROM
    events m
    INNER JOIN spsdata as s ON (m.machine_id=s.machine_id AND s.timestamp BETWEEN m.timestamp - interval '30 seconds'
ANDm.timestamp) 
WHERE
    m.code IN (2024)
    AND m.timestamp  BETWEEN '2012-08-01' AND '2012-08-29'
    AND s.timestamp  BETWEEN '2012-08-01' AND '2012-08-29'
    AND s.errorcode in ('2024');

It doesn't take hours to end, but it's not the performance gain you would expect.

I'v changed the query to one partition spsdata_2012m08  and attached the slow and fast cases with EXPLAIN ANALYZE.

The difference is one day in the WHERE-Clause
290.581 ms VS  687887.674 ms !
Thats 2372 times slower.

How can i force the fast query plan in a select?

At least I know that spsdata_2012m08 has way more records than events
spsdata_2012m08: reltuples -> 5.74082 * 10^7
events: count(1) for that time range -> 51383

>
> If you can't fix the selectivity estimates, one thing you could do to
> drive it to the faster query is to decrease random_page_cost to be the
> same seq_page_cost.  That should push the cross-over point to the
> sequential scan out to a region you might not care about.  However, it
> could also drive other queries in your system to use worse plans than
> they currently are.
> Or, you could "set enable_seqscan = off" before running this
> particular query, then reset it afterwards.
>
> Cheers,
>
> Jeff
I've played with seq_page_cost and enable_seqscan already, but you have to know the right values before SELECT to get
goodresults ;) 

Cheers,

Andy


--
Andreas Böckler
andy@boeckler.org

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Shaun Thomas
Дата:
Сообщение: Setting Statistics on Functional Indexes
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Setting Statistics on Functional Indexes