Hi Jeff,
thanks for your answer!
Am 24.10.2012 um 19:00 schrieb Jeff Janes:
> On Wed, Oct 24, 2012 at 8:41 AM, Böckler Andreas <andy@boeckler.org> wrote:
>
>> SELECT m.machine_id, s.timestamp, s.errorcode
>> FROM events m INNER JOIN spsdata as s ON (m.machine_id= s.machine_id
>
> m.machine_id is equal to itself? you must be retyping the query by hand…
Yes I did … i changed the vars from german to english ..
That should be m.machine_id=s.machine_id
>
> You should report the results of "EXPLAIN ANALYZE" rather than merely
> EXPLAIN, as that would make it much easier to verify where the
> selectivity estimates are off.
>
OK ..
i can do that for the FAST query.
But the other one would take days. (see below )
>
>> FAST:
>> QUERY PLAN
>>
------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> Nested Loop (cost=0.00..144979241.24 rows=42662 width=14)
>> Join Filter: ((s."timestamp" <= m."timestamp") AND (m.machine_id = s.machine_id) AND (s."timestamp" >=
(m."timestamp"- '00:00:30'::interval)))
>> -> Index Scan using events_code on events m (cost=0.00..4911.18 rows=25 width=12)
>> Index Cond: (code = 2024)
>> Filter: (("timestamp" >= '2012-08-14 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2012-08-26
00:00:00'::timestampwithout time zone))
>> -> Append (cost=0.00..5770958.44 rows=1400738 width=14)
>> -> Index Scan using spsdata_machine_id on spsdata s (cost=0.00..4.11 rows=1 width=14)
>> Index Cond: (s.machine_id = m.machine_id)
>
> Was there more to the plan that you snipped? If not, why isn't it
> checking all the other partitions?
Your right. It's checking all partitions!. So the constraint exclusion doesn't kick in.
This can be fixed with
SELECT
m.machine_id, s.timestamp, s.errorcode
FROM
events m
INNER JOIN spsdata as s ON (m.machine_id=s.machine_id AND s.timestamp BETWEEN m.timestamp - interval '30 seconds'
ANDm.timestamp)
WHERE
m.code IN (2024)
AND m.timestamp BETWEEN '2012-08-01' AND '2012-08-29'
AND s.timestamp BETWEEN '2012-08-01' AND '2012-08-29'
AND s.errorcode in ('2024');
It doesn't take hours to end, but it's not the performance gain you would expect.
I'v changed the query to one partition spsdata_2012m08 and attached the slow and fast cases with EXPLAIN ANALYZE.
The difference is one day in the WHERE-Clause
290.581 ms VS 687887.674 ms !
Thats 2372 times slower.
How can i force the fast query plan in a select?
At least I know that spsdata_2012m08 has way more records than events
spsdata_2012m08: reltuples -> 5.74082 * 10^7
events: count(1) for that time range -> 51383
>
> If you can't fix the selectivity estimates, one thing you could do to
> drive it to the faster query is to decrease random_page_cost to be the
> same seq_page_cost. That should push the cross-over point to the
> sequential scan out to a region you might not care about. However, it
> could also drive other queries in your system to use worse plans than
> they currently are.
> Or, you could "set enable_seqscan = off" before running this
> particular query, then reset it afterwards.
>
> Cheers,
>
> Jeff
I've played with seq_page_cost and enable_seqscan already, but you have to know the right values before SELECT to get
goodresults ;)
Cheers,
Andy
--
Andreas Böckler
andy@boeckler.org