A very long running query....

Поиск
Список
Период
Сортировка
От Ioannis Anagnostopoulos
Тема A very long running query....
Дата
Msg-id 5009CB5A.7070801@anatec.com
обсуждение исходный текст
Ответы Re: A very long running query....
Re: A very long running query....
Список pgsql-performance
Hello,
the following query seems to take ages to get executed. However I am more than sure (as you can see from the explain analyse) that uses all the correct indexes. In general I have serious issues with joins in my database. This is a Postgres ver. 9.0 running postgis with the "_int.sql" contrib enabled. Further more I think that the execution of this query seriously degrades the performance of the database. I had to device this query and run it  like an overnight batch to populate a table as I couldn't afford users to execute it over and over in a "need to do" base. Unfortunately it is still slow and some times it either brings down the whole database (my insertions are buffered on the app server) or it never completes before morning.

SELECT
        src_id,
        date_trunc('day', message_copies.msg_date_rec) as date_count,
        message_copies.pos_georef1,
        message_copies.pos_georef2,
        message_copies.pos_georef3,
        message_copies.pos_georef4,
        ais_server.array_accum(CASE WHEN msg_type BETWEEN  1 and  3 THEN message_copies.msg_id END) as msgA_array,
        ais_server.array_accum(CASE WHEN msg_type = 18 THEN message_copies.msg_id END) as msgB_std_array,
        ais_server.array_accum(CASE WHEN msg_type = 19 THEN message_copies.msg_id END) as msgB_ext_array,
        uniq
        (
            ais_server.array_accum(CASE WHEN obj_type = 'SHIP_TYPE_A' THEN obj_mmsi END)
        ) as mmsi_type_A_array,
        uniq
        (
            ais_server.array_accum(CASE WHEN obj_type = 'SHIP_TYPE_B' THEN obj_mmsi END)
        ) as mmsi_type_B_array,
        avg(ship_speed) / 10.0 as avg_speed,
        avg(ship_heading) as avg_heading,
        avg(ship_course) / 10.0 as avg_course,
        ST_Multi(ST_Collect(ship_pos_messages.pos_point)) as geom
    from
        feed_all_y2012m07.message_copies join 
            (feed_all_y2012m07.ship_pos_messages join ais_server.ship_objects on (ship_pos_messages.obj_id = ship_objects.obj_id))
            on (message_copies.msg_id = ship_pos_messages.msg_id)
    where
        extract('day' from message_copies.msg_date_rec) = 17
        and date_trunc('day', message_copies.msg_date_rec) = '2012-07-17'
        and message_copies.src_id = 1
        and (message_copies.pos_georef1 || message_copies.pos_georef2 || message_copies.pos_georef3 || message_copies.pos_georef4) <> ''
        and not (message_copies.pos_georef1 || message_copies.pos_georef2 || message_copies.pos_georef3 || message_copies.pos_georef4) is null
        and extract('day' from ship_pos_messages.msg_date_rec) = 17
        group by src_id, date_count, message_copies.pos_georef1, message_copies.pos_georef2, message_copies.pos_georef3, message_copies.pos_georef4;


What follows is the Explain Analyze:
"HashAggregate  (cost=21295.20..21298.51 rows=53 width=148) (actual time=17832235.321..17832318.546 rows=2340 loops=1)"
"  ->  Nested Loop  (cost=0.00..21293.21 rows=53 width=148) (actual time=62.188..17801780.764 rows=387105 loops=1)"
"        ->  Nested Loop  (cost=0.00..20942.93 rows=53 width=144) (actual time=62.174..17783236.718 rows=387105 loops=1)"
"              Join Filter: (feed_all_y2012m07.message_copies.msg_id = feed_all_y2012m07.ship_pos_messages.msg_id)"
"              ->  Append  (cost=0.00..19057.93 rows=53 width=33) (actual time=62.124..5486473.545 rows=387524 loops=1)"
"                    ->  Seq Scan on message_copies  (cost=0.00..0.00 rows=1 width=68) (actual time=0.000..0.000 rows=0 loops=1)"
"                          Filter: ((src_id = 1) AND (date_trunc('day'::text, msg_date_rec) = '2012-07-17 00:00:00'::timestamp without time zone) AND (date_part('day'::text, msg_date_rec) = 17::double precision) AND (NOT (((((pos_georef1)::text || (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text) IS NULL)) AND (((((pos_georef1)::text || (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text) <> ''::text))"
"                    ->  Index Scan using idx_message_copies_wk2_date_src_pos_partial on message_copies_wk2 message_copies  (cost=0.00..19057.93 rows=52 width=32) (actual time=62.124..5486270.845 rows=387524 loops=1)"
"                          Index Cond: ((date_trunc('day'::text, msg_date_rec) = '2012-07-17 00:00:00'::timestamp without time zone) AND (src_id = 1))"
"                          Filter: ((date_part('day'::text, msg_date_rec) = 17::double precision) AND (NOT (((((pos_georef1)::text || (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text) IS NULL)) AND (((((pos_georef1)::text || (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text) <> ''::text))"
"              ->  Append  (cost=0.00..35.50 rows=5 width=93) (actual time=31.684..31.724 rows=1 loops=387524)"
"                    ->  Seq Scan on ship_pos_messages  (cost=0.00..0.00 rows=1 width=52) (actual time=0.001..0.001 rows=0 loops=387524)"
"                          Filter: (date_part('day'::text, feed_all_y2012m07.ship_pos_messages.msg_date_rec) = 17::double precision)"
"                    ->  Seq Scan on ship_a_pos_messages ship_pos_messages  (cost=0.00..0.00 rows=1 width=52) (actual time=0.000..0.000 rows=0 loops=387524)"
"                          Filter: (date_part('day'::text, feed_all_y2012m07.ship_pos_messages.msg_date_rec) = 17::double precision)"
"                    ->  Index Scan using ship_b_std_pos_messages_pkey on ship_b_std_pos_messages ship_pos_messages  (cost=0.00..9.03 rows=1 width=120) (actual time=0.008..0.008 rows=0 loops=387524)"
"                          Index Cond: (feed_all_y2012m07.ship_pos_messages.msg_id = feed_all_y2012m07.message_copies.msg_id)"
"                          Filter: (date_part('day'::text, feed_all_y2012m07.ship_pos_messages.msg_date_rec) = 17::double precision)"
"                    ->  Index Scan using ship_b_ext_pos_messages_pkey on ship_b_ext_pos_messages ship_pos_messages  (cost=0.00..7.90 rows=1 width=120) (actual time=0.004..0.004 rows=0 loops=387524)"
"                          Index Cond: (feed_all_y2012m07.ship_pos_messages.msg_id = feed_all_y2012m07.message_copies.msg_id)"
"                          Filter: (date_part('day'::text, feed_all_y2012m07.ship_pos_messages.msg_date_rec) = 17::double precision)"
"                    ->  Index Scan using ship_a_pos_messages_wk2_pkey on ship_a_pos_messages_wk2 ship_pos_messages  (cost=0.00..18.57 rows=1 width=120) (actual time=31.670..31.710 rows=1 loops=387524)"
"                          Index Cond: (feed_all_y2012m07.ship_pos_messages.msg_id = feed_all_y2012m07.message_copies.msg_id)"
"                          Filter: (date_part('day'::text, feed_all_y2012m07.ship_pos_messages.msg_date_rec) = 17::double precision)"
"        ->  Index Scan using ship_objects_pkey on ship_objects  (cost=0.00..6.59 rows=1 width=12) (actual time=0.041..0.044 rows=1 loops=387105)"
"              Index Cond: (ship_objects.obj_id = feed_all_y2012m07.ship_pos_messages.obj_id)"
"Total runtime: 17832338.082 ms"


A few more information: feed_all_y2012m07.message_copies_wk2 has 24.5 million rows only for the 17th of July and more or less the same amount for rows per day since the 15th that I started populating it. So I guess we are looking around 122million rows. The tables are populated with around 16K rows per minute.

As always any help will be greatly appreciated.
Kind Regards
Yiannis

В списке pgsql-performance по дате отправления:

Предыдущее
От: Laszlo Nagy
Дата:
Сообщение: Re: queries are fast after dump->restore but slow again after some days dispite vacuum
Следующее
От: Claudio Freire
Дата:
Сообщение: Re: A very long running query....