Re: less than 2 sec for response - possible?
От | trafdev |
---|---|
Тема | Re: less than 2 sec for response - possible? |
Дата | |
Msg-id | 2f7771c1-40f2-8c9b-7c0a-232935ff93ea@mail.ru обсуждение исходный текст |
Ответ на | Re: less than 2 sec for response - possible? (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: less than 2 sec for response - possible?
|
Список | pgsql-performance |
Thanks Tom. I've created index on aid, date: create index aaa on stats.feed_sub(aid,date); and simplified a query (dropped gran as it's equal for all rows anyway): SELECT sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip, sum(stats.feed_sub.c_filt_doubleclick) AS clicks_on_target, sum(stats.feed_sub.c_filt_delay_clicks) AS ip_click_period, sum(stats.feed_sub.c_filt_fast_click) AS fast_click, sum(stats.feed_sub.c_filt_ip_mismatch) AS ip_mismatch, sum(stats.feed_sub.c_filt_lng_mismatch) AS lng_mismatch, sum(stats.feed_sub.c_filt_ref_mismatch) AS ref_mismatch, sum(stats.feed_sub.c_filt_ua_mismatch) AS ua_mismatch, sum(stats.feed_sub.c_filt_url_expired) AS url_expired, stats.feed_sub.subid AS stats_feed_sub_subid, stats.feed_sub.sid AS stats_feed_sub_sid FROM stats.feed_sub WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP AND stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP AND stats.feed_sub.aid = 3 GROUP BY stats.feed_sub.subid, stats.feed_sub.sid; All data is in the cache and it still takes almost 5 seconds to complete: QUERY PLAN HashAggregate (cost=792450.42..803727.24 rows=346979 width=86) (actual time=4742.145..4882.468 rows=126533 loops=1) " Group Key: subid, sid" Buffers: shared hit=1350371 -> Index Scan using aaa on feed_sub (cost=0.43..697031.39 rows=3469783 width=86) (actual time=0.026..1655.394 rows=3588376 loops=1) Index Cond: ((aid = 3) AND (date >= '2016-06-01 00:00:00'::timestamp without time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time zone)) Buffers: shared hit=1350371 Planning time: 0.159 ms Execution time: 4899.934 ms It's better, but still is far from "<2 secs" goal. Any thoughts? On 07/01/16 18:23, Tom Lane wrote: > trafdev <trafdev@mail.ru> writes: >> CREATE INDEX ix_feed_sub_date >> ON stats.feed_sub >> USING brin >> (date); > >> CREATE UNIQUE INDEX ixu_feed_sub >> ON stats.feed_sub >> USING btree >> (date, gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default"); > >> HashAggregate (cost=901171.72..912354.97 rows=344100 width=86) (actual >> time=7207.825..7335.473 rows=126044 loops=1) >> " Group Key: subid, sid" >> Buffers: shared hit=3635804 >> -> Index Scan using ixu_feed_sub on feed_sub (cost=0.56..806544.38 >> rows=3440994 width=86) (actual time=0.020..3650.208 rows=3578344 loops=1) >> Index Cond: ((date >= '2016-06-01 00:00:00'::timestamp without >> time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time >> zone) AND (gran = '1 day'::interval) AND (aid = 3)) >> Buffers: shared hit=3635804 >> Planning time: 0.150 ms >> Execution time: 7352.009 ms > > Neither of those indexes is terribly well designed for this query. > A btree index on (aid, gran, date) or (gran, aid, date) would work > much better. See > > https://www.postgresql.org/docs/9.5/static/indexes-multicolumn.html > > You could rearrange the column order in that giant unique index > and get some of the benefit. But if you're desperate to optimize > this particular query, an index not bearing so many irrelevant columns > would probably be better for it. > > An alternative way of thinking would be to create an index with those > three leading columns and then all of the other columns used by this > query as later columns. That would be an even larger index, but it would > allow an index-only scan, which might be quite a lot faster. The fact > that you seem to be hitting about one page for each row retrieved says > that the data you need is pretty badly scattered, so constructing an index > that concentrates everything you need into one range of the index might > be the ticket. > > Either of these additional-index ideas is going to penalize table > insertions/updates, so keep an eye on that end of the performance > question too. > > regards, tom lane >
В списке pgsql-performance по дате отправления: