Re: less than 2 sec for response - possible?

Поиск
Список
Период
Сортировка
От trafdev
Тема Re: less than 2 sec for response - possible?
Дата
Msg-id 2f7771c1-40f2-8c9b-7c0a-232935ff93ea@mail.ru
обсуждение исходный текст
Ответ на Re: less than 2 sec for response - possible?  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: less than 2 sec for response - possible?
Список pgsql-performance
Thanks Tom.

I've created index on aid, date:

create index aaa on stats.feed_sub(aid,date);

and simplified a query (dropped gran as it's equal for all rows anyway):

SELECT
   sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip,
   sum(stats.feed_sub.c_filt_doubleclick)  AS clicks_on_target,
   sum(stats.feed_sub.c_filt_delay_clicks) AS ip_click_period,
   sum(stats.feed_sub.c_filt_fast_click)   AS fast_click,
   sum(stats.feed_sub.c_filt_ip_mismatch)  AS ip_mismatch,
   sum(stats.feed_sub.c_filt_lng_mismatch) AS lng_mismatch,
   sum(stats.feed_sub.c_filt_ref_mismatch) AS ref_mismatch,
   sum(stats.feed_sub.c_filt_ua_mismatch)  AS ua_mismatch,
   sum(stats.feed_sub.c_filt_url_expired)  AS url_expired,
   stats.feed_sub.subid                    AS stats_feed_sub_subid,
   stats.feed_sub.sid                      AS stats_feed_sub_sid
FROM stats.feed_sub
WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP AND
       stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP AND
       stats.feed_sub.aid = 3
GROUP BY
   stats.feed_sub.subid, stats.feed_sub.sid;

All data is in the cache and it still takes almost 5 seconds to complete:

QUERY PLAN
HashAggregate  (cost=792450.42..803727.24 rows=346979 width=86) (actual
time=4742.145..4882.468 rows=126533 loops=1)
"  Group Key: subid, sid"
   Buffers: shared hit=1350371
   ->  Index Scan using aaa on feed_sub  (cost=0.43..697031.39
rows=3469783 width=86) (actual time=0.026..1655.394 rows=3588376 loops=1)
         Index Cond: ((aid = 3) AND (date >= '2016-06-01
00:00:00'::timestamp without time zone) AND (date <= '2016-06-30
00:00:00'::timestamp without time zone))
         Buffers: shared hit=1350371
Planning time: 0.159 ms
Execution time: 4899.934 ms

It's better, but still is far from "<2 secs" goal.

Any thoughts?


On 07/01/16 18:23, Tom Lane wrote:
> trafdev <trafdev@mail.ru> writes:
>> CREATE INDEX ix_feed_sub_date
>>    ON stats.feed_sub
>>    USING brin
>>    (date);
>
>> CREATE UNIQUE INDEX ixu_feed_sub
>>    ON stats.feed_sub
>>    USING btree
>>    (date, gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default");
>
>> HashAggregate  (cost=901171.72..912354.97 rows=344100 width=86) (actual
>> time=7207.825..7335.473 rows=126044 loops=1)
>> "  Group Key: subid, sid"
>>    Buffers: shared hit=3635804
>>    ->  Index Scan using ixu_feed_sub on feed_sub  (cost=0.56..806544.38
>> rows=3440994 width=86) (actual time=0.020..3650.208 rows=3578344 loops=1)
>>          Index Cond: ((date >= '2016-06-01 00:00:00'::timestamp without
>> time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time
>> zone) AND (gran = '1 day'::interval) AND (aid = 3))
>>          Buffers: shared hit=3635804
>> Planning time: 0.150 ms
>> Execution time: 7352.009 ms
>
> Neither of those indexes is terribly well designed for this query.
> A btree index on (aid, gran, date) or (gran, aid, date) would work
> much better.  See
>
> https://www.postgresql.org/docs/9.5/static/indexes-multicolumn.html
>
> You could rearrange the column order in that giant unique index
> and get some of the benefit.  But if you're desperate to optimize
> this particular query, an index not bearing so many irrelevant columns
> would probably be better for it.
>
> An alternative way of thinking would be to create an index with those
> three leading columns and then all of the other columns used by this
> query as later columns.  That would be an even larger index, but it would
> allow an index-only scan, which might be quite a lot faster.  The fact
> that you seem to be hitting about one page for each row retrieved says
> that the data you need is pretty badly scattered, so constructing an index
> that concentrates everything you need into one range of the index might
> be the ticket.
>
> Either of these additional-index ideas is going to penalize table
> insertions/updates, so keep an eye on that end of the performance
> question too.
>
>             regards, tom lane
>


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: less than 2 sec for response - possible?
Следующее
От: devel.brain99@xoxy.net
Дата:
Сообщение: Re: Random slow queries