less than 2 sec for response - possible?

From       trafdev
Subject    less than 2 sec for response - possible?
Msg-id     38b8542e-e011-370d-3863-207126ac9b33@mail.ru
Responses  Re: less than 2 sec for response - possible?  (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: less than 2 sec for response - possible?  (Torsten Zuehlsdorff <mailinglists@toco-domains.de>)
List       pgsql-performance
Hi.

I'm trying to build an OLAP-oriented DB based on PostgreSQL.

The user works with a paginated report in the web browser. The interface
allows fetching data for a custom date-range selection, displaying
individual rows (20-50 per page) and totals (for the entire selection,
including rows not visible on the current page), and sorting by any column.

The main goal is to deliver results of the basic SELECT queries to the
end-user in less than 2 seconds.

I was able to achieve that except for one table (the biggest one).

It consists of multiple dimensions (date, gran, aid, pid, sid, fid,
subid) and metrics (see below).
The user can filter by any dimension and sort by any metric.

Here is a CREATE script for this table:

CREATE TABLE stats.feed_sub
(
   date date NOT NULL,
   gran interval NOT NULL,
   aid smallint NOT NULL,
   pid smallint NOT NULL,
   sid smallint NOT NULL,
   fid smallint NOT NULL,
   subid text NOT NULL,
   rev_est_pub real NOT NULL,
   rev_est_feed real NOT NULL,
   rev_raw real NOT NULL,
   c_total bigint NOT NULL,
   c_passed bigint NOT NULL,
   q_total bigint NOT NULL,
   q_passed bigint NOT NULL,
   q_filt_geo bigint NOT NULL,
   q_filt_browser bigint NOT NULL,
   q_filt_os bigint NOT NULL,
   q_filt_ip bigint NOT NULL,
   q_filt_subid bigint NOT NULL,
   q_filt_pause bigint NOT NULL,
   q_filt_click_cap_ip bigint NOT NULL,
   q_filt_query_cap bigint NOT NULL,
   q_filt_click_cap bigint NOT NULL,
   q_filt_rev_cap bigint NOT NULL,
   q_filt_erpm_floor bigint NOT NULL,
   c_filt_click_cap_ip bigint NOT NULL,
   c_filt_doubleclick bigint NOT NULL,
   c_filt_url_expired bigint NOT NULL,
   c_filt_fast_click bigint NOT NULL,
   c_filt_delay_clicks bigint NOT NULL,
   c_filt_ip_mismatch bigint NOT NULL,
   c_filt_ref_mismatch bigint NOT NULL,
   c_filt_lng_mismatch bigint NOT NULL,
   c_filt_ua_mismatch bigint NOT NULL,
   res_impr bigint NOT NULL,
   rev_ver_pub real,
   rev_ver_feed real,
   c_ver bigint,
   q_filt_ref bigint NOT NULL
)
WITH (
   OIDS=FALSE
);

CREATE INDEX ix_feed_sub_date
   ON stats.feed_sub
   USING brin
   (date);

CREATE UNIQUE INDEX ixu_feed_sub
   ON stats.feed_sub
   USING btree
   (date, gran, aid, pid, sid, fid, subid COLLATE pg_catalog."default");
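
I'm also wondering whether an index that leads with the equality-matched
columns (aid, gran) and puts the range column (date) last would keep the
matching rows closer together; something along these lines (just an idea,
index name made up, not tested):

CREATE INDEX ix_feed_sub_aid_gran_date
   ON stats.feed_sub
   USING btree
   (aid, gran, date);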

Here is some sizing info (https://wiki.postgresql.org/wiki/Disk_Usage):

relation             size
stats.feed_sub       5644 MB
stats.ixu_feed_sub   1594 MB

row_estimate: 15865627

Here is the typical query (for the "totals" part):
SELECT
   sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip,
   sum(stats.feed_sub.c_filt_doubleclick)  AS clicks_on_target,
   sum(stats.feed_sub.c_filt_delay_clicks) AS ip_click_period,
   sum(stats.feed_sub.c_filt_fast_click)   AS fast_click,
   sum(stats.feed_sub.c_filt_ip_mismatch)  AS ip_mismatch,
   sum(stats.feed_sub.c_filt_lng_mismatch) AS lng_mismatch,
   sum(stats.feed_sub.c_filt_ref_mismatch) AS ref_mismatch,
   sum(stats.feed_sub.c_filt_ua_mismatch)  AS ua_mismatch,
   sum(stats.feed_sub.c_filt_url_expired)  AS url_expired,
   stats.feed_sub.subid                    AS stats_feed_sub_subid,
   stats.feed_sub.sid                      AS stats_feed_sub_sid
FROM stats.feed_sub
WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP AND
       stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP AND
       stats.feed_sub.gran = '1 day'
       AND stats.feed_sub.aid = 3
GROUP BY
   stats.feed_sub.subid, stats.feed_sub.sid;
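
For reference, the plan below was captured with EXPLAIN (ANALYZE, BUFFERS),
i.e. a trimmed single-metric version of the query above looks like:

EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(stats.feed_sub.c_filt_click_cap_ip) AS clicks_from_ip,
       stats.feed_sub.subid, stats.feed_sub.sid
FROM stats.feed_sub
WHERE stats.feed_sub.date >= '2016-06-01' :: TIMESTAMP
  AND stats.feed_sub.date <= '2016-06-30' :: TIMESTAMP
  AND stats.feed_sub.gran = '1 day'
  AND stats.feed_sub.aid = 3
GROUP BY stats.feed_sub.subid, stats.feed_sub.sid;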

QUERY PLAN
HashAggregate  (cost=901171.72..912354.97 rows=344100 width=86) (actual time=7207.825..7335.473 rows=126044 loops=1)
  Group Key: subid, sid
  Buffers: shared hit=3635804
  ->  Index Scan using ixu_feed_sub on feed_sub  (cost=0.56..806544.38 rows=3440994 width=86) (actual time=0.020..3650.208 rows=3578344 loops=1)
        Index Cond: ((date >= '2016-06-01 00:00:00'::timestamp without time zone) AND (date <= '2016-06-30 00:00:00'::timestamp without time zone) AND (gran = '1 day'::interval) AND (aid = 3))
        Buffers: shared hit=3635804
Planning time: 0.150 ms
Execution time: 7352.009 ms

As you can see, the index scan alone takes 3.6 seconds (and the data is
already in RAM).
Add roughly 3 seconds for grouping and 1-2 seconds for network transfer,
and I'm completely out of my "sub 2 seconds" budget.
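
One thing I'm considering (only a sketch, table/index names made up, not
tested) is rolling the '1 day' rows up into a narrower summary table keyed
by (date, aid, sid, subid), so the totals query scans far fewer rows:

CREATE TABLE stats.feed_sub_daily AS
SELECT date, aid, sid, subid,
       sum(c_filt_click_cap_ip) AS c_filt_click_cap_ip,
       sum(c_filt_doubleclick)  AS c_filt_doubleclick
       -- ...remaining c_filt_* metrics summed the same way
FROM stats.feed_sub
WHERE gran = '1 day'
GROUP BY date, aid, sid, subid;

CREATE INDEX ix_feed_sub_daily ON stats.feed_sub_daily (aid, date);

The rollup would have to be refreshed as new data arrives, of course.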

My questions are:
1. Am I using the right DB/architecture to achieve my goal? Is there a
better solution for this?
2. Have I reached some physical limit? Would installing faster RAM/CPU help?

Thanks in advance!

Server config:

OS:
 > uname -a
FreeBSD sqldb 10.2-RELEASE-p9 FreeBSD 10.2-RELEASE-p9 #0: Thu Jan 14
01:32:46 UTC 2016
root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64


CPU: Intel(R) Xeon(R) CPU E5-1630 v3
 > sysctl -a | egrep -i 'hw.machine|hw.model|hw.ncpu'
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz
hw.ncpu: 8
hw.machine_arch: amd64


MEM: 64GB
 > sysctl hw.physmem
hw.physmem: 68572983296


HDD: 2x480GB SSD (ZFS mirror)
 > camcontrol devlist
<INTEL SSDSC2BB480H4 D2010380>     at scbus5 target 0 lun 0 (ada0,pass1)
<INTEL SSDSC2BB480H4 D2010380>     at scbus6 target 0 lun 0 (ada1,pass2)


FS:
 > zfs list
NAME                    USED  AVAIL  REFER  MOUNTPOINT
zroot                  36.5G   390G    96K  /zroot
...
zroot/ara/sqldb/pgsql  33.7G   390G  33.7G  /ara/sqldb/pgsql

 > zfs get primarycache,recordsize,logbias,compression zroot/ara/sqldb/pgsql
NAME                   PROPERTY      VALUE         SOURCE
zroot/ara/sqldb/pgsql  primarycache  all           local
zroot/ara/sqldb/pgsql  recordsize    8K            local
zroot/ara/sqldb/pgsql  logbias       latency       local
zroot/ara/sqldb/pgsql  compression   lz4           inherited from zroot


Misc:
 > cat /etc/sysctl.conf
vfs.zfs.metaslab.lba_weighting_enabled=0


Postgres:
 > /usr/local/bin/postgres --version
postgres (PostgreSQL) 9.5.3

 > cat postgresql.conf:
...
listen_addresses = '*'

max_connections = 100
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 500MB
maintenance_work_mem = 2GB
min_wal_size = 4GB
max_wal_size = 8GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1

log_lock_waits = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_destination = 'csvlog'
logging_collector = on
log_min_duration_statement = 10000

shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 10000
track_io_timing = on

