Re: [PERFORM] Very poor read performance, query independent

Поиск
Список
Период
Сортировка
От Charles Nadeau
Тема Re: [PERFORM] Very poor read performance, query independent
Дата
Msg-id CADFyZw4Fsj4xzFXYSzBGe0kwhDkV2-2bzJHWA+q9GP5T2C_vKg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Very poor read performance, query independent  (Igor Neyman <ineyman@perceptron.com>)
Список pgsql-performance
Igor,

The 1st clause of the where statement won't select addresses the same way as the one I wrote using the extension for IPv6 and IPv6 data types.

flowstimestamp is a view:
flows=# \d+ flowstimestamp 
                        View "public.flowstimestamp"
   Column    |           Type           | Modifiers | Storage | Description 
-------------+--------------------------+-----------+---------+-------------
 flow_id     | bigint                   |           | plain   | 
 unix_secs   | bigint                   |           | plain   | 
 unix_nsecs  | bigint                   |           | plain   | 
 sysuptime   | bigint                   |           | plain   | 
 exaddr      | ip4                      |           | plain   | 
 dpkts       | integer                  |           | plain   | 
 doctets     | bigint                   |           | plain   | 
 first       | bigint                   |           | plain   | 
 last        | bigint                   |           | plain   | 
 engine_type | smallint                 |           | plain   | 
 engine_id   | smallint                 |           | plain   | 
 srcaddr     | ip4                      |           | plain   | 
 dstaddr     | ip4                      |           | plain   | 
 nexthop     | ip4                      |           | plain   | 
 input       | integer                  |           | plain   | 
 output      | integer                  |           | plain   | 
 srcport     | integer                  |           | plain   | 
 dstport     | integer                  |           | plain   | 
 prot        | smallint                 |           | plain   | 
 tos         | smallint                 |           | plain   | 
 tcp_flags   | smallint                 |           | plain   | 
 src_mask    | smallint                 |           | plain   | 
 dst_mask    | smallint                 |           | plain   | 
 src_as      | integer                  |           | plain   | 
 dst_as      | integer                  |           | plain   | 
 timestamp   | timestamp with time zone |           | plain   | 
View definition:
 SELECT flows.flow_id,
    flows.unix_secs,
    flows.unix_nsecs,
    flows.sysuptime,
    flows.exaddr,
    flows.dpkts,
    flows.doctets,
    flows.first,
    flows.last,
    flows.engine_type,
    flows.engine_id,
    flows.srcaddr,
    flows.dstaddr,
    flows.nexthop,
    flows.input,
    flows.output,
    flows.srcport,
    flows.dstport,
    flows.prot,
    flows.tos,
    flows.tcp_flags,
    flows.src_mask,
    flows.dst_mask,
    flows.src_as,
    flows.dst_as,
    to_timestamp((flows.unix_secs + flows.unix_nsecs / 1000000000)::double precision) AS "timestamp"
   FROM flows;

And it can use the indexes of flows:
Indexes:
    "flows_pkey" PRIMARY KEY, btree (flow_id)
    "flows_dstaddr_dstport" btree (dstaddr, dstport)
    "flows_srcaddr_dstaddr_idx" btree (srcaddr, dstaddr)
    "flows_srcaddr_srcport" btree (srcaddr, srcport)
    "flows_srcport_dstport_idx" btree (srcport, dstport)

Thanks!

Charles

On Fri, Jul 14, 2017 at 10:18 PM, Igor Neyman <ineyman@perceptron.com> wrote:

 

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Igor Neyman
Sent: Friday, July 14, 2017 3:13 PM
To: Charles Nadeau <charles.nadeau@gmail.com>


Cc: Jeff Janes <jeff.janes@gmail.com>; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very poor read performance, query independent

 

From: Charles Nadeau [mailto:charles.nadeau@gmail.com]
Sent: Friday, July 14, 2017 11:35 AM
To: Igor Neyman <ineyman@perceptron.com>
Cc: Jeff Janes <jeff.janes@gmail.com>; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Very poor read performance, query independent

 

Igor,

 

Initially temp_buffer was left to its default value (8MB). Watching the content of the directory that stores the temporary files, I found that I need at most 21GB of temporary files space. Should I set temp_buffer to 21GB?

Here is the explain you requested with work_mem set to 6GB:

flows=# set work_mem='6GB';

SET

flows=# explain (analyze, buffers) SELECT DISTINCT

   srcaddr,

   dstaddr,

   dstport,

   COUNT(*) AS conversation,

   SUM(doctets) / 1024 / 1024 AS mbytes

FROM

   flowscompact,

   mynetworks

WHERE

   mynetworks.ipaddr >>= flowscompact.srcaddr

   AND dstaddr IN

   (

      SELECT

         dstaddr

      FROM

         dstexterne

   )

GROUP BY

   srcaddr,

   dstaddr,

   dstport

ORDER BY

   mbytes DESC LIMIT 50;

                                                                                           QUERY PLAN                                                                                           

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 Limit  (cost=48135680.07..48135680.22 rows=50 width=52) (actual time=2227678.196..2227678.223 rows=50 loops=1)

   Buffers: shared hit=728798038 read=82974833, temp read=381154 written=381154

   ->  Unique  (cost=48135680.07..48143613.62 rows=2644514 width=52) (actual time=2227678.194..2227678.217 rows=50 loops=1)

         Buffers: shared hit=728798038 read=82974833, temp read=381154 written=381154

         ->  Sort  (cost=48135680.07..48137002.33 rows=2644514 width=52) (actual time=2227678.192..2227678.202 rows=50 loops=1)

               Sort Key: (((sum(flows.doctets) / '1024'::numeric) / '1024'::numeric)) DESC, flows.srcaddr, flows.dstaddr, flows.dstport, (count(*))

               Sort Method: quicksort  Memory: 654395kB

               Buffers: shared hit=728798038 read=82974833, temp read=381154 written=381154

               ->  GroupAggregate  (cost=48059426.65..48079260.50 rows=2644514 width=52) (actual time=2167909.030..2211446.192 rows=5859671 loops=1)

                     Group Key: flows.srcaddr, flows.dstaddr, flows.dstport

                     Buffers: shared hit=728798038 read=82974833, temp read=381154 written=381154

                     ->  Sort  (cost=48059426.65..48060748.90 rows=2644514 width=20) (actual time=2167896.815..2189107.205 rows=91745640 loops=1)

                           Sort Key: flows.srcaddr, flows.dstaddr, flows.dstport

                           Sort Method: external merge  Disk: 3049216kB

                           Buffers: shared hit=728798038 read=82974833, temp read=381154 written=381154

                           ->  Gather  (cost=30060688.07..48003007.07 rows=2644514 width=20) (actual time=1268989.000..1991357.232 rows=91745640 loops=1)

                                 Workers Planned: 12

                                 Workers Launched: 12

                                 Buffers: shared hit=728798037 read=82974833

                                 ->  Hash Semi Join  (cost=30059688.07..47951761.31 rows=220376 width=20) (actual time=1268845.181..2007864.725 rows=7057357 loops=13)

                                       Hash Cond: (flows.dstaddr = flows_1.dstaddr)

                                       Buffers: shared hit=728795193 read=82974833

                                       ->  Nested Loop  (cost=0.03..17891246.86 rows=220376 width=20) (actual time=0.207..723790.283 rows=37910370 loops=13)

                                             Buffers: shared hit=590692229 read=14991777

                                             ->  Parallel Seq Scan on flows  (cost=0.00..16018049.14 rows=55094048 width=20) (actual time=0.152..566179.117 rows=45371630 loops=13)

                                                   Buffers: shared hit=860990 read=14991777

                                             ->  Index Only Scan using mynetworks_ipaddr_idx on mynetworks  (cost=0.03..0.03 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=589831190)

                                                   Index Cond: (ipaddr >>= (flows.srcaddr)::ip4r)

                                                   Heap Fetches: 0

                                                   Buffers: shared hit=589831203

                                       ->  Hash  (cost=30059641.47..30059641.47 rows=13305 width=4) (actual time=1268811.101..1268811.101 rows=3803508 loops=13)

                                             Buckets: 4194304 (originally 16384)  Batches: 1 (originally 1)  Memory Usage: 166486kB

                                             Buffers: shared hit=138102964 read=67983056

                                             ->  HashAggregate  (cost=30059561.64..30059601.56 rows=13305 width=4) (actual time=1265248.165..1267432.083 rows=3803508 loops=13)

                                                   Group Key: flows_1.dstaddr

                                                   Buffers: shared hit=138102964 read=67983056

                                                   ->  Nested Loop Anti Join  (cost=0.00..29729327.92 rows=660467447 width=4) (actual time=0.389..1201072.707 rows=125838232 loops=13)

                                                         Join Filter: (mynetworks_1.ipaddr >> (flows_1.dstaddr)::ip4r)

                                                         Rows Removed by Join Filter: 503353617

                                                         Buffers: shared hit=138102964 read=67983056

                                                         ->  Seq Scan on flows flows_1  (cost=0.00..17836152.73 rows=661128576 width=4) (actual time=0.322..343152.274 rows=589831190 loops=13)

                                                               Buffers: shared hit=138102915 read=67983056

                                                         ->  Materialize  (cost=0.00..1.02 rows=4 width=8) (actual time=0.000..0.000 rows=2 loops=7667805470)

                                                               Buffers: shared hit=13

                                                               ->  Seq Scan on mynetworks mynetworks_1  (cost=0.00..1.01 rows=4 width=8) (actual time=0.006..0.007 rows=4 loops=13)

                                                                     Buffers: shared hit=13

 Planning time: 0.941 ms

 Execution time: 2228345.171 ms

(48 rows)

 

With a work_mem at 6GB, I noticed that for the first 20 minutes the query was running, the i/o wait was much lower, hovering aroun 3% then it jumped 45% until almost the end of the query. 

 

flowscompact and dstexterne are actually views. I use views to simplify query writing and to "abstract" queries that are use often in other queries. flowscompact is a view built on table flows (having about 590 million rows), it only keeps the most often used fields.

flows=# \d+ flowscompact;

                        View "public.flowscompact"

  Column   |           Type           | Modifiers | Storage | Description 

-----------+--------------------------+-----------+---------+-------------

 flow_id   | bigint                   |           | plain   | 

 sysuptime | bigint                   |           | plain   | 

 exaddr    | ip4                      |           | plain   | 

 dpkts     | integer                  |           | plain   | 

 doctets   | bigint                   |           | plain   | 

 first     | bigint                   |           | plain   | 

 last      | bigint                   |           | plain   | 

 srcaddr   | ip4                      |           | plain   | 

 dstaddr   | ip4                      |           | plain   | 

 srcport   | integer                  |           | plain   | 

 dstport   | integer                  |           | plain   | 

 prot      | smallint                 |           | plain   | 

 tos       | smallint                 |           | plain   | 

 tcp_flags | smallint                 |           | plain   | 

 timestamp | timestamp with time zone |           | plain   | 

View definition:

 SELECT flowstimestamp.flow_id,

    flowstimestamp.sysuptime,

    flowstimestamp.exaddr,

    flowstimestamp.dpkts,

    flowstimestamp.doctets,

    flowstimestamp.first,

    flowstimestamp.last,

    flowstimestamp.srcaddr,

    flowstimestamp.dstaddr,

    flowstimestamp.srcport,

    flowstimestamp.dstport,

    flowstimestamp.prot,

    flowstimestamp.tos,

    flowstimestamp.tcp_flags,

    flowstimestamp."timestamp"

   FROM flowstimestamp;

mynetworks is a table having one column and 4 rows; it contains a list of our network networks:

flows=# select * from mynetworks;

     ipaddr     

----------------

(4 row)

flows=# \d+ mynetworks;

                    Table "public.mynetworks"

 Column | Type | Modifiers | Storage | Stats target | Description 

--------+------+-----------+---------+--------------+-------------

 ipaddr | ip4r |           | plain   |              | 

Indexes:

    "mynetworks_ipaddr_idx" gist (ipaddr)

dstexterne is a view listing all the destination IPv4 addresses not inside our network; it has one column and 3.8 million rows.

flows=# \d+ dstexterne;

              View "public.dstexterne"

 Column  | Type | Modifiers | Storage | Description 

---------+------+-----------+---------+-------------

 dstaddr | ip4  |           | plain   | 

View definition:

 SELECT DISTINCT flowscompact.dstaddr

   FROM flowscompact

     LEFT JOIN mynetworks ON mynetworks.ipaddr >> flowscompact.dstaddr::ip4r

  WHERE mynetworks.ipaddr IS NULL;

Thanks!

 

Charles

 

Charles,

 

Also, let’s try to simplify your query and see if it performs better.

You are grouping by srcaddr, dstaddr, dstport, that makes DISTINCT not needed.

And after simplifying WHERE clause (let me know if the result is not what you want), the query looks like:

 

SELECT srcaddr, dstaddr, dstport,

   COUNT(*) AS conversation,

   SUM(doctets) / 1024 / 1024 AS mbytes

FROM flowscompact

WHERE srcaddr IN (SELECT ipaddr FROM mynetworks)

     AND dstaddr NOT IN (SELECT ipaddr FROM mynetworks)

GROUP BY srcaddr, dstaddr, dstport

ORDER BY mbytes DESC

LIMIT 50;

 

Now, you didn’t provide the definition of flowstimestamp table.

If this table doesn’t have an index on (srcaddr, dstaddr, dstport) creating one should help (I think).

 

Igor

 

 

 




--

В списке pgsql-performance по дате отправления:

Предыдущее
От: Charles Nadeau
Дата:
Сообщение: Re: [PERFORM] Very poor read performance, query independent
Следующее
От: Claudio Freire
Дата:
Сообщение: Re: [PERFORM] Very poor read performance, query independent