Discussion: Fast insert, but slow join and updates for table with 4 billion rows


Fast insert, but slow join and updates for table with 4 billion rows

From
Lars Aksel Opsahl
Date:


Hi


I have two main problems: slow updates and slow joins. When I build up the table met_vaer_wisline.nora_bc25_observation to more than 4 billion rows, we are able to insert about 85.000 rows per second, so that part is fine.


The problems start when I need to update this table or join it with other tables.


In this example I have two tables, one with 4 billion rows and another with 50000 rows. A standard simple join between these two tables takes 397391 ms with this SQL (the query plan is added further down):

SELECT o.*

FROM 

met_vaer_wisline.nora_bc25_observation o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch


but if I use this SQL it takes 25727 ms (the query plan is added further down):

SELECT 

o.*

FROM 

(

SELECT o.*

FROM 

met_vaer_wisline.nora_bc25_observation o

WHERE 

EXISTS (SELECT 1 FROM  (SELECT distinct epoch FROM  met_vaer_wisline.new_data) AS n WHERE n.epoch = o.epoch )

AND 

EXISTS (SELECT 1 FROM  (SELECT distinct id_point FROM  met_vaer_wisline.new_data) AS n WHERE n.id_point = o.point_uid_ref )

) AS o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch


The join columns are indexed and I ran VACUUM ANALYZE on both tables before testing. work_mem is 200MB; I also tested with much more work_mem, but that does not change the execution time.
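
(For reference, the work_mem tests were session-level overrides along these lines; the 2GB value below is illustrative, not the exact figure tried.)

-- Session-local override; does not touch postgresql.conf.
SET work_mem = '2GB';
EXPLAIN ANALYZE
SELECT o.*
FROM met_vaer_wisline.nora_bc25_observation o,
     met_vaer_wisline.new_data n
WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch;
RESET work_mem;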

The CPU goes to 100% while the query is running, and there is no iowait.


Why is the second SQL 15 times faster?


Is this normal, or have I done something wrong here?


I have tested clustering on an index, but that did not help.


Is the only way to fix slow updates and joins to use partitioning, as described here?

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html
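
If it comes to that, I assume it would look roughly like the inheritance-based scheme in those docs. A minimal sketch, with invented child-table names and epoch ranges:

-- One child table per epoch range (here one calendar year, as epoch seconds);
-- names and ranges are hypothetical.
CREATE TABLE met_vaer_wisline.nora_bc25_observation_y2010 (
    CHECK (epoch >= 1262304000 AND epoch < 1293840000)  -- year 2010
) INHERITS (met_vaer_wisline.nora_bc25_observation);

CREATE INDEX ON met_vaer_wisline.nora_bc25_observation_y2010 (epoch);

-- Route inserts to the matching child table:
CREATE OR REPLACE FUNCTION met_vaer_wisline.observation_insert_trigger()
RETURNS trigger AS $$
BEGIN
    IF NEW.epoch >= 1262304000 AND NEW.epoch < 1293840000 THEN
        INSERT INTO met_vaer_wisline.nora_bc25_observation_y2010 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'epoch % outside known partitions', NEW.epoch;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER observation_insert
    BEFORE INSERT ON met_vaer_wisline.nora_bc25_observation
    FOR EACH ROW EXECUTE PROCEDURE met_vaer_wisline.observation_insert_trigger();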



Here are the SQL statements and more info:


EXPLAIN  analyze

SELECT o.*

FROM 

met_vaer_wisline.nora_bc25_observation o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch



                                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.87..34374722.51 rows=52579 width=16) (actual time=0.127..397379.844 rows=50000 loops=1)
   Merge Cond: (n.id_point = o.point_uid_ref)
   Join Filter: (o.epoch = n.epoch)
   Rows Removed by Join Filter: 2179150000
   ->  Index Scan using idx_met_vaer_wisline_new_data_id_point on new_data n  (cost=0.29..23802.89 rows=50000 width=8) (actual time=0.024..16.736 rows=50000 loops=1)
   ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_point_uid_ref on nora_bc25_observation o  (cost=0.58..2927642364.25 rows=4263866624 width=16) (actual time=0.016..210486.136 rows=2179200001 loops=1)
 Total runtime: 397383.663 ms


Time: 397391.388 ms



EXPLAIN  analyze

SELECT 

o.*

FROM 

(

SELECT o.*

FROM 

met_vaer_wisline.nora_bc25_observation o

WHERE 

EXISTS (SELECT 1 FROM  (SELECT distinct epoch FROM  met_vaer_wisline.new_data) AS n WHERE n.epoch = o.epoch )

AND 

EXISTS (SELECT 1 FROM  (SELECT distinct id_point FROM  met_vaer_wisline.new_data) AS n WHERE n.id_point = o.point_uid_ref )

) AS o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch


                                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1019.70..1039762.81 rows=54862 width=16) (actual time=359.284..25717.838 rows=50096 loops=1)
   Hash Cond: (o.point_uid_ref = new_data_1.id_point)
   ->  Nested Loop  (cost=0.87..972602.28 rows=24964326 width=16) (actual time=0.287..24412.088 rows=24262088 loops=1)
         ->  Unique  (cost=0.29..1014.29 rows=248 width=4) (actual time=0.117..6.849 rows=248 loops=1)
               ->  Index Only Scan using idx_met_vaer_wisline_new_data_epoch on new_data  (cost=0.29..889.29 rows=50000 width=4) (actual time=0.115..4.521 rows=50000 loops=1)
                     Heap Fetches: 0
         ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_epoch on nora_bc25_observation o  (cost=0.58..2911.05 rows=100663 width=16) (actual time=0.014..89.512 rows=97831 loops=248)
               Index Cond: (epoch = new_data.epoch)
   ->  Hash  (cost=1016.31..1016.31 rows=202 width=4) (actual time=16.636..16.636 rows=202 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 8kB
         ->  Unique  (cost=0.29..1014.29 rows=202 width=4) (actual time=0.046..16.544 rows=202 loops=1)
               ->  Index Only Scan using idx_met_vaer_wisline_new_data_id_point on new_data new_data_1  (cost=0.29..889.29 rows=50000 width=4) (actual time=0.046..11.315 rows=50000 loops=1)
                     Heap Fetches: 0
 Total runtime: 25719.120 ms


Time: 25727.097 ms



select version();

                                                   version                                                    

--------------------------------------------------------------------------------------------------------------

 PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

(1 row)


 \d met_vaer_wisline.nora_bc25_observation;

Table "met_vaer_wisline.nora_bc25_observation"

       Column       |  Type   | Modifiers 

--------------------+---------+-----------

 point_uid_ref      | integer | not null

 epoch              | integer | not null

 windspeed_10m      | real    | 

 air_temperature_2m | real    | 

Indexes:

    "idx_met_vaer_wisline_nora_bc25_observation_epoch" btree (epoch)

    "idx_met_vaer_wisline_nora_bc25_observation_point_uid_ref" btree (point_uid_ref)

    

\d met_vaer_wisline.new_data ;

     Unlogged table "met_vaer_wisline.new_data"

       Column       |       Type        | Modifiers 

--------------------+-------------------+-----------

 windspeed_10m      | real              | 

 air_temperature_2m | real              | 

 lon                | character varying | not null

 lat                | character varying | not null

 epoch              | integer           | 

 epoch_as_numeric   | numeric           | not null

 rest               | character varying

 id_point           | integer           | 

Indexes:

    "idx_met_vaer_wisline_new_data_epoch" btree (epoch)

    "idx_met_vaer_wisline_new_data_id_point" btree (id_point)


vacuum analyze met_vaer_wisline.nora_bc25_observation;


vacuum analyze met_vaer_wisline.new_data;


SELECT count(*) from met_vaer_wisline.new_data;

 count 

-------

 50000

(1 row)


SELECT count(*) from met_vaer_wisline.nora_bc25_observation ;

   count    

------------

  4263866304


Thanks.


Lars



Re: Fast insert, but slow join and updates for table with 4 billion rows

From
Tom Lane
Date:
Lars Aksel Opsahl <Lars.Opsahl@nibio.no> writes:
> In this example I have two tables, one with 4 billion rows and another with 50000 rows. A standard simple join
> between these two tables takes 397391 ms with this SQL (the query plan is added further down):

This particular query would work a lot better if you had an index on
nora_bc25_observation (point_uid_ref, epoch), ie both join columns
in one index.  I get the impression that that ought to be the primary
key of the table, which would be an even stronger reason to have a
unique index on it.

            regards, tom lane
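
In DDL terms, that suggestion comes out as something like the following (the plans later in the thread show an index named idx_met_vaer_wisline_nora_bc25_observation_test was actually created; the names below are invented for illustration):

-- As a primary key, if (point_uid_ref, epoch) uniquely identifies a row:
ALTER TABLE met_vaer_wisline.nora_bc25_observation
    ADD PRIMARY KEY (point_uid_ref, epoch);

-- Or as a plain unique index, which can be built without blocking writes:
CREATE UNIQUE INDEX CONCURRENTLY idx_nora_bc25_observation_point_epoch
    ON met_vaer_wisline.nora_bc25_observation (point_uid_ref, epoch);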


Re: Fast insert, but slow join and updates for table with 4 billion rows

From
Lars Aksel Opsahl
Date:
Hi

Yes, this makes both the update and both selects much faster. We are now down to 3000 ms for the select, but then I get a
problem with another SQL where I only use epoch in the query.

SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch = 1288440000;
 count
-------
 97831
(1 row)
Time: 92763.389 ms

To make the SQL above fast it seems we also need a single-column index on epoch; this means two indexes covering
the same column, and that eats memory when we have more than 4 billion rows.

Is there any way to avoid two indexes on the epoch column?

Thanks.

Lars

EXPLAIN  analyze SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch = 1288440000;
                                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=44016888.13..44016888.14 rows=1 width=42) (actual time=91307.470..91307.471 rows=1 loops=1)
   ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o  (cost=0.58..44016649.38 rows=95500 width=42) (actual time=1.942..91287.495 rows=97831 loops=1)
         Index Cond: (epoch = 1288440000)
 Total runtime: 91307.534 ms


EXPLAIN  analyze
SELECT count(o.*)
FROM
met_vaer_wisline.nora_bc25_observation o,
met_vaer_wisline.new_data n
WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch;
                                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=131857.71..131857.72 rows=1 width=42) (actual time=182.459..182.459 rows=1 loops=1)
   ->  Nested Loop  (cost=0.58..131727.00 rows=52283 width=42) (actual time=0.114..177.420 rows=50000 loops=1)
         ->  Seq Scan on new_data n  (cost=0.00..1136.00 rows=50000 width=8) (actual time=0.050..7.873 rows=50000 loops=1)
         ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o  (cost=0.58..2.60 rows=1 width=50) (actual time=0.003..0.003 rows=1 loops=50000)
               Index Cond: ((point_uid_ref = n.id_point) AND (epoch = n.epoch))
 Total runtime: 182.536 ms

Time: 3095.618 ms


Lars



Re: Fast insert, but slow join and updates for table with 4 billion rows

From
Scott Marlowe
Date:
On Mon, Oct 24, 2016 at 2:07 PM, Lars Aksel Opsahl <Lars.Opsahl@nibio.no> wrote:
> Hi
>
> Yes, this makes both the update and both selects much faster. We are now down to 3000 ms for the select, but then I
> get a problem with another SQL where I only use epoch in the query.
>
> SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch = 1288440000;
>  count
> -------
>  97831
> (1 row)
> Time: 92763.389 ms
>
> To make the SQL above fast it seems we also need a single-column index on epoch; this means two indexes covering
> the same column, and that eats memory when we have more than 4 billion rows.
>
> Is it any way to avoid to two indexes on the epoch column ?

You could try reversing the column order. Whatever comes first in a
two-column index can be used by Postgres much like a single-column
index; the trailing column cannot. If that doesn't cover your queries,
you're probably stuck with two indexes.
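
In DDL terms the choice is between these two orderings (index names invented for illustration); only the leading column behaves like a single-column index:

-- Serves WHERE epoch = ... lookups as well as the two-column join:
CREATE INDEX idx_observation_epoch_point
    ON met_vaer_wisline.nora_bc25_observation (epoch, point_uid_ref);

-- Serves WHERE point_uid_ref = ... lookups, but not epoch-only queries:
CREATE INDEX idx_observation_point_epoch
    ON met_vaer_wisline.nora_bc25_observation (point_uid_ref, epoch);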


Re: Fast insert, but slow join and updates for table with 4 billion rows

From
Lars Aksel Opsahl
Date:
Hi

Yes, that helps. I have now tested a query on the first column of the index.

This basically means that only the first column of a multi-column index can be used in a single-column query.

EXPLAIN analyze SELECT count(o.*) FROM met_vaer_wisline.nora_bc25_observation o WHERE o.point_uid_ref = 15;
                                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=45540.97..45540.98 rows=1 width=42) (actual time=24.715..24.715 rows=1 loops=1)
   ->  Bitmap Heap Scan on nora_bc25_observation o  (cost=477.66..45427.40 rows=45430 width=42) (actual time=6.436..19.006 rows=43832 loops=1)
         Recheck Cond: (point_uid_ref = 15)
         ->  Bitmap Index Scan on idx_met_vaer_wisline_nora_bc25_observation_test  (cost=0.00..466.30 rows=45430 width=0) (actual time=6.320..6.320 rows=43832 loops=1)
               Index Cond: (point_uid_ref = 15)
 Total runtime: 24.767 ms
(6 rows)


Thanks

Lars


Re: Fast insert, but slow join and updates for table with 4 billion rows

From
Lars Aksel Opsahl
Date:
Hi

I have now tested both insert and update, and it works extremely well without any partitioning of the big
table (4.288.128.392 rows) in this case, using the common index Tom suggested.

We are able to insert 172.000 rows per second.

That rate is computed from the total time between starting to read the CSV files and finishing the last file. We use GNU
parallel and run 5 threads. The number of inserts is actually 172.000 * 2, because we first copy the rows into a temp
table, where we prepare the data, and then insert them into the common big main table. There are no errors in the log.
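
A rough sketch of that staging step for one file (the path, the staging-table shape, and the prepare step are placeholders):

-- Hypothetical per-file load:
CREATE TEMP TABLE staging (LIKE met_vaer_wisline.nora_bc25_observation);
COPY staging FROM '/path/to/one_input_file.csv' WITH (FORMAT csv);
-- ... prepare the data in staging here ...
INSERT INTO met_vaer_wisline.nora_bc25_observation
    SELECT point_uid_ref, epoch, windspeed_10m, air_temperature_2m
    FROM staging;
DROP TABLE staging;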

We are able to update 98.000 rows per second.

Since each update also means one insert, we are close to 200.000 inserts and updates per second. For the update we give a
column that was null a value; that is done for all 4.3 billion rows. We run 5 threads in parallel here also, and there
are no errors and no deadlocks.

The problem of duplicated indexes is solvable in this project because we first load data and then do the analyses; this
means we can keep different indexes during loading than during analysis, e.g. along the lines sketched below.
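
(A sketch only; which indexes belong to which phase is a guess, and the name is invented.)

-- After a bulk load, build the extra index the analyses need:
CREATE INDEX CONCURRENTLY idx_observation_epoch_only
    ON met_vaer_wisline.nora_bc25_observation (epoch);
-- ... run analyses ...
-- Drop it again before the next big load to keep insert rates high:
DROP INDEX CONCURRENTLY met_vaer_wisline.idx_observation_epoch_only;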

In this project we are going to add about 25 billion geolocated observations, which will be used for analyses. I
suppose that at some level we will have to do partitioning, but so far Postgres has worked extremely well, even
though it is based on MVCC.

The Postgres/PostGIS software and communities are for sure really fun to work with, and the Postgres/PostGIS open
source software holds a very high quality.

Thanks.

Lars
