Re: Fast insert, but slow join and updates for table with 4 billion rows

Поиск
Список
Период
Сортировка
От Lars Aksel Opsahl
Тема Re: Fast insert, but slow join and updates for table with 4 billion rows
Дата
Msg-id f2909d9b68694c4bb8cae754a3fce5f3@nibio.no
обсуждение исходный текст
Ответ на Re: Fast insert, but slow join and updates for table with 4 billion rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Fast insert, but slow join and updates for table with 4 billion rows  (Scott Marlowe <scott.marlowe@gmail.com>)
Список pgsql-performance
Hi

Yes this makes both the update and both selects much faster. We are now down to 3000 ms. for select, but then I get a
problemwith another SQL where I only use epoch in the query.  

SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch = 1288440000;
 count
-------
 97831
(1 row)
Time: 92763.389 ms

To get the SQL above work fast it seems like we also need a single index on the epoch column, this means two indexes on
thesame column and that eats memory when we have more than 4 billion rows. 

Is it any way to avoid to two indexes on the epoch column ?

Thanks.

Lars

EXPLAIN  analyze SELECT count(o.*) FROM  met_vaer_wisline.nora_bc25_observation o WHERE o.epoch = 1288440000;
-[ RECORD 1
]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Aggregate  (cost=44016888.13..44016888.14 rows=1 width=42) (actual time=91307.470..91307.471 rows=1
loops=1)
-[ RECORD 2
]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o
(cost=0.58..44016649.38rows=95500 width=42) (actual time=1.942..91287.495 rows=97831 loops=1) 
-[ RECORD 3
]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |         Index Cond: (epoch = 1288440000)
-[ RECORD 4
]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Total runtime: 91307.534 ms


EXPLAIN  analyze
SELECT count(o.*)
FROM
met_vaer_wisline.nora_bc25_observation o,
met_vaer_wisline.new_data n
WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch;
-[ RECORD 1
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Aggregate  (cost=131857.71..131857.72 rows=1 width=42) (actual time=182.459..182.459 rows=1 loops=1)
-[ RECORD 2
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |   ->  Nested Loop  (cost=0.58..131727.00 rows=52283 width=42) (actual time=0.114..177.420 rows=50000
loops=1)
-[ RECORD 3
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |         ->  Seq Scan on new_data n  (cost=0.00..1136.00 rows=50000 width=8) (actual time=0.050..7.873
rows=50000loops=1) 
-[ RECORD 4
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |         ->  Index Scan using idx_met_vaer_wisline_nora_bc25_observation_test on nora_bc25_observation o
(cost=0.58..2.60rows=1 width=50) (actual time=0.003..0.003 rows=1 loops=50000) 
-[ RECORD 5
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN |               Index Cond: ((point_uid_ref = n.id_point) AND (epoch = n.epoch))
-[ RECORD 6
]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Total runtime: 182.536 ms

Time: 3095.618 ms


Lars


________________________________________
Fra: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org> på vegne av Tom Lane
<tgl@sss.pgh.pa.us>
Sendt: 24. oktober 2016 14:52
Til: Lars Aksel Opsahl
Kopi: pgsql-performance@postgresql.org
Emne: Re: [PERFORM] Fast insert, but slow join and updates for table with 4 billion rows

Lars Aksel Opsahl <Lars.Opsahl@nibio.no> writes:
> In this example I have two tables one with 4 billion rows and another with 50000 rows and then I try to do a standard
simplejoin between this two tables and this takes 397391  ms. with this SQL (the query plan is added is further down) 

This particular query would work a lot better if you had an index on
nora_bc25_observation (point_uid_ref, epoch), ie both join columns
in one index.  I get the impression that that ought to be the primary
key of the table, which would be an even stronger reason to have a
unique index on it.

                        regards, tom lane


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Fast insert, but slow join and updates for table with 4 billion rows
Следующее
От: Scott Marlowe
Дата:
Сообщение: Re: Fast insert, but slow join and updates for table with 4 billion rows