Thread: Data caching

Data caching

From:
Martin Chlupac
Date:
Hello everybody,
I have a simple query which selects data from a not-very-large table
(434161 rows) yet takes far more time than I'd expect. I believe this is
due to poor disk performance, because when I execute the very same
query a second time I get much better results (caching kicks in?).
Can you confirm my theory, or do you see any other possible
explanation?

Thank you in advance

Martin


# explain analyze select * from
"records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac" where variable_id=7553
and ts > '2009-07-01 17:00:00' and ts < now() order by ts limit 20000;

                                                                                  QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3924.13..3928.91 rows=1912 width=206) (actual time=3687.661..3705.546 rows=2161 loops=1)
   ->  Sort  (cost=3924.13..3928.91 rows=1912 width=206) (actual time=3687.654..3693.864 rows=2161 loops=1)
         Sort Key: ts
         Sort Method:  quicksort  Memory: 400kB
         ->  Bitmap Heap Scan on "records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac"  (cost=76.75..3819.91 rows=1912 width=206) (actual time=329.416..3677.521 rows=2161 loops=1)
               Recheck Cond: ((variable_id = 7553) AND (ts > '2009-07-01 17:00:00'::timestamp without time zone) AND (ts < now()))
               ->  Bitmap Index Scan on pokusny_index  (cost=0.00..76.27 rows=1912 width=0) (actual time=304.160..304.160 rows=2687 loops=1)
                     Index Cond: ((variable_id = 7553) AND (ts > '2009-07-01 17:00:00'::timestamp without time zone) AND (ts < now()))
 Total runtime: 3711.488 ms
(9 rows)

# explain analyze select * from
"records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac" where variable_id=7553
and ts > '2009-07-01 17:00:00' and ts < now() order by ts limit 20000;

                                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3924.13..3928.91 rows=1912 width=206) (actual time=18.135..35.140 rows=2161 loops=1)
   ->  Sort  (cost=3924.13..3928.91 rows=1912 width=206) (actual time=18.127..24.064 rows=2161 loops=1)
         Sort Key: ts
         Sort Method:  quicksort  Memory: 400kB
         ->  Bitmap Heap Scan on "records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac"  (cost=76.75..3819.91 rows=1912 width=206) (actual time=1.616..10.369 rows=2161 loops=1)
               Recheck Cond: ((variable_id = 7553) AND (ts > '2009-07-01 17:00:00'::timestamp without time zone) AND (ts < now()))
               ->  Bitmap Index Scan on pokusny_index  (cost=0.00..76.27 rows=1912 width=0) (actual time=1.352..1.352 rows=2687 loops=1)
                     Index Cond: ((variable_id = 7553) AND (ts > '2009-07-01 17:00:00'::timestamp without time zone) AND (ts < now()))
 Total runtime: 40.971 ms
(9 rows)

Re: Data caching

From:
Richard Huxton
Date:
Martin Chlupac wrote:
> Hello everybody,
> I have a simple query which selects data from a not-very-large table
> (434161 rows) yet takes far more time than I'd expect. I believe this is
> due to poor disk performance, because when I execute the very same
> query a second time I get much better results (caching kicks in?).
> Can you confirm my theory, or do you see any other possible
> explanation?

Yep - it's the difference between fetching from memory and from disk.

>          ->  Bitmap Heap Scan on "records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac"  (cost=76.75..3819.91 rows=1912 width=206) (actual time=329.416..3677.521 rows=2161 loops=1)

>          ->  Bitmap Heap Scan on "records_f4f23ca0-9c35-43ac-bb0d-1ef3784399ac"  (cost=76.75..3819.91 rows=1912 width=206) (actual time=1.616..10.369 rows=2161 loops=1)

The plan scans the index and builds up a bitmap of which disk blocks
contain (potential) matches. It then has to read those blocks (the heap
scan above), confirm that the rows match, and return them. If you look
at the "actual time" figures above, you can see that about 90% of the
slow query is spent doing this.
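The ~90% figure can be checked directly from the timings in the two plans. A rough sketch (the numbers below are copied from the EXPLAIN ANALYZE output earlier in the thread):

```python
# Timings in ms, taken from the first (cold-cache) EXPLAIN ANALYZE run.
total_runtime = 3711.488    # "Total runtime"
heap_scan_start = 329.416   # heap scan startup (bitmap index scan has finished)
heap_scan_end = 3677.521    # heap scan completion

# Time spent fetching and rechecking heap blocks, i.e. reading from disk.
heap_time = heap_scan_end - heap_scan_start
print(f"heap fetch: {heap_time:.1f} ms "
      f"({heap_time / total_runtime:.0%} of the total)")

# The second run reads the same blocks from the OS/shared-buffer cache.
cached_total = 40.971       # "Total runtime" of the warm-cache run
print(f"cold/warm ratio: {total_runtime / cached_total:.0f}x")
```

So roughly 90% of the cold run is spent on the heap fetches, and the warm run is about 90x faster, which is consistent with the disk-vs-memory explanation.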

--
   Richard Huxton
   Archonet Ltd