Re: [PoC] pgstattuple2: block sampling to reduce physical read

Поиск
Список
Период
Сортировка
От Mark Kirkwood
Тема Re: [PoC] pgstattuple2: block sampling to reduce physical read
Дата
Msg-id 5257259E.2070103@catalyst.net.nz
обсуждение исходный текст
Ответ на Re: [PoC] pgstattuple2: block sampling to reduce physical read  (Satoshi Nagayasu <snaga@uptime.jp>)
Ответы Re: [PoC] pgstattuple2: block sampling to reduce physical read
Список pgsql-hackers
On 16/09/13 16:20, Satoshi Nagayasu wrote:
(2013/09/15 11:07), Peter Eisentraut wrote:
On Sat, 2013-09-14 at 16:18 +0900, Satoshi Nagayasu wrote:
I'm looking forward to seeing more feedback on this approach,
in terms of design and performance improvement.
So, I have submitted this for the next CF.

Your patch fails to build:

pgstattuple.c: In function ‘pgstat_heap_sample’:
pgstattuple.c:737:13: error: ‘SnapshotNow’ undeclared (first use in this function)
pgstattuple.c:737:13: note: each undeclared identifier is reported only once for each function it appears in

Thanks for checking. Fixed to eliminate SnapshotNow.


This seems like a cool idea! I took a quick look, and initally replicated the sort of improvement you saw:


bench=# explain analyze select * from pgstattuple('pgbench_accounts');
                                                  QUERY PLAN                   
                              
--------------------------------------------------------------------------------
 Function Scan on pgstattuple  (cost=0.00..0.01 rows=1 width=72) (actual time=786.368..786.369 rows=1 loops=1)
 Total runtime: 786.384 ms
(2 rows)

bench=# explain analyze select * from pgstattuple2('pgbench_accounts');
NOTICE:  pgstattuple2: SE tuple_count 0.00, tuple_len 0.00, dead_tuple_count 0.00, dead_tuple_len 0.00, free_space 0.00
                                                  QUERY PLAN                   
                             
--------------------------------------------------------------------------------
 Function Scan on pgstattuple2  (cost=0.00..0.01 rows=1 width=72) (actual time=12.004..12.005 rows=1 loops=1)
 Total runtime: 12.019 ms
(2 rows)



I wondered what sort of difference eliminating caching would make:

$ sudo sysctl -w vm.drop_caches=3

Repeating the above queries:


bench=# explain analyze select * from pgstattuple('pgbench_accounts');
                                                   QUERY PLAN                  
                                
--------------------------------------------------------------------------------
 Function Scan on pgstattuple  (cost=0.00..0.01 rows=1 width=72) (actual time=9503.774..9503.776 rows=1 loops=1)
 Total runtime: 9504.523 ms
(2 rows)

bench=# explain analyze select * from pgstattuple2('pgbench_accounts');
NOTICE:  pgstattuple2: SE tuple_count 0.00, tuple_len 0.00, dead_tuple_count 0.00, dead_tuple_len 0.00, free_space 0.00
                                                     QUERY PLAN                
                                   
--------------------------------------------------------------------------------
 Function Scan on pgstattuple2  (cost=0.00..0.01 rows=1 width=72) (actual time=12330.630..12330.631 rows=1 loops=1)
 Total runtime: 12331.353 ms
(2 rows)



So the sampling code seems *slower* when the cache is completely cold - is that expected? (I have not looked at how the code works yet - I'll dive in later if I get a chance)!

Regards

Mark

В списке pgsql-hackers по дате отправления: