So, is COUNT(*) fast now?
From | Robert Haas |
---|---|
Subject | So, is COUNT(*) fast now? |
Date | |
Msg-id | CA+Tgmoa+E4i5+um8RXFyciDYk7U+7nz+h1p8hzHx6cUf+_i3rw@mail.gmail.com |
Replies | Re: So, is COUNT(*) fast now? Re: So, is COUNT(*) fast now? |
List | pgsql-hackers |
Laments at:
http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F
http://wiki.postgresql.org/wiki/Slow_Counting

I tried this on my MacBook Pro this morning, using pgbench -i -s 500 to create a database about 7.5GB in size, and then using "SELECT sum(1) FROM pgbench_accounts" as a test query, on a build WITHOUT --enable-cassert. This machine has 4GB of memory, and I set shared_buffers = 400MB. (No, I'm not sure whether that's the optimal setting for shared_buffers on this machine.)

With enable_indexonlyscan = false, times for this query were: 96799.747 ms, 89108.987 ms, 114433.664 ms.
With enable_indexonlyscan = true, times were: 16779.325 ms, 16537.726 ms, 16703.229 ms.

That's about six times faster. It's worth noting that the pgbench_accounts table has relatively narrow rows. On a table with wider rows (but not so wide that they get toasted and become narrow again), the benefit might be greater. On the other hand, this is also a table that's just been vacuumed, and you've got the happy case where the table (6404 MB) does not fit in memory but the index (1071 MB) does.

What happens on a smaller test case? Here are the results at scale factor = 100:

enable_indexonlyscan = false times: 1774.945 ms, 1784.985 ms, 1836.099 ms
enable_indexonlyscan = true times: 1450.445 ms, 1452.407 ms, 1452.426 ms

That's about a 23% speedup. At this scale factor, everything fits into memory, but only the index (214 MB) fits into shared_buffers, while the table (1281 MB) does not. Let's crank the scale factor down some more. Here are the results at scale factor = 20:

enable_indexonlyscan = false times: 352.213 ms, 353.988 ms, 350.859 ms
enable_indexonlyscan = true times: 300.623 ms, 301.355 ms, 302.346 ms

Now the entire database fits into shared_buffers, but we're still seeing a 17% speedup. This turns out to be misleading, though: the ring-buffer logic is actually preventing shared_buffers from getting all of the heap blocks into cache quickly.
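As a quick sanity check on the speedups quoted above, the mean-time ratios can be recomputed from the timings reported in this message (the numbers are copied verbatim; the script itself is not part of the original post):

```python
# Query times in milliseconds, as reported in the message, keyed by
# pgbench scale factor.
timings = {
    500: {"seqscan":   [96799.747, 89108.987, 114433.664],
          "indexonly": [16779.325, 16537.726, 16703.229]},
    100: {"seqscan":   [1774.945, 1784.985, 1836.099],
          "indexonly": [1450.445, 1452.407, 1452.426]},
    20:  {"seqscan":   [352.213, 353.988, 350.859],
          "indexonly": [300.623, 301.355, 302.346]},
}

def mean(xs):
    return sum(xs) / len(xs)

for scale in sorted(timings):
    t = timings[scale]
    ratio = mean(t["seqscan"]) / mean(t["indexonly"])
    print(f"scale {scale}: index-only scan is {ratio:.2f}x faster")
```

This reproduces the figures in the text: roughly 6.0x at scale 500, a ~24% speedup at scale 100, and ~17% at scale 20.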
If I run the query over and over again until the whole table actually makes it into shared buffers, the sequential scan gets much faster:

enable_indexonlyscan = false times after lots and lots of prewarming: 215.487 ms, 219.006 ms, 218.490 ms

That's a bit disappointing - it's now more than a third faster to do the sequential scan, even though the sequential scan has to touch six times as many blocks (at scale factor 20, the index is 43 MB and the table is 256 MB), all of which are in cache. Of course, touching that many fewer blocks does have some advantages if there is concurrent activity on the system, but it still seems unfortunate that the ratio of runtime to blocks touched is more than 8x higher for the index-only case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
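The "more than 8x" figure at the end of the message can likewise be reproduced from the reported numbers, approximating blocks touched by relation size in MB (a rough sketch, not from the original post; it assumes a uniform per-block cost):

```python
# Scale factor 20, everything cached: compare runtime per MB touched
# for the index-only scan vs. the prewarmed sequential scan.
index_mb, table_mb = 43, 256                       # sizes from the message
indexonly_ms = (300.623 + 301.355 + 302.346) / 3   # index-only scan times
seqscan_ms = (215.487 + 219.006 + 218.490) / 3     # prewarmed seqscan times

ratio = (indexonly_ms / index_mb) / (seqscan_ms / table_mb)
print(f"runtime per MB touched is {ratio:.1f}x higher for the index-only scan")
```

The result is about 8.2x, matching the "more than 8x" claim.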