Re: COUNT(*) and index-only scans

Поиск
Список
Период
Сортировка
От Jeff Janes
Тема Re: COUNT(*) and index-only scans
Дата
Msg-id CAMkU=1xeG9M4cYgoMTGrY1GCF==JKn4mm+D-oTT4u-n=MDKdcA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: COUNT(*) and index-only scans  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список pgsql-hackers
On Mon, Oct 10, 2011 at 9:48 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> Jeff Janes  wrote:
>> Kevin Grittner  wrote:
>
>>> create table t (id int not null primary key);
>>> insert into t select generate_series(1, 1000000);
>>> vacuum freeze analyze;
>>> explain analyze select count(*) from t
>>> where id between 500000 and 500010;
>>>
>>> That gives you an index-only scan; but without the WHERE clause it
>>> uses a seq scan.
>>
>> If you convert the where clause to "where id is not null" it uses
>> the index only scan again, but only if you nudge it too with
>> enable_seqscan=off.

With a recent commit from (I assume) Tom, the "where id is not null"
is no longer needed.

> Clever way to get a full-table test.
>
> It turns out that for the above, with your trick to use the index
> only scan, it comes out 12% faster to do a seqscan, even when the
> table and index are fully cached (based on the average time of ten
> runs each way).  There's very little overlap, so the difference looks
> real.  But that's on a very narrow record, having just the one column
> used in the index.  I added one wide column like this:
>
> alter table t add column x text;
> update t set x = (repeat(random()::text, (random() * 100)::int));
> cluster t USING t_pkey;
> vacuum freeze analyze;
>
> With that change the index-only scan time remained unchanged, while
> the seqscan time grew to about 2.6 times the index only scan time.
> That was mildly surprising for me, considering it was all still
> cached.

I used the pgbench_accounts table from pgbench -i -s 50, where all
data fits in shared_buffers, using the -f switch with either

set enable_seqscan=off;
select count(*) from pgbench_accounts;

or

set enable_indexonlyscan=off;
select count(*) from pgbench_accounts;


With just a single client, it was a toss-up.  But with 8 concurrent
clients on a 8 CPU machine, the index-only scan was 50% faster.  So
that is a nice win, even if well-designed apps probably shouldn't be
endlessly counting rows of an unchanging table using all available
CPUs in the first place.

Cheers,

Jeff


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: Pushing ScalarArrayOpExpr support into the btree index AM
Следующее
От: Jan Urbański
Дата:
Сообщение: plpython SPI cursors