Обсуждение: pgsql: Consider index-only scans even when there is no matching qual or

Поиск
Список
Период
Сортировка

pgsql: Consider index-only scans even when there is no matching qual or

От
Tom Lane
Дата:
Consider index-only scans even when there is no matching qual or ORDER BY.

By popular demand.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/600d3206d1b3f8b540397b79905486a536ac7f78

Modified Files
--------------
src/backend/optimizer/path/indxpath.c |   25 ++++++++++++-------------
1 files changed, 12 insertions(+), 13 deletions(-)


Re: pgsql: Consider index-only scans even when there is no matching qual or

От
Bruce Momjian
Дата:
Tom Lane wrote:
> Consider index-only scans even when there is no matching qual or ORDER BY.
>
> By popular demand.

Is this the COUNT(*) optimization?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Re: pgsql: Consider index-only scans even when there is no matching qual or

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Consider index-only scans even when there is no matching qual or ORDER BY.
>>
>> By popular demand.

> Is this the COUNT(*) optimization?

Yeah, among other cases.

            regards, tom lane

Re: pgsql: Consider index-only scans even when there is no matching qual or

От
Thom Brown
Дата:
On 11 October 2011 20:11, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> Tom Lane wrote:
>>> Consider index-only scans even when there is no matching qual or ORDER BY.
>>>
>>> By popular demand.
>
>> Is this the COUNT(*) optimization?
>
> Yeah, among other cases.

This is unexpected:

test=# explain analyse select count(*) from stuff;
                                                        QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=309724.57..309724.58 rows=1 width=0) (actual
time=5622.932..5622.932 rows=1 loops=1)
   ->  Seq Scan on stuff  (cost=0.00..263974.46 rows=18300046 width=0)
(actual time=0.052..3960.289 rows=18300000 loops=1)
 Total runtime: 5623.076 ms
(3 rows)

-- postgres restarted here

test=# set random_page_cost = 1.0;
SET
test=# set seq_page_cost = 5.0;
SET
test=# explain analyse select count(*) from stuff;

 QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=437191.32..437191.33 rows=1 width=0) (actual
time=171652.106..171652.106 rows=1 loops=1)
   ->  Index Only Scan using idx_stuff_thing on stuff
(cost=0.00..393933.31 rows=17303202 width=0) (actual
time=0.248..169062.893 rows=18300000 loops=1)
 Total runtime: 171652.179 ms
(3 rows)

So an index-only scan is 30 times slower in this particular test case.

If you're curious, it was set up as so:

test=# create table stuff (id serial, thing int);
NOTICE:  CREATE TABLE will create implicit sequence "stuff_id_seq" for
serial column "stuff.id"
CREATE TABLE
test=# insert into stuff (thing) select ceil(random()*50) from
generate_series(1,900000);
INSERT 0 900000
test=# insert into stuff (thing) select ceil(random()*350) from
generate_series(1,1200000);
INSERT 0 1200000
test=# insert into stuff (thing) select ceil(random()*50) from
generate_series(1,2200000);
INSERT 0 2200000
test=# create index idx_stuff_thing on stuff (thing);
CREATE INDEX
test=# vacuum analyse;
VACUUM
test=# insert into stuff (thing) select ceil(random()*50) from
generate_series(1,5000000);
INSERT 0 5000000
test=# insert into stuff (thing) select ceil(random()*70) from
generate_series(1,9000000);
INSERT 0 9000000
test=# vacuum analyse;
VACUUM

If I drop the index used here, and recreate it, I get:

test=# explain analyse select count(*) from stuff;

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=443955.17..443955.18 rows=1 width=0) (actual
time=4920.709..4920.709 rows=1 loops=1)
   ->  Index Only Scan using idx_stuff_thing on stuff
(cost=0.00..398205.06 rows=18300046 width=0) (actual
time=0.330..3353.140 rows=18300000 loops=1)
 Total runtime: 4920.846 ms
(3 rows)

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Consider index-only scans even when there is no matching qual or

От
Tom Lane
Дата:
Thom Brown <thom@linux.com> writes:
> So an index-only scan is 30 times slower in this particular test case.

Don't see why you'd find that unexpected.  If you have to visit all the
rows, a seqscan is usually going to be the best way.  An indexscan only
has a chance of winning when the index is much smaller than the table,
which isn't the case in your example, even if you hadn't seen to it that
the index wasn't particularly nicely physically ordered.

            regards, tom lane

Re: pgsql: Consider index-only scans even when there is no matching qual or

От
Thom Brown
Дата:
On 11 October 2011 21:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> So an index-only scan is 30 times slower in this particular test case.
>
> Don't see why you'd find that unexpected.  If you have to visit all the
> rows, a seqscan is usually going to be the best way.  An indexscan only
> has a chance of winning when the index is much smaller than the table,
> which isn't the case in your example, even if you hadn't seen to it that
> the index wasn't particularly nicely physically ordered.

Ah okay, understood.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company