Discussion: Why is indexonlyscan so darned slow?


Why is indexonlyscan so darned slow?

From
Joshua Berkus
Date:
So, I set up a test which should have been an ideal setup for index-only scan.  The index was 1/10 the size of the table, and fit in RAM (1G), which the table does not:
 

bench2=# select pg_size_pretty(pg_relation_size('pgbench_accounts_pkey'));
 pg_size_pretty
----------------
 428 MB
(1 row)

bench2=# select pg_size_pretty(pg_relation_size('pgbench_accounts'));
 pg_size_pretty
----------------
 5768 MB
(1 row)

The table was just VACUUM ANALYZED and had no subsequent updates.  So, what's going on here?

bench2=# explain ( analyze on, buffers on ) select count(*) from pgbench_accounts;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=855069.99..855070.00 rows=1 width=0) (actual time=64014.573..64014.574 rows=1 loops=1)
   Buffers: shared hit=33 read=738289
   I/O Timings: read=27691.314
   ->  Seq Scan on pgbench_accounts  (cost=0.00..831720.39 rows=9339839 width=0) (actual time=6790.669..46530.408 rows=20000000 loops=1)
         Buffers: shared hit=33 read=738289
         I/O Timings: read=27691.314
 Total runtime: 64014.626 ms
(7 rows)

bench2=# explain ( analyze on, buffers on ) select count(*) from pgbench_accounts;
                                                                   QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=382829.37..382829.38 rows=1 width=0) (actual time=38325.026..38325.027 rows=1 loops=1)
   Buffers: shared hit=1 read=54653
   I/O Timings: read=907.202
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.00..359479.77 rows=9339839 width=0) (actual time=33.459..20110.908 rows=20000000 loops=1)
         Heap Fetches: 0
         Buffers: shared hit=1 read=54653
         I/O Timings: read=907.202
 Total runtime: 38333.536 ms

As you can see, the indexonlyscan version of the query spends 5% as much time reading the data as the seq scan version, and doesn't have to read the heap at all.  Yet it spends 20 seconds doing ... what, exactly?
 

BTW, kudos on the new explain analyze reporting ... works great!

--Josh

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco


Re: Why is indexonlyscan so darned slow?

From
Ants Aasma
Date:
On Thu, May 17, 2012 at 6:08 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> As you can see, the indexonlyscan version of the query spends 5% as much time reading the data as the seq scan version, and doesn't have to read the heap at all.  Yet it spends 20 seconds doing ... what, exactly?
>
> BTW, kudos on the new explain analyze reporting ... works great!

Looks like timing overhead. Timing is called twice per tuple which
gives around 950ns per timing call for your index only result. This is
around what is expected of hpet based timing. If you are on Linux you
can check what clocksource you are using by running cat
/sys/devices/system/clocksource/clocksource0/current_clocksource

You can verify that it is due to timing overhead by adding timing off
to the explain clause. Or use the pg_test_timing utility to check the
timing overhead on your system. With hpet based timing I'm seeing
660ns timing overhead and 26.5s execution for your query, with timing
off execution time falls to 2.1s. For reference, tsc based timing
gives 19.2ns overhead and 2.3s execution time with timing.
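
Concretely, the two checks look something like this (just a sketch; TIMING is the new EXPLAIN option in 9.2, and pg_test_timing ships in contrib, so exact paths and output will differ per install):

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource   # prints e.g. hpet or tsc
$ pg_test_timing -d 3                                                    # reports per-loop timing overhead

bench2=# explain (analyze on, buffers on, timing off)
         select count(*) from pgbench_accounts;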

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


Re: Why is indexonlyscan so darned slow?

From
Joshua Berkus
Date:
Ants,

Well, that's somewhat better, but again hardly the gain in performance I'd expect to see ... especially since this is ideal circumstances for index-only scan.

bench2=# select count(*) from pgbench_accounts;
  count
----------
 20000000
(1 row)

Time: 3827.508 ms

bench2=# set enable_indexonlyscan=off;
SET
Time: 0.241 ms
bench2=# select count(*) from pgbench_accounts;
  count
----------
 20000000
(1 row)

Time: 16012.444 ms

For some reason counting tuples in an index takes 5X as long (per tuple) as counting them in a table.  Why?

----- Original Message -----
> On Thu, May 17, 2012 at 6:08 AM, Joshua Berkus <josh@agliodbs.com>
> wrote:
> > As you can see, the indexonlyscan version of the query spends 5% as
> > much time reading the data as the seq scan version, and doesn't
> > have to read the heap at all.  Yet it spends 20 seconds doing ...
> > what, exactly?
> >
> > BTW, kudos on the new explain analyze reporting ... works great!
>
> Looks like timing overhead. Timing is called twice per tuple which
> gives around 950ns per timing call for your index only result. This
> is
> around what is expected of hpet based timing. If you are on Linux you
> can check what clocksource you are using by running cat
> /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> You can verify that it is due to timing overhead by adding timing off
> to the explain clause. Or use the pg_test_timing utility to check the
> timing overhead on your system. With hpet based timing I'm seeing
> 660ns timing overhead and 26.5s execution for your query, with timing
> off execution time falls to 2.1s. For reference, tsc based timing
> gives 19.2ns overhead and 2.3s execution time with timing.
>
> Ants Aasma
> --
> Cybertec Schönig & Schönig GmbH
> Gröhrmühlgasse 26
> A-2700 Wiener Neustadt
> Web: http://www.postgresql-support.de
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: Why is indexonlyscan so darned slow?

From
Jeff Janes
Date:
On Thu, May 17, 2012 at 5:22 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Ants,
>
> Well, that's somewhat better, but again hardly the gain in performance I'd expect to see ... especially since this is ideal circumstances for index-only scan.
>
> bench2=# select count(*) from pgbench_accounts;
>  count
> ----------
>  20000000
> (1 row)
>
> Time: 3827.508 ms
>
> bench2=# set enable_indexonlyscan=off;
> SET
> Time: 0.241 ms
> bench2=# select count(*) from pgbench_accounts;
>  count
> ----------
>  20000000
> (1 row)
>
> Time: 16012.444 ms
>
> For some reason counting tuples in an index takes 5X as long (per tuple) as counting them in a table.  Why?
>

It looks like the IOS is taking 4x less time, not more time.

Anyway, the IOS follows the index logical structure, not the physical
structure, so if the index is not in RAM it will really be hurt by the
lack of sequential reads.
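
If you want a feel for how shuffled the leaf pages of a given index are on disk, the pgstattuple contrib module can report it; a rough sketch (a freshly built index should show leaf_fragmentation near zero, while a long-lived one can be far from it):

CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT leaf_pages, avg_leaf_density, leaf_fragmentation
FROM pgstatindex('pgbench_accounts_pkey');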

Cheers,

Jeff


Re: Why is indexonlyscan so darned slow?

From
Joshua Berkus
Date:
Jeff,

That's in-RAM speed ... I ran the query twice to make sure the index was cached, and it didn't get any better.  And I meant 5X per byte rather than 5X per tuple.

I talked this over with Haas, and his opinion is that we have a LOT of overhead in the way we traverse indexes, especially lookups which happen once per leaf node instead of in bulk.  Certainly the performance I'm seeing would be consistent with that idea.

I'll try some multi-column covering indexes next to see how it looks.
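
For concreteness, something along these lines (the column choice is just illustrative), so that a query touching only aid and abalance never needs to visit the heap:

create index on pgbench_accounts (aid, abalance);
select sum(abalance) from pgbench_accounts where aid between 1 and 1000000;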

----- Original Message -----
> On Thu, May 17, 2012 at 5:22 AM, Joshua Berkus <josh@agliodbs.com>
> wrote:
> > Ants,
> >
> > Well, that's somewhat better, but again hardly the gain in
> > performance I'd expect to see ... especially since this is ideal
> > circumstances for index-only scan.
> >
> > bench2=# select count(*) from pgbench_accounts;
> >  count
> > ----------
> >  20000000
> > (1 row)
> >
> > Time: 3827.508 ms
> >
> > bench2=# set enable_indexonlyscan=off;
> > SET
> > Time: 0.241 ms
> > bench2=# select count(*) from pgbench_accounts;
> >  count
> > ----------
> >  20000000
> > (1 row)
> >
> > Time: 16012.444 ms
> >
> > For some reason counting tuples in an index takes 5X as long (per
> > tuple) as counting them in a table.  Why?
> >
>
> It looks like the IOS is taking 4x less time, not more time.
>
> Anyway, the IOS follows the index logical structure, not the physical
> structure, so if the index is not in RAM it will really be hurt by
> the
> lack of sequential reads.
>
> Cheers,
>
> Jeff
>


Re: Why is indexonlyscan so darned slow?

From
Jeff Janes
Date:
On Thu, May 17, 2012 at 11:35 AM, Joshua Berkus <josh@agliodbs.com> wrote:
> Jeff,
>
> That's in-RAM speed ... I ran the query twice to make sure the index was cached, and it didn't get any better.  And I meant 5X per byte rather than 5X per tuple.

Ah, OK that makes more sense.  I played around with this, specifically
count(*), quite a bit when IOS first came out, and I attributed a
large part of the time to the code that forms a tuple out of raw
bytes, and the code that advances the aggregate.  The first one is
probably more a per-tuple cost than per byte, and the second
definitely is per tuple cost.

I can't find my detailed notes from this work, so this is just from memory.

Cheers,

Jeff


Re: Why is indexonlyscan so darned slow?

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Thu, May 17, 2012 at 11:35 AM, Joshua Berkus <josh@agliodbs.com> wrote:
>> That's in-RAM speed ... I ran the query twice to make sure the index was cached, and it didn't get any better.  And I meant 5X per byte rather than 5X per tuple.

> Ah, OK that makes more sense.  I played around with this, specifically
> count(*), quite a bit when IOS first came out, and I attributed a
> large part of the time to the code that forms a tuple out of raw
> bytes, and the code that advances the aggregate.  The first one is
> probably more a per-tuple cost than per byte, and the second
> definitely is per tuple cost.
> I can't find my detailed notes from this work, so this is just from memory.

I did a quick investigation of this example with oprofile, and found
that there's not going to be any easy large improvement available.
It's basically all per-tuple CPU costs, breaking down like this:

samples  %        symbol name
15513    13.4664  IndexOnlyNext
10886     9.4498  index_getnext_tid
7526      6.5331  visibilitymap_test
7116      6.1772  ExecClearTuple
7054      6.1234  _bt_checkkeys
6804      5.9064  _bt_next
6344      5.5070  ExecProject
6033      5.2371  advance_aggregates
5619      4.8777  ExecScan
5331      4.6277  advance_transition_function
5202      4.5157  btgettuple
4928      4.2779  _bt_saveitem
4653      4.0391  ExecProcNode
4473      3.8829  int8inc
3404      2.9549  MemoryContextReset
3125      2.7127  _bt_readpage
2768      2.4028  FunctionCall2Coll
2278      1.9775  ExecAgg
1502      1.3038  ExecStoreVirtualTuple
1198      1.0399  BufferGetBlockNumber
1105      0.9592  ExecIndexOnlyScan
946       0.8212  hash_search_with_hash_value

A fair chunk of what's being attributed to IndexOnlyNext is actually the
inlined code for StoreIndexTuple, and that in turn is mostly the inlined
code for index_getattr.  We might possibly save a bit here if we noticed
that the query doesn't actually need us to fetch the indexed columns,
but it looks like that would only buy a couple percent overall --- and
testing for that would add useless cycles to every case where we *do*
need the indexed value.  So I'm afraid that it might amount to optimizing
SELECT COUNT(*) at the expense of everything else, which I'm not for.

Another possibility is to try to reduce the costs of index_getnext_tid
and FunctionCall2Coll, which are basically just trampolines to reach
btgettuple.  It's not immediately obvious how to make that much better
though.

Anyway, on my machine it seems that the per-tuple CPU costs for SELECT
COUNT(*) with an index-only scan are something like 10% higher than the
per-tuple costs with a heap scan.  We might get that down to roughly par
with some hacking, but it's never going to be vastly better.  The
argument in favor of index-only scans is mainly about reducing I/O costs
anyway.
        regards, tom lane


Re: Why is indexonlyscan so darned slow?

From
Robert Haas
Date:
On Sun, May 20, 2012 at 3:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Another possibility is to try to reduce the costs of index_getnext_tid
> and FunctionCall2Coll, which are basically just trampolines to reach
> btgettuple.  It's not immediately obvious how to make that much better
> though.

Hmm... this seems awfully similar to the problem we tried to solve
with the sortsupport infrastructure.  Maybe something similar would be
warranted here, to save the overhead of repeated argument packing and
unpacking.

Here's some 'perf' results from the IBM POWER7 box:
 10.01%  postgres  postgres          [.] visibilitymap_test
  8.78%  postgres  postgres          [.] IndexOnlyNext
  7.85%  postgres  postgres          [.] btgettuple
  5.67%  postgres  postgres          [.] ExecProject
  5.56%  postgres  postgres          [.] ExecProcNode
  5.51%  postgres  postgres          [.] advance_transition_function
  5.06%  postgres  postgres          [.] advance_aggregates
  5.02%  postgres  postgres          [.] ExecScan
  4.43%  postgres  postgres          [.] FunctionCall2Coll
  4.11%  postgres  postgres          [.] _bt_checkkeys
  3.54%  postgres  postgres          [.] ExecClearTuple
  3.42%  postgres  postgres          [.] int8inc
  3.25%  postgres  postgres          [.] _bt_next
  3.19%  postgres  postgres          [.] MemoryContextReset
  2.95%  postgres  postgres          [.] index_getnext_tid
  2.81%  postgres  postgres          [.] _bt_readpage
  2.43%  postgres  postgres          [.] _bt_saveitem
  2.42%  postgres  postgres          [.] ExecIndexOnlyScan
  2.32%  postgres  libc-2.14.90.so   [.] memcpy
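
For anyone who wants to reproduce that kind of flat profile, it amounts to roughly this (the PID placeholder is whichever backend is running the count(*); option details vary a little across perf versions):

perf record -p <backend_pid> sleep 30    # sample that backend for ~30 seconds
perf report                              # per-symbol summary like the one above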

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Why is indexonlyscan so darned slow?

From
Josh Berkus
Date:
> Anyway, on my machine it seems that the per-tuple CPU costs for SELECT
> COUNT(*) with an index-only scan are something like 10% higher than the
> per-tuple costs with a heap scan.  We might get that down to roughly par
> with some hacking, but it's never going to be vastly better.  The
> argument in favor of index-only scans is mainly about reducing I/O costs
> anyway.

Well, if it's not CPU costs, then something else is eating the time,
since I'm seeing per-tuple COUNT costs on indexes being 400% more than
on heap.

In the airport you said something about index-only scan not scanning the
tuples in leaf page order.   Can you elaborate on that?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Why is indexonlyscan so darned slow?

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Well, if it's not CPU costs, then something else is eating the time,
> since I'm seeing per-tuple COUNT costs on indexes being 400% more than
> on heap.

Well, I'm not: as I said, it looks like about 10% here.  Perhaps you're
testing a cassert-enabled build?
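
An easy way to check that, for what it's worth:

show debug_assertions;

If memory serves, that defaults to on in a build configured with --enable-cassert, and can't be turned on at all otherwise.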

> In the airport you said something about index-only scan not scanning the
> tuples in leaf page order.   Can you elaborate on that?

If the index is too big to fit in RAM, you'd be looking at random
fetches of the index pages in most cases (since logical ordering of the
index pages is typically different from physical ordering), leading to
it likely being a lot slower per page than a heapscan.  Not sure this
has anything to do with your test case though, since you said you'd
sized the index to fit in RAM.
        regards, tom lane


Re: Why is indexonlyscan so darned slow?

From
Josh Berkus
Date:
On 5/21/12 10:41 AM, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> Well, if it's not CPU costs, then something else is eating the time,
>> since I'm seeing per-tuple COUNT costs on indexes being 400% more than
>> on heap.
> 
> Well, I'm not: as I said, it looks like about 10% here.  Perhaps you're
> testing a cassert-enabled build?

Oh, right, now I get you.  It's not the per-tuple costs which matter,
it's the per-size costs.  Per-tuple costs are fairly similar, right.

> If the index is too big to fit in RAM, you'd be looking at random
> fetches of the index pages in most cases (since logical ordering of the
> index pages is typically different from physical ordering), leading to
> it likely being a lot slower per page than a heapscan.  Not sure this
> has anything to do with your test case though, since you said you'd
> sized the index to fit in RAM.

Right.  So what I'm trying to figure out is why counting an index which fits in RAM (and which, I've confirmed via EXPLAIN ( buffers on ), is not being heap-fetched or read from disk) would take 25% as long as counting a table which is 80% on disk.
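
As a cross-check on the caching assumption, the pg_buffercache contrib module can show how much of each relation is actually sitting in shared_buffers at a given moment (it says nothing about the OS cache, though).  Something like this, assuming the default 8kB block size:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;
SELECT c.relname, count(*) AS buffers,
       pg_size_pretty(count(*) * 8192) AS cached
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE c.relname IN ('pgbench_accounts', 'pgbench_accounts_pkey')
GROUP BY c.relname;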

I'll try comparing on-disk to on-disk speeds, as well as in-memory to in-memory speeds, and some non-count tests, as well as multicolumn covering indexes.  I just need to generate more complex test cases than I can get from pgbench first.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Why is indexonlyscan so darned slow?

From
Simon Riggs
Date:
On 21 May 2012 13:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> Well, if it's not CPU costs, then something else is eating the time,
>> since I'm seeing per-tuple COUNT costs on indexes being 400% more than
>> on heap.
>
> Well, I'm not: as I said, it looks like about 10% here.  Perhaps you're
> testing a cassert-enabled build?
>
>> In the airport you said something about index-only scan not scanning the
>> tuples in leaf page order.   Can you elaborate on that?
>
> If the index is too big to fit in RAM, you'd be looking at random
> fetches of the index pages in most cases (since logical ordering of the
> index pages is typically different from physical ordering), leading to
> it likely being a lot slower per page than a heapscan.  Not sure this
> has anything to do with your test case though, since you said you'd
> sized the index to fit in RAM.

As you point out, this is still an IndexScan even if the heap access is zero.

Surely the way to solve this is by having a new plan node that does a
physical SeqScan of the index relation. It means we wouldn't preserve
the sort order of the rows from the index, but that is just a plan
cost issue.

This is exactly what we do for VACUUM and it works faster there.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Why is indexonlyscan so darned slow?

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Surely the way to solve this is by having a new plan node that does a
> physical SeqScan of the index relation. It means we wouldn't preserve
> the sort order of the rows from the index, but that is just a plan
> cost issue.

> This is exactly what we do for VACUUM and it works faster there.

The reason that's okay for vacuum is that vacuum doesn't care if it
visits the same index tuple multiple times.  It will not work for real
queries, unless you would like to lock out all concurrent inserts.
        regards, tom lane


Re: Why is indexonlyscan so darned slow?

From
Jeff Janes
Date:
On Mon, May 21, 2012 at 10:44 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
> Right.  So what I'm trying to figure out is why counting an index which fits in RAM (and which, I've confirmed via EXPLAIN ( buffers on ), is not being heap-fetched or read from disk) would take 25% as long as counting a table which is 80% on disk.

Sequential disk reads are fast.  Parsing the data after it has been
read from disk is also fast, but not infinitely so.  If you can get
your IO system to be about 4 times faster, then you would start being
limited by CPU even on disk-based sequential scans.

Earlier you said that this should be an ideal setup for IOS.  But it isn't really--the ideal set up is one in which the alternative to an IOS is a regular index scan which makes many uncached scattered reads into the heap.  I don't think that situation can really be engineered with a where-less query.

Iterating over any non-trivial data structure with 20,000,000 entries
is going to take some time.  As way of comparison, iterating over a
Perl hash doing nothing but a counter increment takes several times
longer than a same-sized IOS count does.  (Of course you don't need to
iterate over a Perl hash to get the size, but just directly fetching
the size would not be a fair comparison)

Cheers,

Jeff


Re: Why is indexonlyscan so darned slow?

From
Simon Riggs
Date:
On 21 May 2012 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> Surely the way to solve this is by having a new plan node that does a
>> physical SeqScan of the index relation. It means we wouldn't preserve
>> the sort order of the rows from the index, but that is just a plan
>> cost issue.
>
>> This is exactly what we do for VACUUM and it works faster there.
>
> The reason that's okay for vacuum is that vacuum doesn't care if it
> visits the same index tuple multiple times.  It will not work for real
> queries, unless you would like to lock out all concurrent inserts.

I checked a little more and Oracle supports something called a Fast
Index Scan. Maybe there is a way.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Why is indexonlyscan so darned slow?

From
Josh Berkus
Date:
> Earlier you said that this should be an ideal setup for IOS.  But it
> isn't really--the ideal set up is one in which the alternative to an
> IOS is a regular index scan which makes many uncached scattered reads
> into the heap.  I don't think that that situation can't really be
> engineered with a where-less query.

Can you give me some suggested comparisons which *would* be ideal, then?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Why is indexonlyscan so darned slow?

From
Simon Riggs
Date:
On 21 May 2012 16:42, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Earlier you said that this should be an ideal setup for IOS.  But it
>> isn't really--the ideal set up is one in which the alternative to an
>> IOS is a regular index scan which makes many uncached scattered reads
>> into the heap.  I don't think that situation can really be
>> engineered with a where-less query.
>
> Can you give me some suggested comparisons which *would* be ideal, then?

Presumably the use case we were tuning for was established prior to
development of the patch?

Surely the performance test results that demonstrate the gain have
been posted to hackers, so we just need to refer back to them?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Why is indexonlyscan so darned slow?

From
Jeff Janes
Date:
On Mon, May 21, 2012 at 1:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Earlier you said that this should be an ideal setup for IOS.  But it
>> isn't really--the ideal set up is one in which the alternative to an
>> IOS is a regular index scan which makes many uncached scattered reads
>> into the heap.  I don't think that situation can really be
>> engineered with a where-less query.
>
> Can you give me some suggested comparisons which *would* be ideal, then?

Are you looking for vaguely real-life examples, or highly contrived
examples used to dissect the server?

For vaguely real life, take your example of pgbench -i -s200 -F 50,
and I have 2Gig RAM, which seems to be the same as you do.

With select only work load (pgbench -S -M prepared -T 30), I get

tps = 193

But now enable index-only scans:

psql -c "create index on pgbench_accounts(aid, abalance);"

and it goes up to.

tps = 10137
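
End to end, the comparison is roughly this (run against the default database; exact numbers will of course vary with hardware):

pgbench -i -s 200 -F 50             # 20M-row accounts table at fillfactor 50
pgbench -S -M prepared -T 30        # baseline: bottlenecked on random heap fetches
psql -c "create index on pgbench_accounts(aid, abalance)"
psql -c "vacuum analyze pgbench_accounts"    # make sure the visibility map is set
pgbench -S -M prepared -T 30        # now served by index-only scans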

Cheers,

Jeff


Re: Why is indexonlyscan so darned slow?

From
Merlin Moncure
Date:
On Mon, May 21, 2012 at 4:17 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, May 21, 2012 at 1:42 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>
>>> Earlier you said that this should be an ideal setup for IOS.  But it
>>> isn't really--the ideal set up is one in which the alternative to an
>>> IOS is a regular index scan which makes many uncached scattered reads
>>> into the heap.  I don't think that situation can really be
>>> engineered with a where-less query.
>>
>> Can you give me some suggested comparisons which *would* be ideal, then?
>
> Are you looking for vaguely real-life examples, or highly contrived
> examples used to dissect the server?
>
> For vaguely real life, take your example of pgbench -i -s200 -F 50,
> and I have 2Gig RAM, which seems to be the same as you do.
>
> With select only work load (pgbench -S -M prepared -T 30), I get
>
> tps = 193
>
> But now enable index-only scans:
>
> psql -c "create index on pgbench_accounts(aid, abalance);"
>
> and it goes up to.
>
> tps = 10137

Right -- the main driver here is that your index fits neatly in ram
and the heap does not -- so you're effectively measuring the
difference between a buffered and non-buffered access.  That's highly
contrived as you noted and unlikely to come up all *that* often in the
real world.

Generally though the real world wins (although the gains will be
generally less spectacular) are heavily i/o bound queries where the
indexed subset of data you want is nicely packed and the (non
clustered) heap records are all over the place.  By skipping the semi
random heap lookups you can see enormous speedups.  I figure 50-90%
improvement would be the norm there, but this is against queries that
are taking forever, being i/o bound.

See here: http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
for an 'in the wild' gripe about not having index-only scans.

merlin


Re: Why is indexonlyscan so darned slow?

From
Ants Aasma
Date:
On Tue, May 22, 2012 at 12:29 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Generally though the real world wins (although the gains will be
> generally less spectacular) are heavily i/o bound queries where the
> indexed subset of data you want is nicely packed and the (non
> clustered) heap records are all over the place.  By skipping the semi
> random heap lookups you can see enormous speedups.  I figure 50-90%
> improvement would be the norm there, but this is against queries that
> are taking forever, being i/o bound.

The heap fetch/visibility check overhead is also a problem for CPU
bound workloads. Example:

CREATE TABLE test AS SELECT x, (RANDOM()*1000000000) AS value FROM
generate_series(1,10000000) AS x;
CREATE INDEX ON test(value, x);
VACUUM ANALYZE test;

Then running the following pgbench script with 4G buffers:

\setrandom rangemin 1 1000000000
\set rangemax :rangemin + 1000000
SELECT MIN(x) FROM test WHERE value BETWEEN :rangemin AND :rangemax;

I get the following results:

bitmap scan: 106 tps
index scan: 146 tps
index only scan: 653 tps
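
(To replay that, save the three script lines above as e.g. minx.sql -- the file name is arbitrary -- and drive it with something like

pgbench -n -f minx.sql -T 30

flipping between plan types with the usual enable_bitmapscan / enable_indexscan / enable_indexonlyscan settings.)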

Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


Re: Why is indexonlyscan so darned slow?

From
Jeff Janes
Date:
On Mon, May 21, 2012 at 2:29 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Mon, May 21, 2012 at 4:17 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> For vaguely real life, take your example of pgbench -i -s200 -F 50,
>> and I have 2Gig RAM, which seems to be the same as you do.
>>
>> With select only work load (pgbench -S -M prepared -T 30), I get
>>
>> tps = 193
>>
>> But now enable index-only scans:
>>
>> psql -c "create index on pgbench_accounts(aid, abalance);"
>>
>> and it goes up to.
>>
>> tps = 10137
>
> Right -- the main driver here is that your index fits neatly in ram
> and the heap does not -- so you're effectively measuring the
> difference between a buffered and non-buffered access.  That's highly
> contrived as you noted and unlikely to come up all *that* often in the
> real world.

I don't think this one is highly contrived.  With the index being 10
fold smaller than the table, there is plenty of window for one to fit
in RAM and the other one not to in a variety of real world situations.
(Although I only get a 10 fold window because of -F 50.  I don't know why
Josh had that big of a difference in his original, unless he also used
a nondefault fill factor setting.  But of course many real world
tables will be wider than pgbench_accounts is.)

The highly contrived example useful for dissecting the implementation
would be to do:

set enable_seqscan=off;
set enable_indexonlyscan=off;
select count(*) from pgbench_accounts where aid is not null;

The key that I keep forgetting is the "where aid is not null".
Without that it uses the full table scan, even with enable_seqscan
off, rather than doing an ordinary index scan.


> Generally though the real world wins (although the gains will be
> generally less spectacular) are heavily i/o bound queries where the
> indexed subset of data you want is nicely packed and the (non
> clustered) heap records are all over the place.  By skipping the semi
> random heap lookups you can see enormous speedups.  I figure 50-90%
> improvement would be the norm there, but this is against queries that
> are taking forever, being i/o bound.
>
> See here: http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
> for an 'in the wild' gripe about not having index-only scans.

But without scripts to recreate the data with the right selectivities
and correlations, and to generate a long stream of appropriate query
parameterizations so that they don't become cached, that is just a
gripe and not an example.

I tried to reproduce the problem as stated, and couldn't make IOS be
useful because I couldn't make it be slow even without them.
Presumably I'm doing something wrong, but how could I tell what?  Have
we heard back on whether IOS was tried and proved useful to the
originator of that thread?

If you want an example where neither index nor table fit in memory,
then just bump up the scale to 2000---and get a machine with only 2G
of memory if you don't already have one :)

With the extra index in place:

with enable_indexonlyscan=off
tps = 155.1
with enable_indexonlyscan=on
tps = 453.7

It seems like that should have only doubled; I'm not sure why it did
more than double.  Maybe the index became better cached when the table
stopped competing with it for buffers.

If you leave the select-only world and go back to doing updates, then
that extra index is going to hurt you somewhat, but if the dominant
bottleneck is the rpm of your WAL drive, it might not be noticeable.

Cheers,

Jeff


Re: Why is indexonlyscan so darned slow?

From
Greg Stark
Date:
On Mon, May 21, 2012 at 9:37 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> This is exactly what we do for VACUUM and it works faster there.
>>
>> The reason that's okay for vacuum is that vacuum doesn't care if it
>> visits the same index tuple multiple times.  It will not work for real
>> queries, unless you would like to lock out all concurrent inserts.

Even if you didn't care about seeing duplicates (such as for instance
if you keep a hash table of seen tids to dedup) Vacuum's approach
wouldn't work because the technique it uses to detect concurrent page
splits to know that it has to go back and look for resulting child
pages only works if there's only one scanner at a time. Vacuum
actually marks each page with a vacuum generation number -- that
doesn't work if you want to allow multiple concurrent *scans*. Locking
out concurrent inserts just might even be conceivably tolerable for
some use cases but locking out concurrent read-only scans would really
be beyond the pale.

> I checked a little more and Oracle supports something called a Fast
> Index Scan. Maybe there is a way.

Oracle maintains their indexes differently. Since they do
transactional consistency at the block level and it applies to all
relations -- even indexes -- they see a consistent view of the entire
index. This is engineering. There's always a way but there's no free
lunch. They incur some overhead when they find a block in the index
and have to look up the old version of the block in the rollback
segment. In Postgres I suspect the kind of change that would be needed
would cost concurrency on inserts/updates in exchange for more
flexibility scanning the index.

--
greg


Re: Why is indexonlyscan so darned slow?

From
Tom Lane
Date:
On Tue, May 22, 2012 at 11:33 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, May 21, 2012 at 2:29 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> See here: http://www.devheads.net/database/postgresql/performance/index-all-necessary-columns-postgres-vs-mssql.htm
>> for an 'in the wild' gripe about not having index-only scans.
>
> But without scripts to recreate the data with the right selectivities
> and correlations, and to generate a long stream of appropriate query
> parameterizations so that they don't become cached, that is just a
> gripe and not an example.
>
> I tried to reproduce the problem as stated, and couldn't make IOS be
> useful because I couldn't make it be slow even without them.
> Presumably I'm doing something wrong, but how could I tell what?  Have
> we heard back on whether IOS was tried and proved useful to the
> originator of that thread?

nope. but the damning evidence was that non-IOS on sql server
performed on par with postgres on the OP's data.  (i also tried to
reproduce with similar results as you).

I bet i/o bound IOS will do better than 50% most of the time because
the 'tuples' are packed better than on a typical heap page, unless the
heap is well clustered around that particular index, resulting in less
random I/O.  This will directly translate to cpu efficiencies as
storage gets faster.  It's just an all around fabulous feature and,
like HOT, is something to really consider carefully when laying out a
schema.

merlin