Discussion: planner costs in "warm cache" tests

planner costs in "warm cache" tests

From: Jesper Krogh

Hi.

I'm trying to get the planner to do roughly the right thing
when choosing between index scans on btree indices and
bitmap heap scans.

There have been several things going on in parallel. One
is that the statistics data is off:
http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/141420/focus=141735

The other is that the cost estimates (number of pages
to read) are inaccurate on GIN indices:
http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php
A solution for that is also on the way, which I'm testing out.

I was trying to nullify the problem with the wrongly estimated number
of pages to read and see if "the rest" seems to work as expected.

The theory was that if I set "seq_page_cost" and "random_page_cost" to
something "really low" (0 is not permitted) and ran tests
on a fully cached query (where both costs indeed are "really low"),
then the "cheapest" plan should indeed also be the fastest one.
Let me know if the logic is flawed.

The test dataset is 1365462 documents, running pg9.0b1; both queries are
run twice to verify that the data actually is fully cached as expected.
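
For context, the plans below imply roughly the following schema; the column
types and index definitions here are guesses reconstructed from the plan node
names, not the actual DDL:

-- Hypothetical reconstruction -- only the names come from the plans below,
-- everything else is an assumption.
CREATE TABLE testdb.reference (
    id                integer,
    accession_number  text,
    document_tsvector tsvector
);
-- btree index that supplies the ORDER BY accession_number
CREATE INDEX ref_acc_idx ON testdb.reference (accession_number);
-- GIN full-text index used by the bitmap scan
CREATE INDEX reference_fts_idx ON testdb.reference USING gin (document_tsvector);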

testdb=# set seq_page_cost = 0.00001;
SET
testdb=# set random_page_cost = 0.00001;
SET
testdb=# set enable_indexscan = on;
SET
testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..432.82 rows=200 width=11) (actual time=831.456..2167.302 rows=200 loops=1)
    ->  Index Scan using ref_acc_idx on reference  (cost=0.00..61408.12 rows=28376 width=11) (actual time=831.451..2166.434 rows=200 loops=1)
          Filter: (document_tsvector @@ to_tsquery('literature'::text))
  Total runtime: 2167.982 ms
(4 rows)

testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..432.82 rows=200 width=11) (actual time=842.990..2187.393 rows=200 loops=1)
    ->  Index Scan using ref_acc_idx on reference  (cost=0.00..61408.12 rows=28376 width=11) (actual time=842.984..2186.540 rows=200 loops=1)
          Filter: (document_tsvector @@ to_tsquery('literature'::text))
  Total runtime: 2188.083 ms
(4 rows)

testdb=# set enable_indexscan = off;
SET
testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2510.68..2511.18 rows=200 width=11) (actual time=270.016..270.918 rows=200 loops=1)
    ->  Sort  (cost=2510.68..2581.62 rows=28376 width=11) (actual time=270.011..270.321 rows=200 loops=1)
          Sort Key: accession_number
          Sort Method:  top-N heapsort  Memory: 34kB
          ->  Bitmap Heap Scan on reference  (cost=219.94..1284.29 rows=28376 width=11) (actual time=13.897..216.700 rows=21613 loops=1)
                Recheck Cond: (document_tsvector @@ to_tsquery('literature'::text))
                ->  Bitmap Index Scan on reference_fts_idx  (cost=0.00..212.85 rows=28376 width=0) (actual time=10.053..10.053 rows=21613 loops=1)
                      Index Cond: (document_tsvector @@ to_tsquery('literature'::text))
  Total runtime: 271.323 ms
(9 rows)

testdb=# explain analyze select id from testdb.reference where
document_tsvector @@ to_tsquery('literature') order by accession_number
limit 200;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=2510.68..2511.18 rows=200 width=11) (actual time=269.881..270.782 rows=200 loops=1)
    ->  Sort  (cost=2510.68..2581.62 rows=28376 width=11) (actual time=269.876..270.182 rows=200 loops=1)
          Sort Key: accession_number
          Sort Method:  top-N heapsort  Memory: 34kB
          ->  Bitmap Heap Scan on reference  (cost=219.94..1284.29 rows=28376 width=11) (actual time=14.113..216.173 rows=21613 loops=1)
                Recheck Cond: (document_tsvector @@ to_tsquery('literature'::text))
                ->  Bitmap Index Scan on reference_fts_idx  (cost=0.00..212.85 rows=28376 width=0) (actual time=10.360..10.360 rows=21613 loops=1)
                      Index Cond: (document_tsvector @@ to_tsquery('literature'::text))
  Total runtime: 271.533 ms
(9 rows)


So in the situation where I have tried to "nullify" the actual
disc cost, hopefully leaving only the cpu and other costs, and running
the query in fully cached mode (two runs), the bitmap heap scan
is still hugely favorable in actual runtime (which isn't that much of a
surprise), but it seems strange that the index scan is still favored
in the cost calculations?

I have tried to alter the cost of ts_match_vq but even setting it to
1000 does not change the overall picture.
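
Concretely, such a change would look something like this (a sketch:
ts_match_vq(tsvector, tsquery) is the support function behind the
tsvector @@ tsquery operator, and a function's COST is measured in units
of cpu_operator_cost):

-- Charge far more per evaluation of the @@ filter; 1000 is just the
-- value mentioned above.
ALTER FUNCTION ts_match_vq(tsvector, tsquery) COST 1000;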

Is the approach simply too naive?

--
Jesper


Re: planner costs in "warm cache" tests

From: Tom Lane

Jesper Krogh <jesper@krogh.cc> writes:
> testdb=# set seq_page_cost = 0.00001;
> SET
> testdb=# set random_page_cost = 0.00001;
> SET

Well, hmm, I really doubt that that represents reality either.  A page
access is by no means "free" even when the page is already in cache.
I don't recall anyone suggesting that you set these numbers to less
than perhaps 0.01.

In the case at hand, the problem is that the planner prefers an indexscan
over an after-the-fact sort to obtain the specified result
ordering.  Making page fetches look too cheap definitely plays into
that.  There may also be a statistical problem, if the location of the
desired records isn't independent of the accession_number ordering, but
you're not doing yourself any favors by pushing the planner cost
parameters several orders of magnitude outside the design envelope.
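
One way to probe that statistical question, reusing the table and query from
the original post (a sketch, not something run in this thread): if the rows
matching 'literature' are clustered toward the end of the accession_number
order instead of being spread evenly, the ordered index scan with LIMIT will
be far more expensive than its cost suggests.

-- Count matching rows per decile of the accession_number ordering;
-- the LIMIT costing effectively assumes an even spread.
SELECT bucket, count(*) AS matching_rows
FROM (SELECT ntile(10) OVER (ORDER BY accession_number) AS bucket,
             document_tsvector @@ to_tsquery('literature') AS matches
      FROM testdb.reference) s
WHERE matches
GROUP BY bucket
ORDER BY bucket;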

            regards, tom lane

Re: planner costs in "warm cache" tests

From: Jesper Krogh

On 2010-05-30 20:34, Tom Lane wrote:
> Jesper Krogh<jesper@krogh.cc>  writes:
>
>> testdb=# set seq_page_cost = 0.00001;
>> SET
>> testdb=# set random_page_cost = 0.00001;
>> SET
>>
> Well, hmm, I really doubt that that represents reality either.  A page
> access is by no means "free" even when the page is already in cache.
> I don't recall anyone suggesting that you set these numbers to less
> than perhaps 0.01.
>
>
Thank you for the prompt response. Is it a "false assumption" that the
cost should in some metric between different plans be a measurement
of actual run-time in a dead-disk run?

It should most likely match a typical workload situation, but that is
really hard to say anything about, so my "feeling" would be that the
dead-disk case is the closest?

--
Jesper

Re: planner costs in "warm cache" tests

From: Tom Lane

Jesper Krogh <jesper@krogh.cc> writes:
> On 2010-05-30 20:34, Tom Lane wrote:
>> Well, hmm, I really doubt that that represents reality either.  A page
>> access is by no means "free" even when the page is already in cache.
>> I don't recall anyone suggesting that you set these numbers to less
>> than perhaps 0.01.
>>
> Thank you for the prompt response. Is it a "false assumption" that the
> cost should in some metric between different plans be a measurement
> of actual run-time in a dead-disk run?

Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
are intended to model the non-cached state where most page fetches
actually do require a disk access.  They are definitely too large
relative to the cpu_xxx_cost parameters when you have a fully-cached
database, but what I've seen people recommending for that condition
is to set them both to the same value in the vicinity of 0.1 or 0.01
or so.  If it's only mostly cached you might try intermediate settings.
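
In SQL terms that would look something like the following (a sketch using the
values mentioned above; the same parameters can also be set in postgresql.conf):

-- Fully-cached database: page fetches are cheap but not free,
-- and with no seek penalty the two costs can be kept equal.
SET seq_page_cost = 0.01;
SET random_page_cost = 0.01;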

            regards, tom lane

Re: planner costs in "warm cache" tests

From: Scott Carey

It is still best to have random_page_cost to be slightly larger (~50%) than sequential_page_cost, because even when
entirely in RAM, sequential reads are faster than random reads.  Today's CPU's do memory prefetching on sequential
access.  Perhaps try something like 0.3 and 0.2, or half that.  You still don't want it to gratuitously scan a lot of
RAM -- reading a page is not free and can kick out other pages from shared_buffers.
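
As a concrete sketch of that suggestion (the ratio is the point rather than
the absolute numbers; ALTER DATABASE is just one way to make it stick for a
single database, and it only affects new sessions):

-- Mostly/fully cached, but keep random access slightly more expensive
-- than sequential, per the suggestion above.
ALTER DATABASE testdb SET seq_page_cost = 0.2;
ALTER DATABASE testdb SET random_page_cost = 0.3;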


On May 31, 2010, at 12:55 PM, Tom Lane wrote:

> Jesper Krogh <jesper@krogh.cc> writes:
>> On 2010-05-30 20:34, Tom Lane wrote:
>>> Well, hmm, I really doubt that that represents reality either.  A page
>>> access is by no means "free" even when the page is already in cache.
>>> I don't recall anyone suggesting that you set these numbers to less
>>> than perhaps 0.01.
>>>
>> Thank you for the prompt response. Is it a "false assumption" that the
>> cost should in some metric between different plans be a measurement
>> of actual run-time in a dead-disk run?
>
> Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
> are intended to model the non-cached state where most page fetches
> actually do require a disk access.  They are definitely too large
> relative to the cpu_xxx_cost parameters when you have a fully-cached
> database, but what I've seen people recommending for that condition
> is to set them both to the same value in the vicinity of 0.1 or 0.01
> or so.  If it's only mostly cached you might try intermediate settings.
>
>             regards, tom lane


Re: planner costs in "warm cache" tests

From: Tom Lane

Scott Carey <scott@richrelevance.com> writes:
> It is still best to have random_page_cost to be slightly larger (~50%)
> than sequential_page_cost, because even when entirely in RAM,
> sequential reads are faster than random reads.  Today's CPU's do
> memory prefetching on sequential access.

Do you have any actual evidence of that?  Because I don't believe it.
Neither PG nor any kernel that I've ever heard of makes any effort to
ensure that logically sequential blocks occupy physically sequential
buffers, so even if the CPU tries to do some prefetching, it's not
going to help at all.

Now, if the database isn't entirely cached, then indeed it's a good
idea to keep random_page_cost higher than seq_page_cost.  But that's
because of the actual disk fetches, not anything that happens in RAM.

            regards, tom lane

Re: planner costs in "warm cache" tests

From: Robert Haas

On Mon, May 31, 2010 at 3:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jesper Krogh <jesper@krogh.cc> writes:
>> On 2010-05-30 20:34, Tom Lane wrote:
>>> Well, hmm, I really doubt that that represents reality either.  A page
>>> access is by no means "free" even when the page is already in cache.
>>> I don't recall anyone suggesting that you set these numbers to less
>>> than perhaps 0.01.
>>>
>> Thank you for the prompt response. Is it a "false assumption" that the
>> cost should in some metric between different plans be a measurement
>> of actual run-time in a dead-disk run?
>
> Well, the default cost parameters (seq_page_cost=1, random_page_cost=4)
> are intended to model the non-cached state where most page fetches
> actually do require a disk access.  They are definitely too large
> relative to the cpu_xxx_cost parameters when you have a fully-cached
> database, but what I've seen people recommending for that condition
> is to set them both to the same value in the vicinity of 0.1 or 0.01
> or so.  If it's only mostly cached you might try intermediate settings.

I have had to set it as low as .005 to get the right things to happen.
Could have been a fluke, I suppose.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company