Обсуждение: How to track down inconsistent performance?

Поиск
Список
Период
Сортировка

How to track down inconsistent performance?

От
Ron Snyder
Дата:
We've got some queries that occasionally experience long run times (> 1
minute), and sometimes they're very quick (< 1 second).  Our theory is that
when the queries are fast it's because all the "right stuff" is already in
memory, but we don't know how to actually prove that. (We think we've
successfully ruled out disk contention -- iostat shows disk activity during
these queries, but nothing that's excessive.)

We've turned on statistics collection-- does anybody have any hints about
what things we should specifically be looking for?

Here's what explain says:
bash-2.05$ time psql quickview -c "explain select distinct
os,compiler,stdlibtype,threadlib from builds where product='sourcepro_db'
and visible=true order by 1 asc;"
NOTICE:  QUERY PLAN:

Unique  (cost=197093.74..197588.02 rows=4943 width=50)
  ->  Sort  (cost=197093.74..197093.74 rows=49428 width=50)
        ->  Index Scan using builds_visible_product on builds
(cost=0.00..192225.34 rows=49428 width=50)

quickview=> \d builds
                      Table "builds"
      Column       |           Type           | Modifiers
-------------------+--------------------------+-----------
 id                | integer                  | not null
 visible           | boolean                  |
 state             | character(1)             |
 evaluated         | boolean                  |
 product           | character varying(30)    |
 compiler          | character varying(30)    |
 os                | character varying(30)    |
 stdlibtype        | character varying(30)    |
 linktype          | character varying(30)    |
 threadlib         | character varying(30)    |
 exportlevel       | character varying(30)    |
 usermode          | character varying(30)    |
 postbuildclean    | character varying(30)    |
 prebuildclean     | character varying(30)    |
 submitted         | timestamp with time zone |
 started           | timestamp with time zone |
 finished          | timestamp with time zone |
 machine           | character varying(100)   |
 errors            | integer                  |
 warnings          | integer                  |
 testsattempted    | integer                  |
 testspassed       | integer                  |
 testsfailed       | integer                  |
 examplesattempted | integer                  |
 examplespassed    | integer                  |
 examplesfailed    | integer                  |
 ping              | timestamp with time zone |
 start_count       | integer                  |
 user1             | character varying(50)    |
 user2             | character varying(50)    |
 user3             | character varying(50)    |
 user4             | character varying(50)    |
 user5             | character varying(50)    |
 user6             | character varying(50)    |
 debug             | character varying(30)    |
Indexes: builds_compiler,
         builds_compiler_finished,
         builds_compiler_state,
         builds_compiler_submitted,
         builds_machine,
         builds_machine_finished,
         builds_machine_state,
         builds_machine_submitted,
         builds_os,
         builds_os_finished,
         builds_os_state,
         builds_os_submitted,
         builds_ping_finished,
         builds_pr_os_comp_std_thr_u1,
         builds_product,
         builds_product_finished,
         builds_product_state,
         builds_product_submitted,
         builds_started,
         builds_state,
         builds_visible_finished_product,
         builds_visible_product
Primary key: builds_pkey
Triggers: RI_ConstraintTrigger_25192277,
          RI_ConstraintTrigger_25192279



Re: How to track down inconsistent performance?

От
Ron Snyder
Дата:
Well, perhaps I'm on the right track--
quickview=> select indexrelname,idx_blks_read,idx_blks_hit from
pg_statio_user_indexes where indexrelname='builds_visible_product';
      indexrelname      | idx_blks_read | idx_blks_hit
------------------------+---------------+--------------
 builds_visible_product |        625760 |       522572
(1 row)

Based on the numbers that the other indices report, I'm guessing that
idx_blks_read is "cache misses"-- if so, it's missing over 50% of the time.
If this is in fact what's causing the problems, then maybe there's a way to
reduce the size of the index?

And the index definition:
quickview=> select * from pg_indexes where
indexname='builds_visible_product';
 tablename |       indexname        |
indexdef
-----------+------------------------+---------------------------------------
---------------------------------------
 builds    | builds_visible_product | CREATE INDEX builds_visible_product ON
builds USING btree (visible, product)
(1 row)

'visible' is a boolean, 'product' is a varchar(30), and there are about 210K
records in the builds table. (I don't know if it's relevant, but there are
about 39 distinct product values.

-ron

> -----Original Message-----
> From: Ron Snyder [mailto:snyder@roguewave.com]
> Sent: Saturday, April 27, 2002 5:01 PM
> To: pgsql General List
> Subject: [GENERAL] How to track down inconsistent performance?
>
>
> We've got some queries that occasionally experience long run
> times (> 1
> minute), and sometimes they're very quick (< 1 second).  Our
> theory is that
> when the queries are fast it's because all the "right stuff"
> is already in
> memory, but we don't know how to actually prove that. (We think we've
> successfully ruled out disk contention -- iostat shows disk
> activity during
> these queries, but nothing that's excessive.)
>
> We've turned on statistics collection-- does anybody have any
> hints about
> what things we should specifically be looking for?
>
> Here's what explain says:
> bash-2.05$ time psql quickview -c "explain select distinct
> os,compiler,stdlibtype,threadlib from builds where
> product='sourcepro_db'
> and visible=true order by 1 asc;"
> NOTICE:  QUERY PLAN:
>
> Unique  (cost=197093.74..197588.02 rows=4943 width=50)
>   ->  Sort  (cost=197093.74..197093.74 rows=49428 width=50)
>         ->  Index Scan using builds_visible_product on builds
> (cost=0.00..192225.34 rows=49428 width=50)
>
> quickview=> \d builds
>                       Table "builds"
>       Column       |           Type           | Modifiers
> -------------------+--------------------------+-----------
>  id                | integer                  | not null
>  visible           | boolean                  |
>  state             | character(1)             |
>  evaluated         | boolean                  |
>  product           | character varying(30)    |
>  compiler          | character varying(30)    |
>  os                | character varying(30)    |
>  stdlibtype        | character varying(30)    |
>  linktype          | character varying(30)    |
>  threadlib         | character varying(30)    |
>  exportlevel       | character varying(30)    |
>  usermode          | character varying(30)    |
>  postbuildclean    | character varying(30)    |
>  prebuildclean     | character varying(30)    |
>  submitted         | timestamp with time zone |
>  started           | timestamp with time zone |
>  finished          | timestamp with time zone |
>  machine           | character varying(100)   |
>  errors            | integer                  |
>  warnings          | integer                  |
>  testsattempted    | integer                  |
>  testspassed       | integer                  |
>  testsfailed       | integer                  |
>  examplesattempted | integer                  |
>  examplespassed    | integer                  |
>  examplesfailed    | integer                  |
>  ping              | timestamp with time zone |
>  start_count       | integer                  |
>  user1             | character varying(50)    |
>  user2             | character varying(50)    |
>  user3             | character varying(50)    |
>  user4             | character varying(50)    |
>  user5             | character varying(50)    |
>  user6             | character varying(50)    |
>  debug             | character varying(30)    |
> Indexes: builds_compiler,
>          builds_compiler_finished,
>          builds_compiler_state,
>          builds_compiler_submitted,
>          builds_machine,
>          builds_machine_finished,
>          builds_machine_state,
>          builds_machine_submitted,
>          builds_os,
>          builds_os_finished,
>          builds_os_state,
>          builds_os_submitted,
>          builds_ping_finished,
>          builds_pr_os_comp_std_thr_u1,
>          builds_product,
>          builds_product_finished,
>          builds_product_state,
>          builds_product_submitted,
>          builds_started,
>          builds_state,
>          builds_visible_finished_product,
>          builds_visible_product
> Primary key: builds_pkey
> Triggers: RI_ConstraintTrigger_25192277,
>           RI_ConstraintTrigger_25192279
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
http://archives.postgresql.org

Re: How to track down inconsistent performance?

От
Tom Lane
Дата:
Ron Snyder <snyder@roguewave.com> writes:
> 'visible' is a boolean, 'product' is a varchar(30), and there are about 210K
> records in the builds table. (I don't know if it's relevant, but there are
> about 39 distinct product values.

Hmm, a rough guess would suggest that the query will be retrieving about
1/78th of the rows (unless there are statistics you haven't mentioned that
affect that).  I would think that a seqscan would be faster for this.
I would definitely think that the planner would think so --- are you
forcing enable_seqscan off?

Since you're evidently running 7.2, I'd ask for EXPLAIN ANALYZE rather
than just EXPLAIN output.  It'd be even more helpful if you are able
to catch both the fast and slow cases under EXPLAIN ANALYZE.

            regards, tom lane

Re: How to track down inconsistent performance?

От
Ron Snyder
Дата:
I learned last night that 'visible' is false in about 95% of the records.
The products are not evenly distributed either-- sourcepro_db (the previous
example) accounts for 1/2 of the records, and math (today's example) account
for 3%.

Here's the output of explain analyze-- these were executed less than 30
seconds apart.
These are: "explain analyze" the query; do the query; "explain analyze" the
query again.

bash-2.05$ time psql quickview -c "explain analyze select distinct
os,compiler,stdlibtype,threadlib from builds where product='math' and
visible=true order by 1 asc;"
NOTICE:  QUERY PLAN:

Unique  (cost=33579.50..33664.14 rows=846 width=50) (actual
time=27453.33..27478.75 rows=202 loops=1)
  ->  Sort  (cost=33579.50..33579.50 rows=8463 width=50) (actual
time=27453.32..27462.24 rows=5934 loops=1)
        ->  Index Scan using builds_visible_product on builds
(cost=0.00..32868.90 rows=8463 width=50) (actual time=0.30..27266.32
rows=5934 loops=1)
Total runtime: 27491.93 msec

EXPLAIN

real    0m27.528s
user    0m0.000s
sys     0m0.000s
bash-2.05$ time psql quickview -c "select distinct
os,compiler,stdlibtype,threadlib from builds where product='math' and
visible=true order by 1 asc;" > /dev/null

real    0m0.330s
user    0m0.000s
sys     0m0.010s
bash-2.05$ time psql quickview -c "explain analyze select distinct
os,compiler,stdlibtype,threadlib from builds where product='math' and
visible=true order by 1 asc;"
NOTICE:  QUERY PLAN:

Unique  (cost=33579.50..33664.14 rows=846 width=50) (actual
time=270.59..297.45 rows=202 loops=1)
  ->  Sort  (cost=33579.50..33579.50 rows=8463 width=50) (actual
time=270.58..280.75 rows=5934 loops=1)
        ->  Index Scan using builds_visible_product on builds
(cost=0.00..32868.90 rows=8463 width=50) (actual time=0.08..90.87 rows=5934
loops=1)
Total runtime: 307.74 msec

EXPLAIN

real    0m0.336s
user    0m0.000s
sys     0m0.010s

Thanks for your help!
-ron

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Sunday, April 28, 2002 7:59 AM
> To: Ron Snyder
> Cc: pgsql General List
> Subject: Re: [GENERAL] How to track down inconsistent performance?
>
>
> Ron Snyder <snyder@roguewave.com> writes:
> > 'visible' is a boolean, 'product' is a varchar(30), and
> there are about 210K
> > records in the builds table. (I don't know if it's
> relevant, but there are
> > about 39 distinct product values.
>
> Hmm, a rough guess would suggest that the query will be
> retrieving about
> 1/78th of the rows (unless there are statistics you haven't
> mentioned that
> affect that).  I would think that a seqscan would be faster for this.
> I would definitely think that the planner would think so --- are you
> forcing enable_seqscan off?
>
> Since you're evidently running 7.2, I'd ask for EXPLAIN ANALYZE rather
> than just EXPLAIN output.  It'd be even more helpful if you are able
> to catch both the fast and slow cases under EXPLAIN ANALYZE.
>
>             regards, tom lane
>

Re: How to track down inconsistent performance?

От
Ron Snyder
Дата:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> affect that).  I would think that a seqscan would be faster for this.
> I would definitely think that the planner would think so --- are you
> forcing enable_seqscan off?

Forgot to answer this-- no, we're not forcing it off.  Maybe we need to
force it to use a seqscan for this query?


> Since you're evidently running 7.2, I'd ask for EXPLAIN ANALYZE rather

We are still running 7.2 on this database-- we're in the middle of
acceptance testing for the 7.2.1 upgrade.

-ron


Re: How to track down inconsistent performance?

От
Tom Lane
Дата:
Ron Snyder <snyder@roguewave.com> writes:
>> affect that).  I would think that a seqscan would be faster for this.
>> I would definitely think that the planner would think so --- are you
>> forcing enable_seqscan off?

> Forgot to answer this-- no, we're not forcing it off.  Maybe we need to
> force it to use a seqscan for this query?

It would be interesting to try that just for comparison purposes.
I'd like to know the difference in the planner's cost estimate,
as well as the actual runtime.

Your results do make it look like the difference between "fast" and
"slow" cases is just whether the table and index data are in disk buffer
cache or not.  What shared_buffers setting are you using for Postgres?
Have you tried experimenting with other values?  What's the total RAM
in the box?  The ultimate answer might just be "you need to buy more RAM" ...

            regards, tom lane

Re: How to track down inconsistent performance?

От
Ron Snyder
Дата:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

> >> I would definitely think that the planner would think so
> --- are you
> >> forcing enable_seqscan off?
>
> > Forgot to answer this-- no, we're not forcing it off.
> > Maybe we need to
> > force it to use a seqscan for this query?
>
> It would be interesting to try that just for comparison purposes.
> I'd like to know the difference in the planner's cost estimate,
> as well as the actual runtime.

Well, assuming that I did this correctly (just turning off indexscan), the
results are below (I ran it twice to see what benefits from having some it
it in memory might have, but the results were almost identical):

bash-2.05$ time psql quickview -c "set enable_indexscan=off; explain analyze
select distinct os,compiler,stdlibtype,threadlib from builds where
product='math' and visible=true order by 1 asc;"
NOTICE:  QUERY PLAN:

Unique  (cost=273438.18..273522.81 rows=846 width=50) (actual
time=140567.87..140593.49 rows=202 loops=1)
  ->  Sort  (cost=273438.18..273438.18 rows=8463 width=50) (actual
time=140567.86..140576.85 rows=5934 loops=1)
        ->  Seq Scan on builds  (cost=0.00..272727.58 rows=8463 width=50)
(actual time=56.38..140347.16 rows=5934 loops=1)
Total runtime: 140620.56 msec

EXPLAIN

real    2m20.672s
user    0m0.010s
sys     0m0.000s

>
> Your results do make it look like the difference between "fast" and
> "slow" cases is just whether the table and index data are in
> disk buffer
> cache or not.  What shared_buffers setting are you using for Postgres?
> Have you tried experimenting with other values?  What's the total RAM
> in the box?  The ultimate answer might just be "you need to
> buy more RAM" ...

Shared_buffers is set to 8096, the machine has 1 gig of ram. I don't know if
it matters, but there are about 170 connections at any one time, although
most of them are idle at any moment.  About 10 of them are typically idle
inside of a transaction (if that matters).  Does the following information
help?:

bash-2.05$ psql quickview -c "select relid,indexrelname from
pg_statio_user_indexes where indexrelname='builds_visible_product';"
  relid   |      indexrelname
----------+------------------------
 25192249 | builds_visible_product
(1 row)

bash-2.05$ find . -type f -name 25192249 -exec ls -ald {} \;
-rw-------    1 pgqv     postgres 1073741824 Apr 28 15:43
./base/23527426/25192249

We do have more memory on order, but I'd like to know of a more definite
method of predicting what the settings and memory need to be. (based on the
size of the builds table in the filesystem above, I'm guessing that I've got
my answer.)

If we can segment the data so that the builds table isn't so big, could that
lead to a more efficient use of memory? What about deleting some indices
that exist but are not being used?

-ron

>
>             regards, tom lane
>