Query runs fast or slow

Поиск
Список
Период
Сортировка
От felix-accts-pgsql@crowfix.com
Тема Query runs fast or slow
Дата
Msg-id 20060416164932.GA18947@crowfix.com
обсуждение исходный текст
Ответы Re: Query runs fast or slow  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
I have a benchmark test which runs a query very slowly under certain
circumstances.  I used Ethereal to capture the packet traffic, and
also enabled debug5 logging out of curiousity.  While the slow query
is in progress, there is no log or packet activity, but the cpu is
busy.  These packets are below; look for SLOW PACKETS HERE to skip all
this explanatory drudge.

This WHERE clause is fast under all conditions:

    ... AND (val.data = $2) AND ...

This WHERE clause is fast as a simple query, but is excruciatingly
slow as prepare / execute / fetch:

    ... AND (val.data > $2 AND val.data < $3) AND ...

My test program is in Perl and uses DBI/DBD::Pg.  Postgresql version
is 8.0.3 on a dual core dual opteron with 2G of RAM.  DBI is version
1.48.  DBD::Pg is version 1.42.  The OS is rPath Linux 2.6.15.

The test runs each SQL statement three times, first as a simple query
to preload caches, then as prepare / execute / fetch, and lastly as a
simple query again.

    $sth = $dbh->prepare(sql_with_placeholders);
    $dbh->selectall_arrayref(sql_with_values_substituted);
    $sth->execute(@values);
    $sth->fetchall_arrayref();
    $dbh->selectall_arrayref(sql_with_values_substituted);

I captured packet traffic and tailed the log while these were running.

Everything is fine except one query, which took 75 seconds to run,
when the others took 3 milliseconds.  During this 75 seconds, the
postmaster log showed no activity, but top showed the postmaster quite
busy.

75 seconds!  That's an eternity.  I can't imagine any circumstances
where it makes sense.  EXPLAIN ANALYZE doesn't show the slow timing
because it requires values, not $n placeholders, and it is the prepare
/ execute operation which is so slow.  I will be glad to send the log,
the packet capture file, the test program itself, and anything else
which helps.  Here are the EXPLAIN statements in case it helps.

EXPLAIN for the equality WHERE clause:

    felix=> explain analyze SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'x6ATArB_k1cgLp1mD5x2nzVVf2DQw4Lw1-Ow5NCzzs5Pupg6K'AND key.id = glue.key) AND (val.data = 357354306) AND val.id =
glue.valAND glue.sid = sid.id; 
                                                                       QUERY PLAN
                            

------------------------------------------------------------------------------------------------------------------------------------------------
     Nested Loop  (cost=5.82..1119.29 rows=1 width=60) (actual time=2.271..36.184 rows=1 loops=1)
       ->  Hash Join  (cost=5.82..1116.27 rows=1 width=16) (actual time=2.079..35.976 rows=1 loops=1)
             Hash Cond: ("outer"."key" = "inner".id)
             ->  Nested Loop  (cost=0.00..1105.43 rows=1001 width=16) (actual time=0.315..31.820 rows=1000 loops=1)
                   ->  Index Scan using val_data_key on val  (cost=0.00..6.01 rows=1 width=4) (actual time=0.119..0.123
rows=1loops=1) 
                         Index Cond: (data = 357354306)
                   ->  Index Scan using glue_val_idx on glue  (cost=0.00..702.58 rows=31747 width=16) (actual
time=0.181..24.438rows=1000 loops=1) 
                         Index Cond: ("outer".id = glue.val)
             ->  Hash  (cost=5.82..5.82 rows=1 width=4) (actual time=0.292..0.292 rows=0 loops=1)
                   ->  Index Scan using key_data_key on "key"  (cost=0.00..5.82 rows=1 width=4) (actual
time=0.266..0.271rows=1 loops=1) 
                         Index Cond: (data = 'x6ATArB_k1cgLp1mD5x2nzVVf2DQw4Lw1-Ow5NCzzs5Pupg6K'::text)
       ->  Index Scan using sid_pkey on sid  (cost=0.00..3.01 rows=1 width=52) (actual time=0.179..0.183 rows=1
loops=1)
             Index Cond: ("outer".sid = sid.id)
     Total runtime: 37.880 ms
    (14 rows)

EXPLAIN for the range WHERE clause:

    felix=> explain analyze SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'kOSkZ5iN6sz-KqGo51aTwqZnvCKQRUH2SZ8k'AND key.id = glue.key) AND (val.data > 183722006 AND val.data < 183722206) AND
val.id= glue.val AND glue.sid = sid.id; 
                                                                      QUERY PLAN
                           

-----------------------------------------------------------------------------------------------------------------------------------------------
     Nested Loop  (cost=5.82..1119.30 rows=1 width=60) (actual time=15.016..15.525 rows=1 loops=1)
       ->  Hash Join  (cost=5.82..1116.27 rows=1 width=16) (actual time=14.879..15.374 rows=1 loops=1)
             Hash Cond: ("outer"."key" = "inner".id)
             ->  Nested Loop  (cost=0.00..1105.43 rows=1001 width=16) (actual time=0.211..11.666 rows=1000 loops=1)
                   ->  Index Scan using val_data_key on val  (cost=0.00..6.01 rows=1 width=4) (actual time=0.071..0.076
rows=1loops=1) 
                         Index Cond: ((data > 183722006) AND (data < 183722206))
                   ->  Index Scan using glue_val_idx on glue  (cost=0.00..702.58 rows=31747 width=16) (actual
time=0.124..4.311rows=1000 loops=1) 
                         Index Cond: ("outer".id = glue.val)
             ->  Hash  (cost=5.82..5.82 rows=1 width=4) (actual time=0.119..0.119 rows=0 loops=1)
                   ->  Index Scan using key_data_key on "key"  (cost=0.00..5.82 rows=1 width=4) (actual
time=0.101..0.105rows=1 loops=1) 
                         Index Cond: (data = 'kOSkZ5iN6sz-KqGo51aTwqZnvCKQRUH2SZ8k'::text)
       ->  Index Scan using sid_pkey on sid  (cost=0.00..3.01 rows=1 width=52) (actual time=0.123..0.127 rows=1
loops=1)
             Index Cond: ("outer".sid = sid.id)
     Total runtime: 15.620 ms
    (14 rows)

Here are the relevant packets.  The program had pauses inserted so I
could take notes and more easily distinguish which packets went with
which operations, thus the packet times are only relevant within each
operation.  The database is on a test machine which was doing nothing
else during this test.

First the simple query to preload the cache.  It takes 120
milliseconds.  This is the equality WHERE clause.

    Packet 1 Time 0.000000
    Type: Simple query
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'x6ATArB_k1cgLp1mD5x2nzVVf2DQw4Lw1-Ow5NCzzs5Pupg6K'AND key.id = glue.key) AND (val.data = 357354306) AND val.id =
glue.valAND glue.sid = sid.id 

    Packet 3 Time 0.120243
    Type: Data row
        Data: 4551474B5473374B744B64475A52676D694B3841476F4152...
        Data: 3738
        Data: 3339393435
        Data: 3339393435303738

Then as prepare / execute / fetch.  It takes 3 milliseconds.

    Packet 5 Time 29.986702
    Type: Parse
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data = $1 AND key.id =
glue.key)AND (val.data = $2) AND val.id = glue.val AND glue.sid = sid.id 

    Packet 7 Time 29.987439
    Type: Parse completion

    Packet 9 Time 29.987509
    Type: Bind
    Data:
    783641544172425F6B3163674C70316D443578326E7A5656... (x6ATArB_...)
    Data: 333537333534333036 (357354306)

    Packet 10 Time 29.989138
    Type: Bind completion
    Type: Data row
        Data: 4551474B5473374B744B64475A52676D694B3841476F4152...
        Data: 3738
        Data: 3339393435
        Data: 3339393435303738

And finally a repeat as a simple query, for timing with cached data.
This also takes 3 milliseconds.

    Packet 12 Time 70.139935
    Type: Simple query
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'x6ATArB_k1cgLp1mD5x2nzVVf2DQw4Lw1-Ow5NCzzs5Pupg6K'AND key.id = glue.key) AND (val.data = 357354306) AND val.id =
glue.valAND glue.sid = sid.id 

    Packet 13 Time 70.142515
    Type: Data row
        Data: 4551474B5473374B744B64475A52676D694B3841476F4152...
        Data: 3738
        Data: 3339393435
        Data: 3339393435303738

So far so good.  Then I switched to the range WHERE clause, and things
don't look right.

================ SLOW PACKETS HERE ================

First, the simple query to preload the cache.  It takes 117 milliseconds.

    Packet 18 Time 100.342829
    Type: Simple query
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'kOSkZ5iN6sz-KqGo51aTwqZnvCKQRUH2SZ8k'AND key.id = glue.key) AND (val.data > 183722006 AND val.data < 183722206) AND
val.id= glue.val AND glue.sid = sid.id 

    Packet 20 Time 100.460555
    Type: Data row
        Data: 4E396538585A3338455A35574642786A5339347266515161...
        Data: 393637
        Data: 3539373539
        Data: 3539373539393637

Then the prepare / execute / fetch query, which takes 75 seconds.

    Packet 22 Time 119.563438
    Type: Parse
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data = $1 AND key.id =
glue.key)AND (val.data > $2 AND val.data < $3) AND val.id = glue.val AND glue.sid = sid.id 

    Packet 24 Time 119.564242
    Type: Parse completion

    Packet 26 Time 119.564306
    Type: Bind
        Data: 6B4F536B5A35694E36737A2D4B71476F3531615477715A6E...
        Data: 313833373232303036
        Data: 313833373232323036

    Packet 28 Time 194.052914
    Type: Bind completion
    Type: Data row
        Data: 4E396538585A3338455A35574642786A5339347266515161...
        Data: 393637
        Data: 3539373539
        Data: 3539373539393637

And finally the simple query again, which takes 3 milliseconds.

    Packet 30 Time 219.762523
    Type: Simple query
    Query: SELECT sid.data, glue.key, glue.val, glue.sid FROM key, val, sid, glue WHERE (key.data =
'kOSkZ5iN6sz-KqGo51aTwqZnvCKQRUH2SZ8k'AND key.id = glue.key) AND (val.data > 183722006 AND val.data < 183722206) AND
val.id= glue.val AND glue.sid = sid.id 

    Packet 32 Time 219.765267
    Type: Data row
        Data: 4E396538585A3338455A35574642786A5339347266515161...
        Data: 393637
        Data: 3539373539
        Data: 3539373539393637

--
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / felix@crowfix.com
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o

В списке pgsql-general по дате отправления:

Предыдущее
От: felix-accts-pgsql@crowfix.com
Дата:
Сообщение: Re: Asking advice on speeding up a big table
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Query runs fast or slow