Обсуждение: CLUSTER and a problem

Поиск
Список
Период
Сортировка

CLUSTER and a problem

От
Andrzej Zawadzki
Дата:
Hi!

Yesterday I Clustered one big table (# CLUSTER kredyty USING kredyty_pkey;)
and today one query is extremely slow.

query:
SELECT telekredytid FROM kredytyag
WHERE TRUE
AND kredytyag.id = 3064776
AND NOT EXISTS
    (
    SELECT 1 FROM
        (
            SELECT * FROM kredyty kr
            where telekredytid = 328652
            ORDER BY kr.datazaw DESC LIMIT 1
    )
    kred where kred.bank = 2)

Plan looks strange for me:

"Result  (cost=701.54..709.84 rows=1 width=4)"
"  One-Time Filter: (NOT $0)"
"  InitPlan"
"    ->  Subquery Scan kred  (cost=0.00..701.54 rows=1 width=0)"
"          Filter: (kred.bank = 2)"
"          ->  Limit  (cost=0.00..701.52 rows=1 width=3902)"
"                ->  Index Scan Backward using kredyty_datazaw on
kredyty kr  (cost=0.00..1067719.61 rows=1522 width=3902)"
"                      Filter: (telekredytid = 328652)"
"  ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..8.30
rows=1 width=4)"
"        Index Cond: (id = 3064776)"

This Index skan on kredyty_datazaw and filter telekredytid cost a lot
of... but why not use kredyty_telekredytid_idx?

Before Cluster was (or similar):

"Result  (cost=78.98..85.28 rows=1 width=4)"
"  One-Time Filter: (NOT $0)"
"  InitPlan 1 (returns $0)"
"    ->  Subquery Scan kred  (cost=78.97..78.98 rows=1 width=0)"
"          Filter: (kred.bank = 2)"
"          ->  Limit  (cost=78.97..78.97 rows=1 width=3910)"
"                ->  Sort  (cost=78.97..79.20 rows=94 width=3910)"
"                      Sort Key: kr.datazaw"
"                      ->  Index Scan using kredyty_telekredytid_idx on
kredyty kr  (cost=0.00..78.50 rows=94 width=3910)"
"                            Index Cond: (telekredytid = 328652)"
"  ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..6.30
rows=1 width=4)"
"        Index Cond: (id = 3064776)"

I've chosen bad index?

--
Andrzej Zawadzki

Re: CLUSTER and a problem

От
Josh Berkus
Дата:
Andrzej,

Please post a table & index schema, and an EXPLAIN ANALYZE rather than
just an EXPLAIN.  Thanks!

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com

Re: CLUSTER and a problem

От
Andrzej Zawadzki
Дата:
Josh Berkus wrote:
> Andrzej,
>
> Please post a table & index schema, and an EXPLAIN ANALYZE rather than
> just an EXPLAIN.  Thanks!
>
EXPLAIN ANALYZE is taking too much time ;-) but now database is free so:

# EXPLAIN ANALYZE SElect telekredytid from kredytyag
WHERE TRUE
AND kredytyag.id = 3064776
AND NOT EXISTS
(SELECT 1 FROM
( SELECT * FROM kredyty kr
where telekredytid = 328650
ORDER BY kr.datazaw DESC LIMIT 1 )
kred where kred.bank = 2);

QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------
 Result  (cost=778.06..786.36 rows=1 width=4) (actual
time=2045567.930..2045567.930 rows=0 loops=1)
   One-Time Filter: (NOT $0)
   InitPlan
     ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
time=2045556.496..2045556.496 rows=0 loops=1)
           Filter: (kred.bank = 2)
           ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
time=2045556.492..2045556.492 rows=0 loops=1)
                 ->  Index Scan Backward using kredyty_datazaw on
kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
time=2045556.487..2045556.487 rows=
0 loops=1)
                       Filter: (telekredytid = 328650)
   ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..8.30
rows=1 width=4) (actual time=11.424..11.424 rows=0 loops=1)
         Index Cond: (id = 3064776)
 Total runtime: 2045568.420 ms
(11 rows)

Like you can see below - disks are very busy

# sar -d -p
21:36:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz
avgqu-sz     await     svctm     %util
21:38:01          sdd    219.58   3345.82    790.14     18.84
1.10      5.01      4.52     99.20

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system--
----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
id wa
 0  1   3976  93696  58452 14737524    1    1   455    84    0    0  8
1 90  2
 0  1   3976 106532  58384 14723812    0    0  1792     0  545  906  0
0 87 12
 0  1   3976 105452  58488 14725536    0    0  1708  2297  596  549  0
0 87 12
 0  1   3976 102924  58492 14727568    0    0  1996     0  554  566  0
0 87 12
 0  1   3976 102268  58492 14729028    0    0  1744     0  528  540  0
0 87 12
 0  1   3976  99828  58492 14730936    0    0  1624     0  507  492  0
0 87 12
 1  0   3976  98972  58492 14732688    0    0  1720     0  518  507  0
0 87 12
 0  1   3976  96756  58560 14734276    0    0  1636  2020  557  521  0
0 87 12


SCHEMA: this is big table (too big ;-) too wide ~250 columns so I've
trimmed schema - (old database without refactor :-( )
I hope this is enough?

                                                       Table
"public.kredyty"
                Column                 |            Type
|                          Modifiers

---------------------------------------+-----------------------------+--------------------------------------------------------------
 id                                    | integer                     |
not null default nextval(('kredyty_id_seq'::text)::regclass)
 linia                                 | integer                     |
default (-1)
 sklep                                 | integer                     |
default (-1)
 agent                                 | integer                     |
default (-1)
 przedst                               | integer                     |
default (-1)
 oddzial                               | integer                     |
default (-1)
 datazaw                               | date                        |
 datauruch                             | date                        |
 telekredytid                          | integer                     |
default (-1)
Indexes:
    "kredyty_pkey" PRIMARY KEY, btree (id) CLUSTER
    "kredyty_kredytagid_id_idx" UNIQUE, btree (kredytagid, id)
    "kredyty_datazaw" btree (datazaw)
    "kredyty_telekredytid_idx" btree (telekredytid)


--
Andrzej Zawadzki

Re: CLUSTER and a problem

От
Tom Lane
Дата:
Andrzej Zawadzki <zawadaa@wp.pl> writes:
> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
> WHERE TRUE
> AND kredytyag.id = 3064776
> AND NOT EXISTS
> (SELECT 1 FROM
> ( SELECT * FROM kredyty kr
> where telekredytid = 328650
> ORDER BY kr.datazaw DESC LIMIT 1 )
> kred where kred.bank = 2);

So this is the slow bit:

>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
> time=2045556.496..2045556.496 rows=0 loops=1)
>            Filter: (kred.bank = 2)
>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
> time=2045556.492..2045556.492 rows=0 loops=1)
>                  ->  Index Scan Backward using kredyty_datazaw on
> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
> time=2045556.487..2045556.487 rows=0 loops=1)
>                        Filter: (telekredytid = 328650)

It's doing a scan in descending datazaw order and hoping to find a row
that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
one, so the indexscan runs clear to the end before it can report that the
NOT EXISTS is true.  Unfortunately, you've more or less forced this
inefficient query plan by wrapping some of the search conditions inside a
LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
Or, if you do this type of query a lot, a special-purpose index might be
worthwhile.  It would probably be fast as-is if you had an index on
(telekredytid, datazaw) (in that order).

            regards, tom lane

Re: CLUSTER and a problem

От
Andrzej Zawadzki
Дата:
Tom Lane wrote:
> Andrzej Zawadzki <zawadaa@wp.pl> writes:
>
>> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
>> WHERE TRUE
>> AND kredytyag.id = 3064776
>> AND NOT EXISTS
>> (SELECT 1 FROM
>> ( SELECT * FROM kredyty kr
>> where telekredytid = 328650
>> ORDER BY kr.datazaw DESC LIMIT 1 )
>> kred where kred.bank = 2);
>>
>
> So this is the slow bit:
>
>
>>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
>> time=2045556.496..2045556.496 rows=0 loops=1)
>>            Filter: (kred.bank = 2)
>>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
>> time=2045556.492..2045556.492 rows=0 loops=1)
>>                  ->  Index Scan Backward using kredyty_datazaw on
>> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
>> time=2045556.487..2045556.487 rows=0 loops=1)
>>                        Filter: (telekredytid = 328650)
>>
>
> It's doing a scan in descending datazaw order and hoping to find a row
> that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
> one, so the indexscan runs clear to the end before it can report that the
> NOT EXISTS is true.  Unfortunately, you've more or less forced this
> inefficient query plan by wrapping some of the search conditions inside a
> LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
> Or, if you do this type of query a lot, a special-purpose index might be
> worthwhile.  It would probably be fast as-is if you had an index on
> (telekredytid, datazaw) (in that order).
>
That's no problem - we already has changed this query:
SELECT * FROM kredyty kr
            where kr.telekredytid = 328652
            and kr.bank = 2
            AND NOT EXISTS (SELECT * from kredyty k2 WHERE k2.bank<>2
and k2.creationdate > kr.creationdate)
Works good.

But in fact this wasn't my point.
My point was: why operation CLUSTER has such a big and bad impact on
planer for this query?
Like I sad: before CLUSTER query was run in xx milliseconds :-)

--
Andrzej Zawadzki

Re: CLUSTER and a problem

От
Andrzej Zawadzki
Дата:
Andrzej Zawadzki wrote:
> Tom Lane wrote:
>
>> Andrzej Zawadzki <zawadaa@wp.pl> writes:
>>
>>
>>> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
>>> WHERE TRUE
>>> AND kredytyag.id = 3064776
>>> AND NOT EXISTS
>>> (SELECT 1 FROM
>>> ( SELECT * FROM kredyty kr
>>> where telekredytid = 328650
>>> ORDER BY kr.datazaw DESC LIMIT 1 )
>>> kred where kred.bank = 2);
>>>
>>>
>> So this is the slow bit:
>>
>>
>>
>>>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
>>> time=2045556.496..2045556.496 rows=0 loops=1)
>>>            Filter: (kred.bank = 2)
>>>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
>>> time=2045556.492..2045556.492 rows=0 loops=1)
>>>                  ->  Index Scan Backward using kredyty_datazaw on
>>> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
>>> time=2045556.487..2045556.487 rows=0 loops=1)
>>>                        Filter: (telekredytid = 328650)
>>>
>>>
>> It's doing a scan in descending datazaw order and hoping to find a row
>> that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
>> one, so the indexscan runs clear to the end before it can report that the
>> NOT EXISTS is true.  Unfortunately, you've more or less forced this
>> inefficient query plan by wrapping some of the search conditions inside a
>> LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
>> Or, if you do this type of query a lot, a special-purpose index might be
>> worthwhile.  It would probably be fast as-is if you had an index on
>> (telekredytid, datazaw) (in that order).
>>
>>
> That's no problem - we already has changed this query:
> SELECT * FROM kredyty kr
>             where kr.telekredytid = 328652
>             and kr.bank = 2
>             AND NOT EXISTS (SELECT * from kredyty k2 WHERE k2.bank<>2
> and k2.creationdate > kr.creationdate)
> Works good.
>
> But in fact this wasn't my point.
> My point was: why operation CLUSTER has such a big and bad impact on
> planer for this query?
> Like I sad: before CLUSTER query was run in xx milliseconds :-)
>
>
Before CLUSTER was:

# EXPLAIN ANALYZE SELECT telekredytid FROM kredytyag
WHERE TRUE
AND kredytyag.id = 3064776
AND NOT EXISTS
    (
    SELECT 1 FROM
        (
            SELECT * FROM kredyty kr
            where telekredytid = 328652
            ORDER BY kr.datazaw DESC LIMIT 1
    )
    kred where kred.bank = 2)
;

QUERY
PLAN


------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=1317.25..1325.55 rows=1 width=4) (actual
time=0.235..0.235 rows=0 loops=1)
   One-Time Filter: (NOT $0)
   InitPlan
     ->  Subquery Scan kred  (cost=1317.24..1317.25 rows=1 width=0)
(actual time=0.188..0.188 rows=0 loops=1)
           Filter: (kred.bank = 2)
           ->  Limit  (cost=1317.24..1317.24 rows=1 width=4006) (actual
time=0.172..0.172 rows=0 loops=1)
                 ->  Sort  (cost=1317.24..1320.27 rows=1212 width=4006)
(actual time=0.069..0.069 rows=0 loops=1)
                       Sort Key: kr.datazaw
                       Sort Method:  quicksort  Memory: 25kB
                       ->  Index Scan using kredyty_telekredytid_idx on
kredyty kr  (cost=0.00..1311.18 rows=1212 width=4006) (actual
time=0.029..0.029 rows=0 loops=1)
                             Index Cond: (telekredytid = 328652)
   ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..8.29
rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
         Index Cond: (id = 3064776)
 Total runtime: 1.026 ms
(14 rows)

and that's clear for me.
Probably bad index for CLUSTER - Investigating ;-)

--
Andrzej Zawadzki

Re: CLUSTER and a problem

От
Andrzej Zawadzki
Дата:
Andrzej Zawadzki wrote:
> Tom Lane wrote:
>
>> Andrzej Zawadzki <zawadaa@wp.pl> writes:
>>
>>
>>> # EXPLAIN ANALYZE SElect telekredytid from kredytyag
>>> WHERE TRUE
>>> AND kredytyag.id = 3064776
>>> AND NOT EXISTS
>>> (SELECT 1 FROM
>>> ( SELECT * FROM kredyty kr
>>> where telekredytid = 328650
>>> ORDER BY kr.datazaw DESC LIMIT 1 )
>>> kred where kred.bank = 2);
>>>
>>>
>> So this is the slow bit:
>>
>>
>>
>>>      ->  Subquery Scan kred  (cost=0.00..778.06 rows=1 width=0) (actual
>>> time=2045556.496..2045556.496 rows=0 loops=1)
>>>            Filter: (kred.bank = 2)
>>>            ->  Limit  (cost=0.00..778.05 rows=1 width=3873) (actual
>>> time=2045556.492..2045556.492 rows=0 loops=1)
>>>                  ->  Index Scan Backward using kredyty_datazaw on
>>> kredyty kr  (cost=0.00..1088490.39 rows=1399 width=3873) (actual
>>> time=2045556.487..2045556.487 rows=0 loops=1)
>>>                        Filter: (telekredytid = 328650)
>>>
>>>
>> It's doing a scan in descending datazaw order and hoping to find a row
>> that has both telekredytid = 328650 and bank = 2.  Evidently there isn't
>> one, so the indexscan runs clear to the end before it can report that the
>> NOT EXISTS is true.  Unfortunately, you've more or less forced this
>> inefficient query plan by wrapping some of the search conditions inside a
>> LIMIT and some outside.  Try phrasing the NOT EXISTS query differently.
>> Or, if you do this type of query a lot, a special-purpose index might be
>> worthwhile.  It would probably be fast as-is if you had an index on
>> (telekredytid, datazaw) (in that order).
>>
>>
> That's no problem - we already has changed this query:
> SELECT * FROM kredyty kr
>             where kr.telekredytid = 328652
>             and kr.bank = 2
>             AND NOT EXISTS (SELECT * from kredyty k2 WHERE k2.bank<>2
> and k2.creationdate > kr.creationdate)
> Works good.
>
> But in fact this wasn't my point.
> My point was: why operation CLUSTER has such a big and bad impact on
> planer for this query?
> Like I sad: before CLUSTER query was run in xx milliseconds :-)
>
OK I've got it :-)
I've prepared test database (on fast disks - CLUSTER took 2h anyway ;-)

Step 1:
qstest=# CREATE UNIQUE INDEX kredyty_desc_pkey ON kredyty using btree
(id desc);
CREATE
INDEX
Step 2:
qstest=# CLUSTER kredyty USING kredyty_desc_pkey;
CLUSTER
Step 3:
qstest=# ANALYZE kredyty;
ANALYZE
Step 4:
qstest=# EXPLAIN ANALYZE SELECT telekredytid FROM kredytyag
WHERE TRUE
AND kredytyag.id = 3064776
AND NOT EXISTS
    (
    SELECT 1 FROM
        (
            SELECT * FROM kredyty kr
            where telekredytid = 328652
            ORDER BY kr.datazaw DESC LIMIT 1
    )
    kred where kred.bank = 2)
;

QUERY
PLAN

     


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                                        

 Result  (cost=833.09..841.38 rows=1 width=4) (actual
time=70.050..70.050 rows=0
loops=1)

   One-Time Filter: (NOT
$0)
                    


InitPlan
                                          

     ->  Subquery Scan kred  (cost=833.07..833.09 rows=1 width=0)
(actual time=48.223..48.223 rows=0
loops=1)
           Filter: (kred.bank =
2)
             

           ->  Limit  (cost=833.07..833.08 rows=1 width=3975) (actual
time=48.206..48.206 rows=0
loops=1)
                 ->  Sort  (cost=833.07..835.66 rows=1035 width=3975)
(actual time=48.190..48.190 rows=0
loops=1)
                       Sort Key: kr.datazaw
                       Sort Method:  quicksort  Memory: 25kB
                       ->  Index Scan using kredyty_telekredytid_idx on
kredyty kr  (cost=0.00..827.90 rows=1035 width=3975) (actual
time=48.163..48.163 rows=0 loops=1)
                             Index Cond: (telekredytid = 328652)
   ->  Index Scan using kredytyag_pkey on kredytyag  (cost=0.00..8.29
rows=1 width=4) (actual time=21.798..21.798 rows=0 loops=1)
         Index Cond: (id = 3064776)
 Total runtime: 70.550 ms
(14 rows)

qstest=#

So, I was close - bad index... DESCending is much better.
Thanks to Grzegorz Ja\skiewicz  hi has strengthened me in the conjecture.

I'm posting this - maybe someone will find something useful in that case.

ps. query was and is good :-)

--
Andrzej Zawadzki

Re: CLUSTER and a problem

От
Grzegorz Jaśkiewicz
Дата:
On Tue, Sep 15, 2009 at 9:10 PM, Andrzej Zawadzki <zawadaa@wp.pl> wrote:

> So, I was close - bad index... DESCending is much better.
> Thanks to Grzegorz Ja\skiewicz  hi has strengthened me in the conjecture.
>
> I'm posting this - maybe someone will find something useful in that case.
>
> ps. query was and is good :-)


Sure, This was talked about a lot on -hackers. The cost of 'back-walk'
index fetch is a lot.
So for anyone who thought it isn't back then, well - here's real life proof.



--
GJ