Обсуждение: Function scan/Index scan to nested loop

Поиск
Список
Период
Сортировка

Function scan/Index scan to nested loop

От
"Carlo Stonebanks"
Дата:
Hello all,

A query ran twice in succession performs VERY poorly the first time as it
iterates through the nested loop. The second time, it rips. Please see SQL,
SLOW PLAN and FAST PLAN below.

I don't know why these nested loops are taking so long to execute.
"  ->  Nested Loop  (cost=0.00..42866.98 rows=77 width=18) (actual
time=126.354..26301.027 rows=9613 loops=1)"
"        ->  Nested Loop  (cost=0.00..42150.37 rows=122 width=18) (actual
time=117.369..15349.533 rows=13247 loops=1)"

The loop members appear to be finished quickly. I suspect that the results
for the function aren't really as fast as reported, and are actually taking
much longer to comeplete returning results.
"              ->  Function Scan on zips_in_mile_range  (cost=0.00..52.50
rows=67 width=40) (actual time=104.196..104.417 rows=155 loops=1)"
"                    Filter: (zip > ''::text)"

Is this possible? I can't see what other delay there could be.

The second time the query runs, the loops are fast:
"  ->  Nested Loop  (cost=0.00..42866.98 rows=77 width=18) (actual
time=97.073..266.826 rows=9613 loops=1)"
"        ->  Nested Loop  (cost=0.00..42150.37 rows=122 width=18) (actual
time=97.058..150.172 rows=13247 loops=1)"

Since it is fast the second time, I wonder if this is related at all to the
function being IMMUTABLE? (Even though it's IMMUTABLE it reads a very static
table)

This DB is a copy of another DB, on the same server host, same drive but
different tablespace. The original query has good performance, and is hit
often by the live web server. With the copy - which performs poorly - the
query is hit infrequently.

Is there any evidence for why the nested loop is slow?

Code and plans follow - regards and thanks!

Carlo

SQL:
select
      pp.provider_practice_id,
      p.provider_id,
      distance,
      pp.is_principal,
      p.provider_id as sort_order
   from mdx_core.provider as p
   join mdx_core.provider_practice as pp
   on pp.provider_id = p.provider_id
   join (select * from mdx_core.zips_in_mile_range('75203', 15::numeric)
where zip > '') as nearby
   on nearby.zip = substr(pp.default_postal_code, 1, 5)
   where
      pp.default_country_code = 'US'
      and p.provider_status_code = 'A' and p.is_visible = 'Y'
      and pp.is_principal = 'Y'
      and coalesce(pp.record_status, 'A') = 'A'
   order by sort_order, distance

SLOW PLAN:
"Sort  (cost=42869.40..42869.59 rows=77 width=18) (actual
time=26316.495..26322.102 rows=9613 loops=1)"
"  Sort Key: p.provider_id, zips_in_mile_range.distance"
"  Sort Method:  quicksort  Memory: 1136kB"
"  ->  Nested Loop  (cost=0.00..42866.98 rows=77 width=18) (actual
time=126.354..26301.027 rows=9613 loops=1)"
"        ->  Nested Loop  (cost=0.00..42150.37 rows=122 width=18) (actual
time=117.369..15349.533 rows=13247 loops=1)"
"              ->  Function Scan on zips_in_mile_range  (cost=0.00..52.50
rows=67 width=40) (actual time=104.196..104.417 rows=155 loops=1)"
"                    Filter: (zip > ''::text)"
"              ->  Index Scan using
provider_practice_default_base_zip_country_idx on provider_practice pp
(cost=0.00..628.30 rows=2 width=19) (actual time=1.205..98.231 rows=85
loops=155)"
"                    Index Cond: ((pp.default_country_code = 'US'::bpchar)
AND (substr((pp.default_postal_code)::text, 1, 5) = zips_in_mile_range.zip)
AND (pp.is_principal = 'Y'::bpchar))"
"                    Filter: (COALESCE(pp.record_status, 'A'::bpchar) =
'A'::bpchar)"
"        ->  Index Scan using provider_provider_id_provider_status_code_idx
on provider p  (cost=0.00..5.86 rows=1 width=4) (actual time=0.823..0.824
rows=1 loops=13247)"
"              Index Cond: ((p.provider_id = pp.provider_id) AND
(p.provider_status_code = 'A'::bpchar))"
"              Filter: (p.is_visible = 'Y'::bpchar)"
"Total runtime: 26327.329 ms"

FAST PLAN:
"Sort  (cost=42869.40..42869.59 rows=77 width=18) (actual
time=278.722..284.326 rows=9613 loops=1)"
"  Sort Key: p.provider_id, zips_in_mile_range.distance"
"  Sort Method:  quicksort  Memory: 1136kB"
"  ->  Nested Loop  (cost=0.00..42866.98 rows=77 width=18) (actual
time=97.073..266.826 rows=9613 loops=1)"
"        ->  Nested Loop  (cost=0.00..42150.37 rows=122 width=18) (actual
time=97.058..150.172 rows=13247 loops=1)"
"              ->  Function Scan on zips_in_mile_range  (cost=0.00..52.50
rows=67 width=40) (actual time=97.013..97.161 rows=155 loops=1)"
"                    Filter: (zip > ''::text)"
"              ->  Index Scan using
provider_practice_default_base_zip_country_idx on provider_practice pp
(cost=0.00..628.30 rows=2 width=19) (actual time=0.017..0.236 rows=85
loops=155)"
"                    Index Cond: ((pp.default_country_code = 'US'::bpchar)
AND (substr((pp.default_postal_code)::text, 1, 5) = zips_in_mile_range.zip)
AND (pp.is_principal = 'Y'::bpchar))"
"                    Filter: (COALESCE(pp.record_status, 'A'::bpchar) =
'A'::bpchar)"
"        ->  Index Scan using provider_provider_id_provider_status_code_idx
on provider p  (cost=0.00..5.86 rows=1 width=4) (actual time=0.006..0.007
rows=1 loops=13247)"
"              Index Cond: ((p.provider_id = pp.provider_id) AND
(p.provider_status_code = 'A'::bpchar))"
"              Filter: (p.is_visible = 'Y'::bpchar)"
"Total runtime: 289.582 ms"


Re: Function scan/Index scan to nested loop

От
Scott Marlowe
Дата:
On Mon, May 10, 2010 at 11:32 PM, Carlo Stonebanks
<stonec.register@sympatico.ca> wrote:
> Hello all,
>
> A query ran twice in succession performs VERY poorly the first time as it
> iterates through the nested loop. The second time, it rips. Please see SQL,
> SLOW PLAN and FAST PLAN below.

This is almost always due to caching.  First time the data aren't in
the cache, second time they are.

> I don't know why these nested loops are taking so long to execute.
> "  ->  Nested Loop  (cost=0.00..42866.98 rows=77 width=18) (actual
> time=126.354..26301.027 rows=9613 loops=1)"
> "        ->  Nested Loop  (cost=0.00..42150.37 rows=122 width=18) (actual
> time=117.369..15349.533 rows=13247 loops=1)"

Your row estimates are WAY off.  A nested loop might now be the best choice.

Also note that some platforms add a lot of time to some parts of an
explain analyze due to slow time function response.  Compare the run
time of the first run with and without explain analyze.

Re: Function scan/Index scan to nested loop

От
Craig Ringer
Дата:
On 11/05/10 13:32, Carlo Stonebanks wrote:
> Hello all,
>
> A query ran twice in succession performs VERY poorly the first time as
> it iterates through the nested loop. The second time, it rips. Please
> see SQL, SLOW PLAN and FAST PLAN below.

I haven't looked at the details, but the comment you made about it being
fast on the live server which hits this query frequently tends to
suggest that this is a caching issue.

Most likely, the first time Pg has to read the data from disk. The
second time, it's in memory-based disk cache or even in Pg's
shared_buffers, so it can be accessed vastly quicker.

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Re: Function scan/Index scan to nested loop

От
"Carlo Stonebanks"
Дата:
Thanks Scott,

>> This is almost always due to caching.  First time the data aren't in the
>> cache, second time they are.
<<

I had assumed that it was caching, but I don't know from where because of
the inexplicable delay. Hardware? O/S (Linux)? DB? From the function, which
is IMMUTABLE?

I am concerned that there is such a lag between all the index and function
scans start/complete times and and the nested loops starting. I have
reformatted the SLOW PLAN results below to make them easier to read. Can you
tell me if this makes any sense to you?

I can understand that EXPLAIN might inject some waste, but the delay being
shown here is equivalent to the delay in real query times - I don't think
EXPLAIN components would inject 15 second waits... would they?

>> Your row estimates are WAY off.  A nested loop might now be the best
>> choice.
<<

I tried to run this with set enable_nestloop to off and it built this truly
impressively complex plan! However, the cache had already spun up. The thing
that makes testing so difficult is that once the caches are loaded, you have
to flail around trying to find query parameters that DON'T hit the cache,
making debugging difficult.

The row estimates being off is a chronic problem with our DB. I don't think
the 3000 row ANALYZE is getting a proper sample set and would love to change
the strategy, even if at the expense of speed of execution of ANALYZE. I
don't know what it is about our setup that makes our PG servers so hard to
tune, but I think its time to call the cavalry (gotta find serious PG server
tuning experts in NJ).

Carlo


SLOW PLAN
Sort  (cost=42869.40..42869.59 rows=77 width=18) (actual
time=26316.495..26322.102 rows=9613 loops=1)
  Sort Key: p.provider_id, zips_in_mile_range.distance
  Sort Method:  quicksort  Memory: 1136kB
  ->  Nested Loop
     (cost=0.00..42866.98 rows=77 width=18)
     (actual time=126.354..26301.027 rows=9613 loops=1)
        ->  Nested Loop
            (cost=0.00..42150.37 rows=122 width=18)
            (actual time=117.369..15349.533 rows=13247 loops=1)
              ->  Function Scan on zips_in_mile_range
                  (cost=0.00..52.50 rows=67 width=40)
                  (actual time=104.196..104.417 rows=155 loops=1)
                  Filter: (zip > ''::text)
              ->  Index Scan using
provider_practice_default_base_zip_country_idx on provider_practice pp
                  (cost=0.00..628.30 rows=2 width=19)
                  (actual time=1.205..98.231 rows=85 loops=155)
                  Index Cond: ((pp.default_country_code = 'US'::bpchar)
                     AND (substr((pp.default_postal_code)::text, 1, 5) =
zips_in_mile_range.zip)
                     AND (pp.is_principal = 'Y'::bpchar))
                  Filter: (COALESCE(pp.record_status, 'A'::bpchar) =
'A'::bpchar)
        ->  Index Scan using provider_provider_id_provider_status_code_idx
on provider p
            (cost=0.00..5.86 rows=1 width=4)
            (actual time=0.823..0.824 rows=1 loops=13247)
            Index Cond: ((p.provider_id = pp.provider_id)
               AND (p.provider_status_code = 'A'::bpchar))
            Filter: (p.is_visible = 'Y'::bpchar)



Re: Function scan/Index scan to nested loop

От
Robert Haas
Дата:
On Tue, May 11, 2010 at 2:00 PM, Carlo Stonebanks
<stonec.register@sympatico.ca> wrote:
> I am concerned that there is such a lag between all the index and function
> scans start/complete times and and the nested loops starting. I have
> reformatted the SLOW PLAN results below to make them easier to read. Can you
> tell me if this makes any sense to you?

I think you want to run EXPLAIN ANALYZE on the queries that are being
executed BY mdx_core.zips_in_mile_range('75203', 15::numeric) rather
than the query that calls that function.  You should be able to see
the same caching effect there and looking at that plan might give you
a better idea what is really happening.

(Note that you might need to use PREPARE and EXPLAIN EXECUTE to get
the same plan the function is generating internally, rather than just
EXPLAIN.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company