Re: Troubleshooting query performance issues - Resolved (sort of)

Поиск
Список
Период
Сортировка
От Jim Garrison
Тема Re: Troubleshooting query performance issues - Resolved (sort of)
Дата
Msg-id 0C723FEB5B4E5642B25B451BA57E27303EE06E44@S1P5DAG3C.EXCHPROD.USA.NET
обсуждение исходный текст
Список pgsql-performance
We have traced this to the *addition* of a two-column index.

The two tables in question both have single-column indexes on two foreign keys, say columns A and B.  The query joins
thetwo large tables on A and B.   

With only the two indexes, the query plan does a bitmap AND on the index scan results and performance is stable.

I added an index on (A,B), and this caused the planner to use the new index, but I was never able to get the query to
complete. In one instance I let it run 18 hours.   

The onlly difference was the addition of the index

Summary:

- With index on (A,B) -- query time is "infinite"

- Without index on (A,B), relying on individual indexes and bitmap AND -- query time is about 4 minutes (as expected
giventhe data volume) 

Does this sound like a bug in the query planner?

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-
> performance-owner@postgresql.org] On Behalf Of Jim Garrison
> Sent: Wednesday, September 25, 2013 8:58 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Troubleshooting query performance issues
>
> I spent about a week optimizing a query in our performance-testing
> environment, which has hardware similar to production.
>
> I was able to refactor the query and reduce the runtime from hours to about
> 40 seconds, through the use of CTEs and a couple of new indexes.
>
> The database was rebuilt and refreshed with the very similar data from
> production, but now the query takes hours again.
>
> In the query plan, it is clear that the row count estimates are WAY too low,
> even though the statistics are up to date.  Here's a sample query plan:
>
> CTE Scan on stef  (cost=164.98..165.00 rows=1 width=38)
>   CTE terms
>     ->  Nested Loop  (cost=0.00..62.40 rows=1 width=12)
>           ->  Index Scan using term_idx1 on term t  (cost=0.00..52.35 rows=1
> width=12)
>                 Index Cond: (partner_id = 497)
>                 Filter: (recalculate_district_averages_yn AND (NOT is_deleted_yn))
>           ->  Index Scan using growth_measurement_window_fk1 on
> growth_measurement_window gw  (cost=0.00..10.04 rows=1 width=4)
>                 Index Cond: (term_id = t.term_id)
>                 Filter: (test_window_complete_yn AND (NOT is_deleted_yn) AND
> ((growth_window_type)::text = 'DISTRICT'::text))
>   CTE stef
>     ->  Nested Loop  (cost=0.00..102.58 rows=1 width=29)
>           Join Filter: ((ssef.student_id = terf.student_id) AND (ssef.grade_id =
> terf.grade_id))
>           ->  Nested Loop  (cost=0.00..18.80 rows=3 width=28)
>                 ->  CTE Scan on terms t  (cost=0.00..0.02 rows=1 width=8)
>                 ->  Index Scan using student_school_enrollment_fact_idx2 on
> student_school_enrollment_fact ssef  (cost=0.00..18.74 rows=3 width=20)
>                       Index Cond: ((partner_id = t.partner_id) AND (term_id =
> t.term_id))
>                       Filter: primary_yn
>           ->  Index Scan using test_event_result_fact_idx3 on
> test_event_result_fact terf  (cost=0.00..27.85 rows=4 width=25)
>                 Index Cond: ((partner_id = t.partner_id) AND (term_id = t.term_id))
>                 Filter: growth_event_yn
>
> The estimates in the first CTE are correct, but in the second, the scan on
> student_school_enrollment_fact will return about 1.5 million rows, and the
> scan on test_event_result_fact actually returns about 1.1 million.  The top
> level join should return about 900K rows.  I believe the fundamental issue is
> that the CTE stef outer nested loop should be a merge join instead, but I
> cannot figure out why the optimizer is estimating one row when it has the
> statistics to correctly estimate the count.
>
> What would cause PG to so badly estimate the row counts?
>
> I've already regenerated the indexes and re-analyzed the tables involved.
>
> What else can I do to find out why it's running so slowly?
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


В списке pgsql-performance по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: Cpu usage 100% on slave. s_lock problem.
Следующее
От: Jim Garrison
Дата:
Сообщение: Re: Troubleshooting query performance issues - resolved (sort of)