Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)
Дата
Msg-id 5524.1342201257@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)  (Eoghan Murray <eoghan@qatano.com>)
Список pgsql-performance
Eoghan Murray <eoghan@qatano.com> writes:
> I'm upgrading from 8.4 to 9.1 and experiencing a performance degradation on
> a key query with 2 views and 2 tables.

I think the core of the problem is the lousy rowcount estimate for the
result of the edited_stop_2 view: when you've got 1 row estimated and
almost 10000 rows actual, it's almost guaranteed that the rest of the
plan is going to be bad.  It's pure luck that 8.4 chooses a plan that
fails to suck, because it's optimizing for the wrong case.  9.1 isn't
so lucky, but that doesn't make 9.1 broken, just less lucky.

I'm not terribly disappointed that that rowcount estimate is bad,
because this seems like a rather weird and inefficient way to do "get
the rows with the maximal "updated" values".  I'd suggest experimenting
with some other definitions for edited_stop_2, such as using a subquery:

 SELECT ...
   FROM stop o
   WHERE updated = (select max(updated) from stop i
                    where o.node_id = i.node_id and ...);

This might be reasonably efficient given your pkey index for "stop".

Or if you don't mind using a Postgres-ism, you could try DISTINCT ON:

 SELECT DISTINCT ON (node_id, org_id, edge_id, stop_pos) ...
   FROM stop
   ORDER BY node_id DESC, org_id DESC, edge_id DESC, stop_pos DESC, updated DESC;

See the "weather reports" example in our SELECT reference page for
some explanation of how that works.  Again, the ORDER BY is chosen
to match your pkey index; I'm not sure that the planner will think
a full-index scan beats a seqscan-and-sort, but you may as well
give it the option.

Of these, I would bet that the first will work better if your typical
usage is such that only a few rows need to be fetched from the view.
I believe the DISTINCT ON is likely to act as an optimization fence
forcing the whole view to be evaluated when using the second definition.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: Claudio Freire
Дата:
Сообщение: Re: Poor performance problem with Materialize, 8.4 -> 9.1 (enable_material)
Следующее
От: Sergey Konoplev
Дата:
Сообщение: Re: Any tool/script available which can be used to measure scalability of an application's database.