queries with lots of UNIONed relations

Поиск

Список

Период

Сортировка

От	Jon Nelson
Тема	queries with lots of UNIONed relations
Дата	13 января 2011 г. 16:57:12
Msg-id	AANLkTinVjco+Q=9C3Fd8vFxmYY=LiUKFFg4rm=e2yFoX@mail.gmail.com обсуждение исходный текст
Список	pgsql-performance

Дерево обсуждения

I was recently asked to look into why one particular set of queries
was taking a long time.
The queries are all of the same form.  They select the UNION of a few
columns from around 100 tables.
The query in particular was taking some 7-8 minutes to run.
On a whim, I changed the query from this form:

SELECT a, b FROM FOO_a WHERE <conditions>
UNION
SELECT a,b FROM FOO_b WHERE <conditions>
....

to:

SELECT DISTINCT a,b FROM FOO_a WHERE <conditions>
UNION
SELECT DISTINCT a,b FROM FOO_b WHERE <conditions>

and the query time dropped to under a minute.

In the former case, the query plan was a bitmap heap scan for each
table. Then those results were Appended, Sorted, Uniqued, Sorted
again, and then returned.

In the latter, before Appending, each table's results were run through
HashAggregate.

The total number of result rows is in the 500K range. Each table holds
approximately 150K matching rows (but this can vary a bit).

What I'm asking is this: since adding DISTINCT to each participating
member of the UNION query reduced the total number of appended rows,
is there some sort of heuristic that postgresql could use to do this
automatically?  The 12x speedup was quite nice.


--
Jon

В списке pgsql-performance по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

queries with lots of UNIONed relations