queries with lots of UNIONed relations

Поиск
Список
Период
Сортировка
От Jon Nelson
Тема queries with lots of UNIONed relations
Дата
Msg-id AANLkTinVjco+Q=9C3Fd8vFxmYY=LiUKFFg4rm=e2yFoX@mail.gmail.com
обсуждение исходный текст
Список pgsql-performance
I was recently asked to look into why one particular set of queries
was taking a long time.
The queries are all of the same form.  They select the UNION of a few
columns from around 100 tables.
The query in particular was taking some 7-8 minutes to run.
On a whim, I changed the query from this form:

SELECT a, b FROM FOO_a WHERE <conditions>
UNION
SELECT a,b FROM FOO_b WHERE <conditions>
....

to:

SELECT DISTINCT a,b FROM FOO_a WHERE <conditions>
UNION
SELECT DISTINCT a,b FROM FOO_b WHERE <conditions>

and the query time dropped to under a minute.

In the former case, the query plan was a bitmap heap scan for each
table. Then those results were Appended, Sorted, Uniqued, Sorted
again, and then returned.

In the latter, before Appending, each table's results were run through
HashAggregate.

The total number of result rows is in the 500K range. Each table holds
approximately 150K matching rows (but this can vary a bit).

What I'm asking is this: since adding DISTINCT to each participating
member of the UNION query reduced the total number of appended rows,
is there some sort of heuristic that postgresql could use to do this
automatically?  The 12x speedup was quite nice.


--
Jon

В списке pgsql-performance по дате отправления:

Предыдущее
От: Jon Nelson
Дата:
Сообщение: Re: queries with lots of UNIONed relations
Следующее
От: kakarukeys
Дата:
Сообщение: Re: adding foreign key constraint locks up table