Re: BUG #14107: Major query planner bug regarding subqueries and indices

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: BUG #14107: Major query planner bug regarding subqueries and indices
Дата	11 мая 2016 г. 13:56:34
Msg-id	CAKJS1f_798-+aewz9GS=OSYqk2mxQTuT7_9pWedSJbDoRzZnVg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BUG #14107: Major query planner bug regarding subqueries and indices (Mathias Kunter <mathiaskunter@gmail.com>)
Ответы	Re: BUG #14107: Major query planner bug regarding subqueries and indices
Список	pgsql-bugs

Дерево обсуждения

On 11 May 2016 at 23:55, Mathias Kunter <mathiaskunter@gmail.com> wrote:
> Sorry for bumping this one more time, but I'd like to share some more
> real-life performance test results of using ANY(ARRAY(...)) instead of
> IN(...), hoping that you'd maybe still consider implementing such an
> optimization into the query planner. Since the test results indicate that
> the performance boost can really be massive on certain query types (factor
> 1000), I think that it'd really be worth the work.
>
>
> ===== Test setup =====
>
> The tables "mb.release" and "mb.release_group" both contain about 1.5
> million rows of real data, taken from the MusicBrainz database, and are of
> course properly indexed. All performance tests have been repeated a few
> times to be comparable.
>
> The test covers subqueries which return just a few rows and also subqueries
> which return more than 100000 rows. The queries test the performance of IN
> vs. ANY(ARRAY()) when used in different scenarios.
>
> For reference, the full query plans of all used queries are linked below.
>
>
> ===== Tested queries =====
>
> 1) SELECT id FROM mb.release WHERE release_group IN (SELECT id FROM
> mb.release_group WHERE name = 'Bear');
>
> 2) SELECT id FROM mb.release WHERE release_group IN (SELECT id FROM
> mb.release_group WHERE name < 'Bear');
>
> 3) SELECT id FROM mb.release WHERE name = 'Tiger' OR release_group IN
> (SELECT id FROM mb.release_group WHERE name = 'Bear');
>
> 4) SELECT id FROM mb.release WHERE name = 'Tiger' OR release_group IN
> (SELECT id FROM mb.release_group WHERE name < 'Bear');
>
> 5) SELECT id FROM mb.release WHERE name = 'Tiger' AND release_group IN
> (SELECT id FROM mb.release_group WHERE name = 'Bear');
>
> 6) SELECT id FROM mb.release WHERE name = 'Tiger' AND release_group IN
> (SELECT id FROM mb.release_group WHERE name < 'Bear');
>
>
> ===== Test results =====
>
> All numbers are given in milliseconds and show the total query time
> (planning + execution).
>
> -------------------------------------------
> | Query |    IN (...) | = ANY(ARRAY(...)) |
> -------------------------------------------
> |     1 |         0.7 |               0.4 |
> |     2 |      6001.1 |            2517.8 |
> |     3 |       711.3 |               0.5 |
> |     4 | > 1000000.0 |            1962.6 |
> |     5 |         0.8 |               0.5 |
> |     6 |         0.9 |             492.7 |
> -------------------------------------------
>
> Note: Query 4 using the IN operator has been canceled after running for more
> than 15 minutes.

How do you find the ANY(ARRAY(...)) version performs with say 10
million records in the array?

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #14107: Major query planner bug regarding subqueries and indices