Re: [BUGS] Query planner skipping index depending on DISTINCTparameter order (2)

Поиск
Список
Период
Сортировка
От Dilyan Palauzov
Тема Re: [BUGS] Query planner skipping index depending on DISTINCTparameter order (2)
Дата
Msg-id a68737c6-601f-6846-2ba0-3795c942c81f@aegee.org
обсуждение исходный текст
Ответ на Re: [BUGS] Query planner skipping index depending on DISTINCTparameter order (2)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: [BUGS] Query planner skipping index depending on DISTINCTparameter order (2)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-bugs
Hello Tomas,

thanks for your answer.

While adding an ORDER BY uses the index, it does not help, as the user 
has to take care of the order of provided columns.  Whether the care is 
is taken in DISTINCT ON or in ORDER BY is secondary.

The behaviour behind DISTINCT and indexes surprises  me, as the query 
planner does reorder the columns for SELECT to determine the most 
suitable index.

My proposal to reflect this:

diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1069,7 +1069,10 @@ SELECT DISTINCT ON (location) location, time, report    <para>     Currently, <literal>FOR NO
KEYUPDATE</>, <literal>FOR UPDATE</>,     <literal>FOR SHARE</> and <literal>FOR KEY SHARE</> cannot be
 
-    specified with <literal>DISTINCT</literal>.
+    specified with <literal>DISTINCT</literal>.  Contrary to
+    <literal>SELECT</>, which reorders its parameters to find a best,
+    matching index, DISTINCT ON constructs an expression, e.g. from the
+    provided rows, and checks then if an index can serve the expression.    </para>   </refsect2>


Please ignore my email from Friday, I have forgotten an ON after DISTINCT.


Greetings  Dilyan


On 09/16/2017 10:23 AM, Tomas Vondra wrote:
> Hello Dilyan,
>
> You're right - we're currently not able to use the index if it's not
> consistent with the DISTINCT ordering. That is, if you have index on
> (a,b) and DISTINCT ON (b,a) we fail to leverage the index.
>
> The reason for this simple - if you look at create_distinct_paths [1],
> which is where the Unique path comes from, you'll see it iterates over
> all paths and compares the ordering using pathkeys_is_contained [2].
>
> That however only ensures the path matches the expected Unique ordering
> (determined by the column list in DISTINCT ON clause), we don't try to
> re-shuffle the columns in any way at this point.
>
> So this is more a missing optimization than a bug, I'd guess. But it
> seems worthwhile and possibly not extremely difficult to implement, so I
> may look into it - but that's PG11 at the earliest.
>
> But, looking at the code in create_distinct_paths, ISTM you can easily
> convince the planner to use the index by simply adding a matching ORDER
> BY clause. That is
>
>     SELECT DISTINCT ON(token, id) token  FROM bayes_token
>     ORDER BY id, token;
>
> should be able to use the index on (id,token).
>
>
> [1]
> https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L4725
>
> [2]
> https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L4811
>
>
> regards
>


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: [BUGS] BUG #14798: postgres user superuser changed
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [BUGS] Query planner skipping index depending on DISTINCTparameter order (2)