Re: [Fwd: Index Advisor]

Поиск
Список
Период
Сортировка
От Gurjeet Singh
Тема Re: [Fwd: Index Advisor]
Дата
Msg-id 65937bea0611192135q2f9e8a74he5b84b5b0880f5be@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [Fwd: Index Advisor]  (Kai-Uwe Sattler <kus@tu-ilmenau.de>)
Ответы Re: [Fwd: Index Advisor]  ("Gurjeet Singh" <singh.gurjeet@gmail.com>)
Re: [Fwd: Index Advisor]  (Kai-Uwe Sattler <kus@tu-ilmenau.de>)
Список pgsql-hackers


On 11/19/06, Kai-Uwe Sattler <kus@tu-ilmenau.de> wrote:
Hi,
> .) The SELECTs in the pg_advise are returning wrong results, when
> the same index is suggested twice, because of the SUM() aggregates.
I don't think that this is a bug. If the same index is recommended
for two different queries it will appear  two times in
pg_indexadvisor. So, if you want to calculate the overall benefit of
this index, then you have to sum up the local benefits for each query.

If this is intended behaviour, then its okay.

> .) I doubt that on a table t(a,b), for a suggestion of idx(b,a),
> pg_advise will
>     suggest idx(a,b);
?? Not sure, if I understand you right. idx(b,a) and idx(a,b) are
completely different indexes. Why should pg_advise suggest idx(a,b).

I am referring to the way get_column_names() is coded. First, the SQL for the portal does not guarantee any order of the result; secondly, the 'for' loops that follow, will always output the columns in their increasing order of attribute number. Here's a small way to reproduce the bug, that I cooked up just now:

Change the SQL in read_advisor_output() to:

  res = PQexec(conn, "DECLARE myportal CURSOR FOR "
            "SELECT    relname,"
                    "int2vector_to_string(index_attrs) AS colids,"
                    "MAX(index_pages) AS size_in_pages,"
                    "SUM(profit) AS benefit,"
                    "SUM(profit)/MAX(index_pages) AS gain "
            "FROM    pg_indexadvisor,"
                    "pg_class "
            "WHERE    backend_pid = 0 "
            "AND        rel_oid = pg_class.oid "
            "GROUP BY    relname, colids "
            "ORDER BY gain DESC");

Notice the backend_pid = 0. Now insert the following into pg_indexadvisor:

insert into pg_indexadvisor values( 1259, '2 1', 2, 1000, 20,0,0 );

This should prompt the advisor to generate the statement:

create index advidx_1 on pg_class ( relnamespace,relname);

But instead, it will output this:

create index advidx_1 on pg_class ( relname,relnamespace );

Now run the advisor with any workload, and inspect the output.

$ pg_advise.exe -d postgres -h localhost -p 5432 -U gsk -o create_index.sql workload.sql

We should tokenize the list of attribute numbers (column_ids variable) in get_column_names() and query them individually.


But there is another bug: if there are recommendations like idx
(a,b,c), idx(a,b) and idx(a) it would be a good idea to create just
idx(a). I will add this to pg_advise as an optional feature.

I'd say it's a new feature request and not a bug :) But I don't understand why would you want to not build idx(a,b,c) in such a situation? idx(a,b,c) can be useful in places where idx(a,b) or idx(a) is required, but the same can't be said about idx(a) or idx(a,b) being useful where idx(a,b,c) is required!

Best regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: [GENERAL] Allowing SYSDATE to Work
Следующее
От: Michael Fuhr
Дата:
Сообщение: Statistics visibility in SERIALIZABLE transactions