Re: optimizing a query

Поиск
Список
Период
Сортировка
От Jonathan Vanasco
Тема Re: optimizing a query
Дата
Msg-id D6AF3833-B022-4DE6-B1C1-E9290E639A9F@2xlp.com
обсуждение исходный текст
Ответ на Re: optimizing a query  (Erik Gustafson <gustafson.erik@gmail.com>)
Ответы Re: optimizing a query  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: optimizing a query  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
On Jun 22, 2016, at 4:25 AM, Erik Gustafson wrote:

> don't you want an index on t_a2b.col_a, maybe partial where col_a=1 ?

that table has indexes on all columns.  they're never referenced because the rows are so short.  this was just an
examplequery too, col_a has 200k variations  

After a lot of testing, I think I found a not-bug but possible area-for-improvement in the planner when joining against
atable for filtering (using my production 9.5.2 box) 

I checked a query against multiple possible indexes using the related columns.  only one of indexes was on the table
foreach series of tests, and I analyzed the table after the drop/create of indexes. 


Note 1: The only time an index-only scan is used, is on this form:

        CREATE INDEX idx_partial_fkey_id_partial ON table_a(fkey_1, id, col_partial) WHERE col_partial IS NOT FALSE;

    Omitting the col_partial from being indexed will trigger a Bitmap Heap Scan on the full table with a recheck
condition:

        CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;

    This shouldn't be necessary.  the planner knew that `col_partial` fulfilled the WHERE clause when it used the
index,but scanned the table to check it anyways. 

    On most tables the heap scan was negligible, but on a few larger tables it accounted a 20% increase in execution.

Note 2:

    This is odd, but this index is used by the planner:
        CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;

    but this index is never used:
        CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) WHERE col_partial IS NOT FALSE;

    I honestly don't know why the second index would not be used.  The query time doubled without it when run on a
tablewith 6million rows and about 20 columns. 

-------

The indexes I tested on:

    CREATE INDEX idx_fkey_1 ON table_a(fkey_1);
    CREATE INDEX idx_partial_fkey ON table_a(fkey_1) WHERE col_partial IS NOT FALSE;
    CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;
    CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) WHERE col_partial IS NOT FALSE;
    CREATE INDEX idx_partial_fkey_partial ON table_a(fkey_1, col_partial) WHERE col_partial IS NOT FALSE;
    CREATE INDEX idx_partial_fkey_id_partial ON table_a(fkey_1, id, col_partial) WHERE col_partial IS NOT FALSE;



В списке pgsql-general по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: Protect a table against concurrent data changes while allowing to vacuum it
Следующее
От: Michelle Schwan
Дата:
Сообщение: Postgres 9.5.2 upgrade to 9.6