Re: [GENERAL] Yet another btree gotcha

Поиск
Список
Период
Сортировка
От Mike Mascari
Тема Re: [GENERAL] Yet another btree gotcha
Дата
Msg-id 19991006071831.4990.rocketmail@web2102.mail.yahoo.com
обсуждение исходный текст
Ответы Re: [GENERAL] Yet another btree gotcha  ("Gene Selkov, Jr." <selkovjr@mcs.anl.gov>)
Список pgsql-general
I realize that this is somewhat off topic, but I
have seen far better improvments in query speeds
when using a multifield index instead of individual
ones. So, for the given select below:

SELECT km.km, su.su
FROM km, su
WHERE km.id = su.id
AND km.rel = su.rel
AND km.item = su.item
AND su ~ '^ethanol';

multifield indexes should be created:

CREATE INDEX k_km on km(id, rel, item);
CREATE INDEX k_su on su(id, rel, item);

and the query should be rewritten as:

SELECT km.km, su.su
FROM km, su
WHERE
(km.id,km.rel,km.item) = (su.id,su.rel,su.item)
AND su ~ '^ethanol';

Perhaps the distribution of your data is more
"normal" using multikey indexes (I hope).  Note
that if you have a multifield index, and you still
submit your original query, the planner/optimizer
doesn't appear smart enough to use the multikey
index except for the first join condition, so to
realize all of the benefits, you have to use the

WHERE
(t1.field1,t1.field2) = (t2.field1,t2.field2)

construction instead of:

WHERE
t1.field1 = t2.field1 AND
t1.field2 = t2.field2

On another note, looking through the backend
sources reveals that, at one point, partial indices
were once allowed:

CREATE INDEX k_km on km(item) WHERE item <> 1;

but the grammer has been removed and is now
illegal (although the code to handle it still
exists). Perhaps someday this feature will be
restored.

Hope the above helps some,

Mike Mascari
(mascarim@yahoo.com)

--- "Gene Selkov, Jr." <selkovjr@mcs.anl.gov> wrote:
> I am wondering whether it's normal to see dropping a
> btree improve the
> query which could never complete enough that it
> completes in a blink?
>
> I realize the data I have here represent the worst
> possible input to
> btree, and I'm probably better off without any index
> at all, but I
> guess it is something that the optimizer or the
> access method itself
> should be able to decide.
>
> I am joining two tables, "km" and "su" on an int2
> attribute, "item".
> Just take a look at the value histograms for item:
>
> emp=> SELECT item, count (*) AS count FROM km GROUP
> BY item;
> item|count
> ----+-----
>    1|31262
>    2|  110
>    3|    3
>    4|    1
> (4 rows)
>
> emp=> SELECT item, count (*) AS count FROM su GROUP
> BY item;
> item|count
> ----+-----
>    1|94108
>    2| 1697
>    3|  773
>    4|  482
>    5|  237
>    6|  146
>    7|  105
>    8|   68
>    9|   41
>   10|   29
>   11|   22
>   12|   15
>   13|   13
>   14|    8
>   15|    7
>   16|    6
>   17|    5
>   18|    4
>   19|    4
>   20|    4
>   21|    4
>   22|    3
>   23|    3
>   24|    3
>   25|    1
>   26|    1
>   27|    1
>   28|    1
>   29|    1
>   30|    1
>   31|    1
>
> As a default rule, I used to create the btree
> indices for all integer
> types, regardless of their values. Not anymore. It
> took me quite a
> while to figure that the following query did not
> work because of the
> faulty btree index on "item" (other joined
> attributes are char()):
>
> SELECT km.km, su.su
>   FROM km, su
>   WHERE km.id = su.id
>     AND km.rel = su.rel
>     AND km.item = su.item
>     AND su ~ '^ethanol';
>
> Can the btree or any other AM be smart enough and
> bail out from CREATE
> INDEX saying, "your data isn't worth indexing"?
>
> --Gene


=====

__________________________________________________
Do You Yahoo!?
Bid and sell for free at http://auctions.yahoo.com

В списке pgsql-general по дате отправления:

Предыдущее
От: "Gene Selkov, Jr."
Дата:
Сообщение: Yet another btree gotcha
Следующее
От: Mike Mascari
Дата:
Сообщение: Re: [GENERAL] You are really hosed.