Обсуждение: Multiple indexes or multi-column index?

Поиск
Список
Период
Сортировка

Multiple indexes or multi-column index?

От
Jeff Boes
Дата:
Given a large (100,000+ rows) table such as:

CREATE TABLE foo (
  a INTEGER,
  b INTEGER,
  c TIMESTAMP,
  d TEXT
  /* possibly several more columns in here */
);

What are the pros and cons of these two different approaches to indexes?

CREATE huge_index ON foo (a,b,c,d);

vs.

CREATE a_index ON foo (a);
CREATE b_index ON foo (b);
CREATE c_index ON foo (c);
CREATE d_index ON foo (d);

--
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
           ...Nexcerpt... Extend your Expertise

Re: Multiple indexes or multi-column index?

От
Oleg Bartunov
Дата:
On Tue, 13 Aug 2002, Jeff Boes wrote:

> Given a large (100,000+ rows) table such as:
>
> CREATE TABLE foo (
>   a INTEGER,
>   b INTEGER,
>   c TIMESTAMP,
>   d TEXT
>   /* possibly several more columns in here */
> );
>
> What are the pros and cons of these two different approaches to indexes?
>
> CREATE huge_index ON foo (a,b,c,d);
>
> vs.
>
> CREATE a_index ON foo (a);
> CREATE b_index ON foo (b);
> CREATE c_index ON foo (c);
> CREATE d_index ON foo (d);

(a,b,c,d) *would be* probably fast for queries(a,b,c,d),
but note it will not work for

queries(NULL,b,c,d)
queries(NULL,NULL,c,d)
queries(NULL,NULL,NULL,d)

and should still works for

queries(a,b,c,NULL)
queries(a,b,MULL,NULL)
queries(a,NULL,MULL,NULL)

>
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: Multiple indexes or multi-column index?

От
"Marc Mitchell"
Дата:
As has already been said, in order to use the index, the left-most index
columns must be supplied as part of your qualification.  But, beyond the
basics of when an index can or cannot be used, I would say you should
consider three additional issues: the distribution of values for any of
these columns, the effects of joining from index to base table, and the
volatility of the columns being indexed.

For example, if foo.a has potentially 100,000 different values then there
would seem to be little value in indexing the remaining three columns since
with a value for foo.a alone, you'll quickly narrow your request to a
handful of rows.  Extra columns just make the index entries themselves
wider.  The one benefit is that, by including columns within an index, even
if not used as part of the index access strategy, can eliminate the need to
join back to the base table itself if all the columns requested are within
the index itself.  HOWEVER: while I know this is the case with other RDBMS
products, I can't verify this within Postgres.  If I do a

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

it runs the exact EXPLAIN results (aside from result with) as:

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

so I'm not sure if the join from index to base table is being avoided or
not in the former case.

Furthermore, balance the ordering of columns between what is required for
index applicability based on the queries you run AND the distribution of
values for the columns you are indexing.  If, for example, all queries will
supply at least foo.a and foo.b, whichever of these two columns has the
fewer rows per given value should be listed first.

You should also consider the volatility of the columns you are including
within an index.  While the effect on inserts is basically the same (aside
from the width issue), adding extra columns to an index that are updated a
lot is not as desirable as extra columns that don't change.

Hope this is useful.

Marc Mitchell - Senior Application Architect
Enterprise Information Solutions, Inc.
Downers Grove, IL 60515
marcm@eisolution.com


----- Original Message -----
From: "Jeff Boes" <jboes@nexcerpt.com>
To: <pgsql-admin@postgresql.org>
Sent: Tuesday, August 13, 2002 3:14 PM
Subject: [ADMIN] Multiple indexes or multi-column index?


> Given a large (100,000+ rows) table such as:
>
> CREATE TABLE foo (
>   a INTEGER,
>   b INTEGER,
>   c TIMESTAMP,
>   d TEXT
>   /* possibly several more columns in here */
> );
>
> What are the pros and cons of these two different approaches to indexes?
>
> CREATE huge_index ON foo (a,b,c,d);
>
> vs.
>
> CREATE a_index ON foo (a);
> CREATE b_index ON foo (b);
> CREATE c_index ON foo (c);
> CREATE d_index ON foo (d);
>
> --
> Jeff Boes                                      vox 269.226.9550 ext 24
> Database Engineer                                     fax 269.349.9076
> Nexcerpt, Inc.                                 http://www.nexcerpt.com
>            ...Nexcerpt... Extend your Expertise
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org