Re: I'd like to learn a bit more about how indexes work

Поиск
Список
Период
Сортировка
От Dann Corbit
Тема Re: I'd like to learn a bit more about how indexes work
Дата
Msg-id 87F42982BF2B434F831FCEF4C45FC33E5074CB14@EXCHANGE.corporate.connx.com
обсуждение исходный текст
Ответ на Re: I'd like to learn a bit more about how indexes work  (Mike Christensen <mike@kitchenpc.com>)
Список pgsql-general
-----Original Message-----
From: Mike Christensen [mailto:mike@kitchenpc.com]
Sent: Tuesday, June 05, 2012 4:28 PM
To: Dann Corbit
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] I'd like to learn a bit more about how indexes work

I'm aware of how "B-Trees" work, but I only understand them on the level of traversing a single tree at a time.  I'm
curioushow Postgres combines multiple indexes with OR and AND clauses. 

I've done some Google searches, however I can't find anything basic.
Everything I've found assumes you already have knowledge of terms such as "hash join, aggregate join, etc".

At this point I'm not looking at learning how the optimizer works..
>>
"How the optimizer works" is the answer to your question.
The plan of attack for forming a query is a function of the optimizer.

One possible plan  for " WHERE data = key1 OR data = key2 " is something along the lines of:
SEEK("key1")
While key == key1 accumulate rows into the result set
   GetNextRow()
SEEK("key2")
While key == key2 accumulate rows into the result set
   GetNextRow()

However, if the table is tiny (suppose it is ten rows and fits into memory) then a table scan might be cheaper.

Here at CONNX, I have written a hashed btree search that tends to be cheaper than using a clustered index if there are
noqualifiers on the join. 
For instance
SELECT a.*, b.* from table1 a, table2 b WHERE a.unique_index = b.foreign_key

It will be faster to actually not use the index.  Whereas if there are additional where clause criteria such as:
SELECT a.*, b.* from table1 a, table2 b WHERE a.unique_index = b.foreign_key AND a.unique_index IN (k1, k2, k3, k4,...,
kn,kn+1) 
It will probably be faster to use the index unless the list of items is a substantial proportion of the possible data
values.

The point is that there is not a simple formula that describes how data values are retrieved from the database.  The
methodof collection is decided by the optimizer. 

It isn't always cheaper to use an index.  In fact, sometimes building an index is a complete waste of time.
For instance, suppose that you have a column named 'sex' that can contain the values 'F', 'M', and 'U'
If you built an index on that column it won't help you to find all the males faster than a table scan because the data
isnot specific enough so that the total number of pages of disk that are read would be more with the index than if the
indexwere not used. 
So, I suggest that possibly the articles you do not want to read are the very ones that will answer your questions.

On the other hand, it is not unlikely that I simply do not understand the questions that you are asking.
<<

В списке pgsql-general по дате отправления:

Предыдущее
От: Rich Shepard
Дата:
Сообщение: Re: Populate Table From Two Other Tables
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: Populate Table From Two Other Tables