Re: Index scan optimization

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Index scan optimization
Дата
Msg-id 541FCA2D.5090508@vmware.com
обсуждение исходный текст
Ответ на Index scan optimization  (Rajeev rastogi <rajeev.rastogi@huawei.com>)
Ответы Re: Index scan optimization  (Rajeev rastogi <rajeev.rastogi@huawei.com>)
Re: Index scan optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
> I have observed a scope of considerable performance improvement in-case of index by a very minor code change.
> Consider the below schema:
>
> create table tbl2(id1 int, id2 varchar(10), id3 int);
> create index idx2 on tbl2(id2, id3);
>
> Query as:
>                  select count(*) from tbl2 where id2>'a' and id3>990000;
>
> As per current design, it takes following steps to retrieve index tuples:
>
> 1.       Find the scan start position by searching first position in BTree as per the first key condition i.e. as per
id2>'a'
>
> 2.       Then it fetches each tuples from position found in step-1.
>
> 3.       For each tuple, it matches all scan key condition, in our example it matches both scan key condition.
>
> 4.       If condition match, it returns the tuple otherwise scan stops.
>
> Now problem is here that already first scan key condition is matched to find the scan start position (Step-1), so it
isobvious that any further tuple also will match the first scan key condition (as records are sorted).
 
> So comparison on first scan key condition again in step-3 seems to be redundant.
>
> So my proposal is to skip the condition check on the first scan key condition for every tuple.

The same happens in a single-column case. If you have a query like 
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start 
position of the scan, you know that all the rows that follow match too.

> I would like to submit the patch for this improvement.
> Please provide your feedback. Also let me know if I am missing something.

Yeah, sounds like a good idea. This scenario might not arise very often, 
but it should be cheap to check, so I doubt it will add any measurable 
overhead to the cases where the optimization doesn't help.

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Scaling shared buffer eviction
Следующее
От: Dmitriy Igrishin
Дата:
Сообщение: Re: libpq connection status and closed fd