Re: A thought on Index Organized Tables

Поиск
Список
Период
Сортировка
От Gokulakannan Somasundaram
Тема Re: A thought on Index Organized Tables
Дата
Msg-id 9362e74e1002230547i39e232fdj8bd64474135d4af@mail.gmail.com
обсуждение исходный текст
Ответ на Re: A thought on Index Organized Tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: A thought on Index Organized Tables  (Andrew Dunstan <andrew@dunslane.net>)
Список pgsql-hackers


On Tue, Feb 23, 2010 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:
> Instead, how about excluding columns in primary keys from table data?

How will you implement "select * from mytable" ?  Or even
"select * from mytable where non_primary_key = something" ?
If you can't do either of those without great expense, I think
a savings on primary-key lookups is not going to be adequate
recompense.
 
Tom,
      I am talking things more from the perspective of how things have got implemented in Oracle/SQL Server. Both Oracle and SQL Server store the snapshot info with indexes and hence can do index-only scans with their indexes. But still they have the concept of Index Organized Tables / Clustered Indexes. Apart from the disk footprint, it will have an impact on the cache efficiency also.
       In Oracle IOT and SQL Server Clustered Indexes, you have an option to store some of the columns in the leaf pages( but not in the non-leaf pages) and hence the tuples won't get sorted based on them, but you don't require an extra i/o to access them. This optimization is again to reduce the size of IOT. Oracle IOT has a concept called overflow regions, which is more like a heap and will store a few columns. There will be a pointer from main b-tree structure to this secondary structure. Accessing these columns are costly, but the idea is that the database designer has taken this into account while deciding on the columns to be put in the overflow regions.
       We can design secondary indexes to make the access faster for non-primary key based searches. But since the Secondary indexes store primary key in the place of HeapTuple Pointer, the access will usually take 2-3 more i/os. But the idea is that the IOT is for those kind of data. which will be 99% queried based on primary keys. The database provides that extra performance for that kind of access patterns. So to answer your question, full table scans(if overflow regions are involved) and search based on non-primary keys will be slow in an IOT.
         I looked at the postgres nbtree code. From my analysis(which might be wrong!), we can implement IOTs, provided we make a decision on broken data types issue.

Thanks,
Gokul.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Magnus Hagander
Дата:
Сообщение: Re: Recent vendor SSL renegotiation patches break PostgreSQL
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [COMMITTERS] Re: pgsql: Speed up CREATE DATABASE by deferring the fsyncs until after