Re: Reviewing new index types (was Re: [PATCHES] Updatedbitmap indexpatch)

Поиск
Список
Период
Сортировка
От Simon Riggs
Тема Re: Reviewing new index types (was Re: [PATCHES] Updatedbitmap indexpatch)
Дата
Msg-id 1185228715.4284.433.camel@ebony.site
обсуждение исходный текст
Ответ на Re: Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Reviewing new index types (was Re: [PATCHES]Updatedbitmap indexpatch)
Список pgsql-hackers
On Mon, 2007-07-23 at 17:19 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > ... BMI is not useful at all
> > for PKs, whilst GIT is specifically designed to handle them.
> 
> This seems a strange statement, because GIT doesn't look particularly
> efficient for unique indexes AFAICS.  In the worst case you'd have to
> look individually at each tuple on a heap page to check for uniqueness
> conflict (no binary search, because you couldn't assume they are
> ordered).

That is one of a few heuristics about the patch that need some active
discussion, so I'm glad you asked.

The main use case is nearly-unique, so for cases where we have a
Master:Detail relationship, e.g. Order:OrderItem. The Order index is a
PK, with the OrderItem index as a nearly unique key. The index is not
brilliant for the Order index, but is good for the OrderItem index.

Heikki designed the grouping so that there is a state change between
non-grouped and non-grouped (normal) index entries. By default the patch
uses a threshold of non-grouped -> grouped at N=2 index entries and then
no limit on the number of rows/block. Currently you can tune N, but we
might also envisage setting a limit on the width of the range of values
to limit the number of tids stored in a grouped index entry. That could
control the uniqueness overhead.

On an I/O bound workload the space saving on the index outweighs the CPU
loss from uniqueness checking. When I/O is not an issue then
unfortunately there is a CPU overhead.

For GIT it would appear that the summary is that it gives a slight loss
on medium sized PK indexes and an increasing win as index size
increases. We struggled to come up with ways of making it Just Work with
as few parameters as possible.

> > B-TREE INDEXES (Integers) 
> 
> > Rows/value    Best time    Size in blocks
> > 10000000    49s        21899
> > 1000000        49s        21899
> > 100000        49s        21899
> > 10000        47s        21899
> > 1000        43s        21899
> > 100        38s        21899
> > 10        38s        21899
> > 1        33s        21899
> 
> Surely the GIT code failed to kick in at all here?  That's just about
> exactly the index size I'd expect for 10 million integers with the
> existing btree code (at least when MAXALIGN=4).

That was the b-tree test, i.e. the control. The GIT patch has bitrot, so
not able to test just yet.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Reviewing new index types (was Re: [PATCHES] Updated bitmap indexpatch)
Следующее
От: "Mark Wong"
Дата:
Сообщение: Re: Why so many out-of-disk-space failures on buildfarm machines?