Re: Best way to scan on-disk bitmaps

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: Best way to scan on-disk bitmaps
Дата
Msg-id 87vf5k2qrw.fsf@stark.xeocode.com
обсуждение исходный текст
Ответ на Re: Best way to scan on-disk bitmaps  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers

Tom Lane <tgl@sss.pgh.pa.us> writes:

> I think it would be easy to change the planner and btree to handle
> this (where "easy" means "I remember where all the skeletons are
> buried").  But I don't know the gist code hardly at all.  Can anyone
> offer an informed opinion on whether gist can handle this now, and
> if not what it would take to handle it?

Currently gist indexes only use the first column for page splits, making
multi-key gist indexes basically useless. The problem is it's hard to imagine
an API for a pickSplit function that could handle multi-key indexes with
disparate data types and operator classes. I had an idea of a new way to deal
with gist indexes that simplified the API and side-stepped the whole issue but
you raised concerns that it might be too limiting.

Unfortunately the mailing list archive seems to have missed this discussion.
I've attached three messages from the discussion at the time.

Oleg & Teodor,

If I understand the code correctly, GiST will only pass the first
attribute of each index tuple to the user-defined PickSplit method when
it wants to split a node. (see circa line 1269 of gist.c)

Is this a wise design decision? Granted, in many situations the first
attribute in the index is sufficient to make a reasonable decision about
how to divide the node into two halves, but I don't think that is
universally true. For example, consider a two column index whose first
attribute has a small number of distinct values. It could well be that
all the first attribute values in a node-to-be-split would be the same.
Only passing the first attribute to PickSplit would result in an
essentially random distribution of tuples among the split nodes, rather
than allowing the GiST extension to make use of the second attribution
to partition the nodes. That's an extreme example, but it is easy to
construct more realistic scenarios (basically, any situation in which
the cardinality of the first index attribute is low -- a reasonably
common occurrence with a multi-column index, I believe).

I'm not sure whether this would be a problem in practice. Speculation:
repeated invocations of PickSplit are one of the main factors in
deriving the ultimate shape of the GiST tree. Poor distribution of keys
resulting from PickSplit would eventually result in unnecessarily loose
bounding predicates in internal nodes, which would hurt performance.

Comments welcome,

Neil

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


Greg Stark <gsstark@MIT.EDU> writes:

> I'm not sure that GiST indexes behave the same way as btree indexes for the
> multi-column case.
>
> In a btree index the second column is entirely subordinate to the first
> column. In a GiST index the data is multi-dimensional, and all dimensions are
> equally important.

In fact on further consideration I do have a proposal.

If you look at the GiST implementations for various index types you'll see
that many (all?) take the same approach for PickSplit. In fact they pretty
much all have the same code copy/pasted to handle it.

The approach they take is to have a function which calculates an abstract
"distance" between any two entries. There's an algorithm that they use to pick
the split based on this distance function.

If you abandoned "PickSplit" and instead exposed this distance function as the
external API then the behaviour for multi-column indexes is clear. You
calculate the distance along all the axes and calculate the diagonal distance.

I think abandoning PickSplit for the distance function might also mean you
don't need a separate function for Penalty either, but I'm not sure on that.

--
greg



Greg Stark <gsstark@MIT.EDU> writes:
> The approach they take is to have a function which calculates an
> abstract "distance" between any two entries. There's an algorithm that
> they use to pick the split based on this distance function.

> If you abandoned "PickSplit" and instead exposed this distance
> function as the external API then the behaviour for multi-column
> indexes is clear. You calculate the distance along all the axes and
> calculate the diagonal distance.

Hmm ... the problem with that is the assumption that different opclasses
will compute similarly-scaled distances.  If opclass A generates
distances in the range (0,1e6) while B generates in the range (0,1),
combining them with Euclidean distance won't work well at all.  OTOH you
can't blindly normalize, because in some cases maybe the data is such
that a massive difference in distances is truly appropriate.

I'm also a bit leery of the assumption that every GiST application can
reduce its PickSplit logic to Euclidean distances.

            regards, tom lane





> (hash and rtree are not at issue since they don't support multi-key
> indexes.)

It seems like it would be trivial to make hash indexes do something useful.
Maybe it's not clear which useful something to do though.

a) hash each column and xor the results. This requires all columns be present
   for an index lookup. Again this requires making the planner logic more
   flexible and specific to each index method.

b) Partition take only a few bits from each column's hash. This would actually
   let the hash index use any subset of columns though it seems like it would
   be prohibitively expensive at first blush. And it requires more than just
   planner hacking to take advantage of.



--
greg

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: SO_KEEPALIVE
Следующее
От: Lamar Owen
Дата:
Сообщение: Re: pgFoundry