Re: Best way to scan on-disk bitmaps
От | Greg Stark |
---|---|
Тема | Re: Best way to scan on-disk bitmaps |
Дата | |
Msg-id | 87vf5k2qrw.fsf@stark.xeocode.com обсуждение исходный текст |
Ответ на | Re: Best way to scan on-disk bitmaps (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> writes: > I think it would be easy to change the planner and btree to handle > this (where "easy" means "I remember where all the skeletons are > buried"). But I don't know the gist code hardly at all. Can anyone > offer an informed opinion on whether gist can handle this now, and > if not what it would take to handle it? Currently gist indexes only use the first column for page splits, making multi-key gist indexes basically useless. The problem is it's hard to imagine an API for a pickSplit function that could handle multi-key indexes with disparate data types and operator classes. I had an idea of a new way to deal with gist indexes that simplified the API and side-stepped the whole issue but you raised concerns that it might be too limiting. Unfortunately the mailing list archive seems to have missed this discussion. I've attached three messages from the discussion at the time. Oleg & Teodor, If I understand the code correctly, GiST will only pass the first attribute of each index tuple to the user-defined PickSplit method when it wants to split a node. (see circa line 1269 of gist.c) Is this a wise design decision? Granted, in many situations the first attribute in the index is sufficient to make a reasonable decision about how to divide the node into two halves, but I don't think that is universally true. For example, consider a two column index whose first attribute has a small number of distinct values. It could well be that all the first attribute values in a node-to-be-split would be the same. Only passing the first attribute to PickSplit would result in an essentially random distribution of tuples among the split nodes, rather than allowing the GiST extension to make use of the second attribution to partition the nodes. That's an extreme example, but it is easy to construct more realistic scenarios (basically, any situation in which the cardinality of the first index attribute is low -- a reasonably common occurrence with a multi-column index, I believe). I'm not sure whether this would be a problem in practice. Speculation: repeated invocations of PickSplit are one of the main factors in deriving the ultimate shape of the GiST tree. Poor distribution of keys resulting from PickSplit would eventually result in unnecessarily loose bounding predicates in internal nodes, which would hurt performance. Comments welcome, Neil ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings Greg Stark <gsstark@MIT.EDU> writes: > I'm not sure that GiST indexes behave the same way as btree indexes for the > multi-column case. > > In a btree index the second column is entirely subordinate to the first > column. In a GiST index the data is multi-dimensional, and all dimensions are > equally important. In fact on further consideration I do have a proposal. If you look at the GiST implementations for various index types you'll see that many (all?) take the same approach for PickSplit. In fact they pretty much all have the same code copy/pasted to handle it. The approach they take is to have a function which calculates an abstract "distance" between any two entries. There's an algorithm that they use to pick the split based on this distance function. If you abandoned "PickSplit" and instead exposed this distance function as the external API then the behaviour for multi-column indexes is clear. You calculate the distance along all the axes and calculate the diagonal distance. I think abandoning PickSplit for the distance function might also mean you don't need a separate function for Penalty either, but I'm not sure on that. -- greg Greg Stark <gsstark@MIT.EDU> writes: > The approach they take is to have a function which calculates an > abstract "distance" between any two entries. There's an algorithm that > they use to pick the split based on this distance function. > If you abandoned "PickSplit" and instead exposed this distance > function as the external API then the behaviour for multi-column > indexes is clear. You calculate the distance along all the axes and > calculate the diagonal distance. Hmm ... the problem with that is the assumption that different opclasses will compute similarly-scaled distances. If opclass A generates distances in the range (0,1e6) while B generates in the range (0,1), combining them with Euclidean distance won't work well at all. OTOH you can't blindly normalize, because in some cases maybe the data is such that a massive difference in distances is truly appropriate. I'm also a bit leery of the assumption that every GiST application can reduce its PickSplit logic to Euclidean distances. regards, tom lane > (hash and rtree are not at issue since they don't support multi-key > indexes.) It seems like it would be trivial to make hash indexes do something useful. Maybe it's not clear which useful something to do though. a) hash each column and xor the results. This requires all columns be present for an index lookup. Again this requires making the planner logic more flexible and specific to each index method. b) Partition take only a few bits from each column's hash. This would actually let the hash index use any subset of columns though it seems like it would be prohibitively expensive at first blush. And it requires more than just planner hacking to take advantage of. -- greg
В списке pgsql-hackers по дате отправления: