Discussion: Re: rtree indexes aren't being used with 7.0
Jeff Hoffmann <jeff@propertykey.com> writes:
> close, but no cigar.  i kept on dropping that constant until it worked
> for all of my tables.  i ended up at 0.0002, but i still haven't tried
> it on my biggest tables.

Urgh.  That seems way too low to put in as a default estimate,
especially considering that that same estimator routine is used for
several different operators.  We need to look at this some more.

> i assumed a fairly linear relationship between # of records and the
> value of that constant so that value should work into the low 1
> million record range at least.

Actually, the cost estimator for indexscans is deliberately not linear,
since it's trying to model the effects of hits in a buffer cache ...
perhaps that's a pointless refinement when we have no accurate idea of
the size of the kernel's buffer cache, but certainly the real-world
behavior is not going to be linear.

> why did it change so much from 6.5.3?  IIRC, it was somewhere around
> 0.25 in 6.5.3.

The old code had a *drastic* underestimate of the costs of indexscans
versus sequential scans, so it would tend to choose an indexscan even
for a query with a very large selectivity ratio.  Believe me, if you
were running a query that actually returned a quarter of the rows in
your table, you would not want an indexscan --- but 6.5 would give you
one.  7.0 won't, which means that there's now a premium on making a
selectivity estimate that has something to do with reality.

> without understanding how selectivity functions work, would it even be
> possible to come up with meaningful functions for geometric types &
> rtrees?

Good question.  I haven't looked at the literature at all, but a first
thought is that you might be able to do something useful given the
bounding box of all data in the table ... which is a stat that VACUUM
does *not* compute, but perhaps could be taught to.

			regards, tom lane
Tom Lane wrote:
> Good question.  I haven't looked at the literature at all, but a first
> thought is that you might be able to do something useful given the
> bounding box of all data in the table ... which is a stat that VACUUM
> does *not* compute, but perhaps could be taught to.

i'm still trying to understand how cost estimates & selectivity work.
i'm looking at a page in the programmers guide & i'm trying to figure
out how it works for a normal (btree) index vs. how it would (or
should) work for an rtree.  first i have to figure out how it works for
a normal index.  it looks like there are two parts: the index cost
(both index startup and cost to access a tuple) and the selectivity
(which is multiplied by the cost to access to get the total cost of
using that index?).  it makes sense that the real cost would best be
estimated this way, at least in my simplistic world.  if i've been
twiddling with the selectivity, shouldn't i also be messing with the
index costs?  wouldn't you probably get a more effective lowering of
the cost that way?

logically, i would say you're right about the bounding box being a
decent judge of selectivity.  theoretically you should be able to see
the data distribution by looking at an rtree index, which would give an
even better selectivity number.  i'm still not sure about the cost
thing, though.  would that be something that should be looked into?

jeff
Jeff Hoffmann <jeff@propertykey.com> writes:
> i'm still trying to understand how cost estimates & selectivity work.
> i'm looking at a page in the programmers guide & i'm trying to figure
> out how it works for a normal (btree) index vs. how it would (or
> should) work for an rtree.

"Use the source, Luke ..."  A lot of this low-level stuff is not
adequately documented except in the source code.

There are three basic components to the cost estimate: an
operator-specific selectivity routine, which tries to estimate the
fraction of tuples that will be returned; a generic cost estimator for
indexscans; and a cost estimator for the specific index type.  The
generic estimator tries to figure the costs for accesses to the table
proper, given the selectivity fraction, while the index-specific
estimator worries about the cost of accesses to the index.  The generic
cost routine is in backend/optimizer/path/costsize.c, and the other
stuff is in backend/utils/adt/selfuncs.c and
backend/utils/adt/geo_selfuncs.c.

Right now the "index-specific estimator" is no such thing; all the
index types share a single default routine.  If you know more about
rtree access patterns than that code does, feel free to suggest
improvements.

But the real problem is of course the selectivity routine; for the
geometric operators we've never had anything but stub routines that
return a constant.  You might want to look at the scalar-comparison
selectivity routines (selfuncs.c) to see how they work.  Basically the
idea for those is that VACUUM ANALYZE stores the min and max values of
each column, and given an operator like "var < constant" we can do a
linear interpolation between the min and max to estimate the
selectivity.  For example, if min=0, max=100, and the constant is 40,
then we'd estimate 0.40 selectivity.  (This works great as long as the
data is reasonably uniformly distributed between the min and max, but
falls down badly if it's not.
One of the items on our very long TODO list is to change VACUUM ANALYZE
to store more extensive statistics so that we can be a little
smarter...)

A corresponding kind of idea would be to have VACUUM store the bounding
box, and perhaps other stats like the average box area, and then
compare that to the specific box being probed with to estimate
selectivity.

An open question: how to tell VACUUM that for box columns we want a
bounding box and not min/max?  I'm inclined to think that we should
generalize the VACUUM ANALYZE mechanism so that it runs N aggregates
over each column and stores the results, where the specific aggregates
to use are looked up in a table based on the column datatype.

			regards, tom lane
Jeff Hoffmann wrote:
> logically, i would say you're right about the bounding box being a
> decent judge of selectivity.  theoretically you should be able to see
> data distribution by looking at an rtree index which would give even a
> better selectivity number.  i'm still not sure about the cost thing,
> though.  would that be something that should be looked into?

after i wrote this, i noticed that all of the cost estimators use the
same generic cost estimator function (genericcostestimate), in which
part of the equation is the number of pages in the index.  i'm assuming
that this means disk pages, in which case that would be part of the
problem.  rtree indexes are really bulky -- in fact, i have some index
files that are larger than the database file in a lot of cases, which
would almost immediately make a sequential scan interesting.  granted,
this is a funny case where there are only three attributes, but it
happens.  it might be useful to just modify the genericcostestimate
function and slash the indexPages to some fraction (e.g., an rtree
index on a box attribute is greater than 5x the size of a btree index
on an integer attribute, so maybe just cut the indexPages by 5 or 10 or
something like that).  it's no substitute for a decent selectivity
function, but it might help.  am i anywhere near the right track here?

jeff
Tom Lane wrote:
> Right now the "index-specific estimator" is no such thing; all the
> index types share a single default routine.  If you know more about
> rtree access patterns than that code does, feel free to suggest
> improvements.

well, i probably do, but not a whole lot more.  like i said in my other
message, rtrees are really bulky.  is there an easy way of figuring out
how many pages are actually touched in an index during a query?  or
even things like the values that go into adding up the cost (like, for
example, would the numIndexPages value totally overwhelm the rest of
the equation)?  it'd be nice to have some empirical data to see if the
estimates are actually working.

> But the real problem is of course the selectivity routine; for the
> geometric operators we've never had anything but stub routines that
> return a constant.  You might want to look at the scalar-comparison
> selectivity routines (selfuncs.c) to see how they work.  Basically
> the idea for those is that VACUUM ANALYZE stores the min and max
> values of each column, and given an operator like "var < constant" we
> can do a linear interpolation between the min and max to estimate the
> selectivity.  For example, if min=0, max=100, and the constant is 40
> then we'd estimate 0.40 selectivity.  (This works great as long as
> the data is reasonably uniformly distributed between the min and max,
> but falls down badly if it's not.  One of the items on our very long
> TODO list is to change VACUUM ANALYZE to store more extensive
> statistics so that we can be a little smarter...)

again, i'm making some assumptions here.  i should probably be reading
instead of asking questions, but it's nice to get some verification
that i actually understand what's going on.  my understanding is that
vacuum stuffs the statistics in pg_statistic and that staloval and
stahival are text representations of the lo & high values (which can
then be converted to numerical values if need be).
so essentially you can stuff whatever you want in there.  is the
stacommonval the median?  is there some place where all of the system
tables are described?  theoretically, then, you should be able to use
vacuum to put an upper-left corner in staloval, for example, and the
lower-right in stahival.  i guess it's off to trying to figure out
where the vacuum code is...

jeff
Jeff Hoffmann <jeff@propertykey.com> writes:
> rtree indexes are really bulky -- in fact, i have some index files
> that are larger than the database file in a lot of cases, which would
> almost immediately make a sequential scan interesting.  granted, this
> is a funny case where there's only three attributes, but it happens.
> it might be useful to just modify the genericcostestimate function
> and slash the indexPages into some fraction (e.g., an rtree index on
> a box attribute is greater than 5x the size of a btree index on an
> integer attribute, so maybe just cut the indexPages by 5 or 10 or
> something like that.)  it's no substitute for a decent selectivity
> function, but it might help.

Mmm ... the existing estimator essentially assumes that if you are
executing a query that ultimately returns X% of the tuples, you are
going to have to touch about X% of the pages in the index to do it.
While I'm prepared to be shown that that's not true for rtrees, it
seems like a pretty plausible first-order estimate.  I don't think that
making the rtree estimator less realistic is an appropriate solution to
problems in the operator estimator, anyway.  If the estimator gives bad
results given an accurate selectivity, then we should change it; but if
the problem is upstream then we should work on the upstream.

It could even be that the notion of representing selectivity as a
single number is inadequate for rtrees, and that more info needs to be
passed back from the operator-specific routine (selectivity in two
dimensions, say).  But considering that we haven't even tried to use
the existing structure, discarding it is probably premature...

			regards, tom lane
At 11:13 AM 15-05-2000 -0400, Tom Lane wrote:
>Jeff Hoffmann <jeff@propertykey.com> writes:
>Actually, the cost estimator for indexscans is deliberately not linear,
>since it's trying to model the effects of hits in a buffer cache ...
>perhaps that's a pointless refinement when we have no accurate idea of
>the size of the kernel's buffer cache, but certainly the real-world
>behavior is not going to be linear.

Just have something like a few linear "curves" and some IF statements?
That's often good enough in real-world scenarios; that's the whole
point of fuzzy logic.  It works very well for high-speed lifts and
other seemingly complex control situations.

Cheerio,
Link.