Re: the big picture for index-only scans
От | Cédric Villemain |
---|---|
Тема | Re: the big picture for index-only scans |
Дата | |
Msg-id | BANLkTi=DV5UtxFhCmDkM3E-a6u8aC+WJ_Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: the big picture for index-only scans (Robert Haas <robertmhaas@gmail.com>) |
Список | pgsql-hackers |
2011/6/19 Robert Haas <robertmhaas@gmail.com>: > On Sun, Jun 19, 2011 at 11:12 AM, Cédric Villemain > <cedric.villemain.debian@gmail.com> wrote: >>> Good point --- we would be making plan decisions based on the visibility >>> map coverage. The big question is whether visibility map changes are >>> more dynamic than the values we already plan against, like rows in the >>> table, table size, and value distributions. I don't know the answer. >> >> Robert, I though of Covered-Index as just a usage of the vis map: >> don't take the heap block if not needed. This look easier to do and >> better in the long term (because read-only DB may quickly turn into a >> no-heap access DB for example). Thus this is not real covered-index. >> Did you want to implement real covered-index and did you have ideas on >> how to do that ? Or just an optimization of the current >> planner/executor on index usage ? > > If by a "real" covered index you mean one that includes visibility > info in the index - I have no plans to work on anything like that. If > we were to do that, the index would become much larger and less > efficient, whereas the approach of just optimizing the way our > existing indexes are used doesn't have that disadvantage. It also > sounds like a lot of work. Now, if someone else wants to demonstrate > that it has advantages that are worth the costs and go do it, more > power to said person, but I'm unexcited about it. Yes I was thinking of that, and agree with you. > >> I don't know VM internals: >> >> * do we have a counter of ALL_VISIBLE flag set on a relation ? (this >> should be very good for planner) >> * do we need a pg_class.rel_vmvisible ?! (I have hands up, don't >> shoot pleeaase) > > Evidently I'm developing a more frightening reputation than I would hope. :-( Nah, I was joking :) don't worry ! Probably because I have already proposing 1 new GUC and at least one new column to pg_class recently. (and we're not used to change that frequently) > > Anyway, yes, I do believe we need a table-level statistic for the > percentage of the visibility map bits that are believed to be set. > Having said that I think we need it, let me also say that I'm a bit > skeptical about how well it will work. There are two problems: > > 1. Consider a query like "SELECT a, b FROM foo WHERE a = 1". To > accurately estimate the cost of executing this query via an index-only > scan (on an index over foo (a, b)), we need to know (i) the percentage > of rows in the table for which a = 1 and (ii) the percentage *of those > rows* which are on an all-visible page. We can assume that if 80% of > the rows in the table are on all-visible pages, then 80% of the rows > returned by this query will be on all-visible pages also, but that > might be wildly wrong. This is similar to the problem of costing > "SELECT * FROM foo WHERE a = 1 AND b = 1" - we know the fraction of > rows where a = 1 and the fraction where b = 1, but there's no > certainty that multiplying those values will produce an accurate > estimate for the conjunction of those conditions. The problem here is > not as bad as the general multi-column statistics problem because a > mistake will only bollix the cost, not the row count estimate, but > it's still not very nice. > > 2. Since VACUUM and ANALYZE often run together, we will be estimating > the percentage of rows on all-visible pages just at the time when that > percentage is highest. This is not exactly wonderful, either... > > I have a fair amount of hope that even with these problems we can come > up with some adjustment to the planner that is better than just > ignoring the problem, but I am not sure how difficult it will be. > >> * is it ok to parse VM for planning (I believe it is not) ? > > It doesn't seem like a good idea to me, but I just work here. I'm not > sure what that would buy us. All true, and I won't be unhappy to have the feature as a bonus, not expected by the planner(for the cost part) but handled by the executor. -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Robert HaasДата:
Сообщение: Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate
Следующее
От: Florian PflugДата:
Сообщение: Re: Adding a distinct "pattern" type to resolve the "~" commutator stalemate