Re: Automatic function replanning
From | Bruce Momjian
---|---
Subject | Re: Automatic function replanning
Date | 2005-12-22
Msg-id | 200512222120.jBMLKvZ29992@candle.pha.pa.us
In reply to | Re: Automatic function replanning ("Jim C. Nasby" <jnasby@pervasive.com>)
List | pgsql-hackers
We need invalidation anyway, so I don't see why an intermediate step
makes sense.

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> Well, not just rows; total tuples, both base heap and index. ISTM that
> would be a better metric than just plain rows read out of base or rows
> returned.
>
> Depending on how far down this road we want to go, this would allow for
> detecting what parameter values require different query plans, and then
> using different query plans for different sets of values. Simply
> invalidating the cached plan means you could potentially end up needing
> to re-plan very frequently. But given the current speed of our
> optimizer, it's probably not worth going to this extent.
>
> Another concern I have is: is cardinality the only metric we need to
> look at when deciding to re-plan, or are there others?
>
> In either case, my guess is that tracking the info needed to make this
> idea happen is probably much easier than doing automatic plan
> invalidation based on cardinality, so it would be a useful interim step.
> But if we could actually get cardinality invalidation into 8.2, I'd say
> put the effort into that...
>
> On Thu, Dec 22, 2005 at 03:14:09PM -0500, Bruce Momjian wrote:
> >
> > Oh, OK, so you are logging prepared queries where the plan generates a
> > significantly different number of rows from previous runs. I am not
> > sure why that is better, or easier, than just invalidating the cached
> > plan if the cardinality changes.
> >
> > ---------------------------------------------------------------------------
> >
> > Jim C. Nasby wrote:
> > > On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
> > > > > Track normal resource consumption (ie: tuples read) for planned
> > > > > queries and record parameter values that result in drastically
> > > > > different resource consumption.
> > > > >
> > > > > This would at least make it easy for admins to identify prepared
> > > > > queries that have a highly variable execution cost.
> > > >
> > > > We have that TODO already:
> > > >
> > > > * Log statements where the optimizer row estimates were dramatically
> > > >   different from the number of rows actually found?
> > >
> > > Does the stored plan also save how many rows were expected? Otherwise
> > > I'm not sure how that TODO covers it... If it does then please ignore
> > > my ramblings below. :)
> > >
> > > My idea has nothing to do with row estimates. It has to do with the
> > > amount of work actually done to perform a query. Consider this example:
> > >
> > > CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
> > > CREATE INDEX queue__status ON queue (status);
> > >
> > > Obviously, to process this you'll need a query like:
> > > SELECT * FROM queue WHERE status='N' -- N for New;
> > >
> > > Say you also occasionally need to see a list of items that have been
> > > processed:
> > > SELECT * FROM queue WHERE status='D' -- D for Done;
> > >
> > > And let's say you need to keep done items around for 30 days.
> > >
> > > Now, if both of these are done using a prepared statement, it's going
> > > to look like:
> > >
> > > SELECT * FROM queue WHERE status='?';
> > >
> > > If the first one to run is the queue processing one, the planner will
> > > probably choose the index. This means that when we're searching on 'N',
> > > there will be a fairly small number of tuples read to execute the
> > > query, but when searching for 'D' a very large number of tuples will
> > > be read.
> > > What I'm proposing is to keep track of the 'normal' number of tuples
> > > read when executing a prepared query, and log any queries that are
> > > substantially different. So, if you normally have to read 50 tuples to
> > > find all 'N' records, when the query looking for 'D' records comes
> > > along and has to read 5000 tuples instead, we want to log that.
> > > Probably the easiest way to accomplish this is to store a moving
> > > average of tuples read with each prepared statement entry.

--
  Bruce Momjian                   |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us          |  (610) 359-1001
  + If your life is a hard drive, |  13 Roberts Road
  + Christ can be your backup.    |  Newtown Square, Pennsylvania 19073
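For reference, the queue scenario quoted above is easy to reproduce with
PostgreSQL's PREPARE/EXECUTE. This is a minimal sketch; the statement name
fetch_queue and the skewed data are illustrative assumptions, not part of
the thread:

    CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
    CREATE INDEX queue__status ON queue (status);

    -- Skewed contents: many 'D' (done) rows, only a handful of 'N' (new).
    INSERT INTO queue SELECT 'D', 'item ' || i::text FROM generate_series(1, 5000) i;
    INSERT INTO queue SELECT 'N', 'item ' || i::text FROM generate_series(1, 50) i;
    ANALYZE queue;

    -- One cached plan serves every parameter value.
    PREPARE fetch_queue(char) AS
        SELECT * FROM queue WHERE status = $1;

    EXECUTE fetch_queue('N');   -- reads roughly 50 tuples
    EXECUTE fetch_queue('D');   -- the same plan now reads most of the table

Note that the plan cache has changed since this 2005 thread: current servers
try parameter-specific custom plans before settling on a generic one, but the
skew problem being described is the same.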
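And here is a sketch of the moving-average bookkeeping Jim proposes, mocked
up at the SQL level purely for illustration. In the real proposal this state
would live in the backend's prepared-statement entry; the stmt_stats table,
the note_execution function, the 10x threshold, and the 0.1 smoothing factor
are all assumptions of mine:

    -- Hypothetical user-level mock-up of per-statement cost tracking.
    CREATE TABLE stmt_stats (
        stmt_name       text PRIMARY KEY,
        avg_tuples_read float8 NOT NULL
    );

    CREATE FUNCTION note_execution(p_stmt text, p_tuples bigint)
    RETURNS void LANGUAGE plpgsql AS $$
    DECLARE
        cur float8;
    BEGIN
        SELECT avg_tuples_read INTO cur
          FROM stmt_stats
         WHERE stmt_name = p_stmt;
        IF NOT FOUND THEN
            -- First execution seeds the average.
            INSERT INTO stmt_stats VALUES (p_stmt, p_tuples);
            RETURN;
        END IF;
        -- Log executions far from the 'normal' cost; 10x is an arbitrary cutoff.
        IF p_tuples > 10 * cur THEN
            RAISE NOTICE 'statement % read % tuples; the moving average is %',
                         p_stmt, p_tuples, cur;
        END IF;
        -- Exponential moving average with smoothing factor 0.1.
        UPDATE stmt_stats
           SET avg_tuples_read = 0.9 * cur + 0.1 * p_tuples
         WHERE stmt_name = p_stmt;
    END;
    $$;

Seeded with the thread's numbers, a few calls of note_execution('fetch_queue', 50)
followed by note_execution('fetch_queue', 5000) raises the notice, which is
exactly the event Jim wants logged.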
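As an aside, the estimate-versus-actual mismatch that the existing TODO item
targets is already visible per plan node in EXPLAIN ANALYZE output. The
numbers below are invented to match the thread's example:

    EXPLAIN ANALYZE SELECT * FROM queue WHERE status = 'D';

     Index Scan using queue__status on queue  (cost=0.00..8.27 rows=50 width=36)
                                 (actual time=0.042..12.310 rows=5000 loops=1)
       Index Cond: (status = 'D'::bpchar)

The planner's rows=50 next to the executor's rows=5000 is the hundred-fold
discrepancy such a log entry would flag; Jim's point is that his proposal
keys off the work the executor actually did rather than off this estimate.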