Re: [PERFORM] encouraging index-only scans

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: [PERFORM] encouraging index-only scans
Дата
Msg-id 20130907165059.GE11757@momjian.us
обсуждение исходный текст
Ответ на Re: [PERFORM] encouraging index-only scans  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: [PERFORM] encouraging index-only scans  (Andres Freund <andres@2ndquadrant.com>)
Список pgsql-hackers
On Sat, Sep  7, 2013 at 07:34:49AM +0200, Andres Freund wrote:
> > The idea of using RecentGlobalXmin to see how much _work_ has happened
> > since the last vacuum is interesting, but it doesn't handle read-only
> > transactions;  I am not sure how they can be tracked.  You make a good
> > point that 5 minutes passing is meaningless --- you really want to know
> > how many transactions have completed.
> 
> So, what I was pondering went slightly into a different direction:
> 
> (lets ignore anti wraparound vacuum for now)
> 
> Currently we trigger autovacuums by the assumed number of dead
> tuples. In the course of it's action it usually will find that it cannot
> remove all dead rows and that it cannot mark everything as all
> visible. That's because the xmin horizon hasn't advanced far enough. We
> won't trigger another vacuum after that unless there are further dead
> tuples in the relation...
> One trick if we want to overcome that problem and that we do not handle
> setting all visible nicely for INSERT only workloads would be to trigger
> vacuum by the amount of pages that are not marked all visible in the vm.
> 
> The problem there is that repeatedly scanning a relation that's only 50%
> visible where the rest cannot be marked all visible because of a
> longrunning pg_dump obivously isn't a good idea. So we need something to
> notify us when there's work to be done. Using elapsed time seems like a
> bad idea because it doesn't adapt to changing workloads very well and
> doesn't work nicely for different relations.
> 
> What I was thinking of was to keep track of the oldest xids on pages
> that cannot be marked all visible. I haven't thought about the
> statistics part much, but what if we binned the space between
> [RecentGlobalXmin, ->nextXid) into 10 bins and counted the number of
> pages falling into each bin. Then after the vacuum finished we could
> compute how far RecentGlobalXmin would have to progress to make another
> vacuum worthwile by counting the number of pages from the lowest bin
> upwards and use the bin's upper limit as the triggering xid.
> 
> Now, we'd definitely need to amend that scheme by something that handles
> pages that are newly written to, but it seems something like that
> wouldn't be too hard to implement and would make autovacuum more useful.

That seems very complicated.  I think it would be enough to record the
current xid at the time of the vacuum, and when testing for later
vacuums, if that saved xid is earlier than the RecentGlobalXmin, and
there have been no inserts/updates/deletes, we know that all of
the pages can now be marked as allvisible.

What this doesn't handle is the insert case.  What we could do there is
to record the total free space map space, and if the FSM has not changed
between the last vacuum, we can even vacuum if inserts happened in that
period because we assume the inserts are on new pages.  One problem
there is that the FSM is only updated if an insert will not fit on the
page.  We could record the table size and make sure the table size has
increased before we allow inserts to trigger a vm-set vacuum.

None of this is perfect, but it is better than what we have, and it
would eventually get the VM bits set.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: strange IS NULL behaviour
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Re: Privileges for INFORMATION_SCHEMA.SCHEMATA (was Re: [DOCS] Small clarification in "34.41. schemata")