Discussion: vacuum analyze feedback

vacuum analyze feedback

From: Ed Loehr
Date:
I know this topic has been rehashed a million times, but I just wanted to
add one datapoint.  I have a database (150 tables, less than 20K tuples 
in any one table) which I 'vacuum analyze' *HOURLY*, blocking all access,
and I still see frequent situations where my query times bloat by roughly
300% (4 times slower) in the intervening time between vacuums.  All this 
is to say that I think a more strategic implementation of the 
functionality of vacuum analyze (specifically, non-batched, automated,
on-the-fly vacuuming/analyzing) would be a major "value add".  I haven't 
educated myself as to the history of it, but I do wonder why the 
performance focus is not on this.  I'd imagine it would be a performance 
hit (which argues for making it optional), but I'd gladly take a 10% 
performance hit over the current highly undesirable degradation.  You 
could do a whole lotta optimization on the planner/parser/executor and
not get close to the end-user-perceptible gains from fixing this
problem...

Regards,
Ed Loehr


Re: vacuum analyze feedback

From: Bruce Momjian
Date:
> I know this topic has been rehashed a million times, but I just wanted to
> add one datapoint.  I have a database (150 tables, less than 20K tuples 
> in any one table) which I 'vacuum analyze' *HOURLY*, blocking all access,
> and I still see frequent situations where my query times bloat by roughly
> 300% (4 times slower) in the intervening time between vacuums.  All this 
> is to say that I think a more strategic implementation of the 
> functionality of vacuum analyze (specifically, non-batched, automated,
> on-the-fly vacuuming/analyzing) would be a major "value add".  I haven't 
> educated myself as to the history of it, but I do wonder why the 
> performance focus is not on this.  I'd imagine it would be a performance 
> hit (which argues for making it optional), but I'd gladly take a 10% 
> performance hit over the current highly undesirable degradation.  You 
> could do a whole lotta optimization on the planner/parser/executor and
> not get close to the end-user-perceptible gains from fixing this
> problem...
> 

Vadim is planning an overwrite storage manager in 7.2, which will allow
expired tuples to be reused without vacuum.

Or is the ANALYZE the issue for you?  You need hourly statistics?
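
The two halves of the command do different jobs; a rough sketch, using a
hypothetical table name:

    -- Reclaim space from expired (dead) tuples only:
    VACUUM mytable;

    -- Reclaim space and also refresh the statistics the planner uses
    -- to choose query plans:
    VACUUM ANALYZE mytable;

So blocking during vacuum and stale statistics are separable problems.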

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: vacuum analyze feedback

From: Ed Loehr
Date:
Bruce Momjian wrote:
> 
> > I know this topic has been rehashed a million times, but I just wanted to
> > add one datapoint.  I have a database (150 tables, less than 20K tuples
> > in any one table) which I 'vacuum analyze' *HOURLY*, blocking all access,
> > and I still see frequent situations where my query times bloat by roughly
> > 300% (4 times slower) in the intervening time between vacuums.  All this
> > is to say that I think a more strategic implementation of the
> > functionality of vacuum analyze (specifically, non-batched, automated,
> > on-the-fly vacuuming/analyzing) would be a major "value add".  I haven't
> > educated myself as to the history of it, but I do wonder why the
> > performance focus is not on this.  I'd imagine it would be a performance
> > hit (which argues for making it optional), but I'd gladly take a 10%
> > performance hit over the current highly undesirable degradation.  You
> > could do a whole lotta optimization on the planner/parser/executor and
> > not get close to the end-user-perceptible gains from fixing this
> > problem...
> 
> Vadim is planning an overwrite storage manager in 7.2, which will allow
> expired tuples to be reused without vacuum.

Sorry, I missed that in prior threads...that would be good.

> Or is the ANALYZE the issue for you?  

Both, actually.  More specifically, blocking end-user access during
vacuum, and degraded end-user performance as pg_statistics diverge from
reality.  Both are losses of service from the system.

> You need hourly statistics?

My unstated point was that hourly stats have turned out *not* to be
nearly good enough in my case.  Better would be if the system were smart
enough to recognize when the outcome of a query/plan was sufficiently
divergent from statistics to warrant a system-initiated analyze (or
whatever form it would take).  I'll probably end up doing this detection
from the app/client side, but that's not the right place for it, IMO.
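
A minimal sketch of what that client-side check could look like, assuming a
hypothetical table "mytable" and an application-chosen divergence threshold:

    -- pg_class.reltuples holds the row count the planner believes;
    -- it is refreshed by VACUUM/ANALYZE rather than kept current, so a
    -- large gap versus the real count means the statistics have drifted.
    SELECT c.reltuples                    AS estimated_rows,
           (SELECT count(*) FROM mytable) AS actual_rows
    FROM pg_class c
    WHERE c.relname = 'mytable';

    -- If the two diverge past the threshold, the client re-analyzes
    -- just that table:
    VACUUM ANALYZE mytable;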

Regards,
Ed Loehr


Re: vacuum analyze feedback

From: Bruce Momjian
Date: 25 May 2000 15:54 -0400
> Bruce Momjian wrote:
> > 
> > > I know this topic has been rehashed a million times, but I just wanted to
> > > add one datapoint.  I have a database (150 tables, less than 20K tuples
> > > in any one table) which I 'vacuum analyze' *HOURLY*, blocking all access,
> > > and I still see frequent situations where my query times bloat by roughly
> > > 300% (4 times slower) in the intervening time between vacuums.  All this
> > > is to say that I think a more strategic implementation of the
> > > functionality of vacuum analyze (specifically, non-batched, automated,
> > > on-the-fly vacuuming/analyzing) would be a major "value add".  I haven't
> > > educated myself as to the history of it, but I do wonder why the
> > > performance focus is not on this.  I'd imagine it would be a performance
> > > hit (which argues for making it optional), but I'd gladly take a 10%
> > > performance hit over the current highly undesirable degradation.  You
> > > could do a whole lotta optimization on the planner/parser/executor and
> > > not get close to the end-user-perceptible gains from fixing this
> > > problem...
> > 
> > Vadim is planning an overwrite storage manager in 7.2, which will allow
> > expired tuples to be reused without vacuum.
> 
> Sorry, I missed that in prior threads...that would be good.
> 
> > Or is the ANALYZE the issue for you?  
> 
> Both, actually.  More specifically, blocking end-user access during
> vacuum, and degraded end-user performance as pg_statistics diverge from
> reality.  Both are losses of service from the system.
> 
> > You need hourly statistics?
> 
> My unstated point was that hourly stats have turned out *not* to be
> nearly good enough in my case.  Better would be if the system were smart
> enough to recognize when the outcome of a query/plan was sufficiently
> divergent from statistics to warrant a system-initiated analyze (or
> whatever form it would take).  I'll probably end up doing this detection
> from the app/client side, but that's not the right place for it, IMO.

Yes, I think eventually, we need to feed information about actual query
results back into the optimizer for use in later queries.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: vacuum analyze feedback

From: Philip Warner
Date:
At 15:54 25/05/00 -0400, Bruce Momjian wrote:
>
>Yes, I think eventually, we need to feed information about actual query
>results back into the optimizer for use in later queries.
>

You could be a little more ambitious and do what Dec/Rdb does - use the
results of current query execution to (possibly) cause a change in the
current strategy.



----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: +61-03-5367 7422            |                 _________  \
Fax: +61-03-5367 7430            |                 ___________ |
Http://www.rhyme.com.au          |                /           \
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: vacuum analyze feedback

From: Bruce Momjian
Date:
> At 15:54 25/05/00 -0400, Bruce Momjian wrote:
> >
> >Yes, I think eventually, we need to feed information about actual query
> >results back into the optimizer for use in later queries.
> >
> 
> You could be a little more ambitious and do what Dec/Rdb does - use the
> results of current query execution to (possibly) cause a change in the
> current strategy.
> 

yes.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026