Обсуждение: Database Optimization and Peformance

Поиск
Список
Период
Сортировка

Database Optimization and Peformance

От
Joe McClintock
Дата:
I have inherited the management of a Postgresql 8.0.3 database of around
2.3 GB and growing. I’m still coming up to speed on Postgresql db
administration so I’m not an expert although I have had quite a bit of
experience with Oracle for what that is worth.

Recently we had an issue with extreme performance degradation on a pair
of key tables where a simple “select * from [table name]” had gone for
1.08 mill sec 37 to 45 sec. This caused a very negative impact on the
performance of our web application. We had just loaded about 1.2 M rows
of data to the db on the previous Wednesday and on Thursday and Friday
the database and application performance was good. Weekend activity is
pretty low so it was surprising to find on Monday the db and application
performance in terrible shape. If the data load degraded database
performance why did it take 4 days for the problem to manifest? Regular
nightly backups are done but as far as I can tell vacuum, analyze and
reindex has been done only when needed which to this point has not been
very often.

I ran a vacuum, analyze and reindex on the database with no change in
performance, query time was still 37+ sec, a little worse. On our test
system I found that a db_dump from production and then restore brought
the database back to full performance. So in desperation I shut down the
production application, backed up the production database, rename the
production db, create a new empty production db and restored the
production backup to the empty db. After a successful db restore and
restart of the web application, everything was then up and running like
a top.

My concern is this, doing a backup and restore does not seem an
appropriate way manage database fragmentation and performance. The
documentation I have read indicates that vacuum, analyze reindex are the
tools to use to de-fragment and optimize the database. In my case they
did not work and reindexing made query performance slightly worse. Am I
missing something? As the database grows, will I need to essentially
rebuild the db on a regular basis?

Thanks


Re: Database Optimization and Peformance

От
Jeff Frost
Дата:
On Wed, 30 Aug 2006, Joe McClintock wrote:

> I ran a vacuum, analyze and reindex on the database with no change in
> performance, query time was still 37+ sec, a little worse. On our test system
> I found that a db_dump from production and then restore brought the database
> back to full performance. So in desperation I shut down the production
> application, backed up the production database, rename the production db,
> create a new empty production db and restored the production backup to the
> empty db. After a successful db restore and restart of the web application,
> everything was then up and running like a top.

Joe,

I would guess that since the dump/restore yielded good performance once again,
a VACUUM FULL would have also fixed the problem.  How are your FSM settings in
the conf file?  Can you run VACUUM VERBOSE and send us the last 10 or so lines
of output?

A good article on FSM settings can be found here:

http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10087&cNode=5K1C3W

You probably should consider setting up autovacuum and definitely should
upgrade to at least 8.0.8 if not 8.1.4 when you get the chance.

When you loaded the new data did you delete or update old data or was it just
a straight insert?

--
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954

Re: Database Optimization and Peformance

От
Jeff Frost
Дата:
On Wed, 30 Aug 2006, Joe McClintock wrote:

> Looking at the postgresql.config the lines for max_fsm_pages and
> max_fsm_relations are commented out so they have the default 20000 pages and
> 1000 relations.  Should we up these defaults?

From the vacuum verbose output below, it looks like you're on the very edge of
those settings, but not yet over of course I presume that was just for this
DB, so if you have more than one, you might be over.  You probably want to
double them as they don't take that many resources.
>
>> Can you run VACUUM VERBOSE and send us the last 10 or so lines of output?
>
> I ran VACUUM FULL VERBOSE on the old production DB and I see we need to run
> VACUUM FULL on a regular basis.  I took about hour and a half to complete but
> afterwards the problem tables and queries are running well.  Query time is
> back to 1.08 mill sec.
> Here are the last 10 lines of VACUUM FULL VERBOSE

Actually, you should not have to run VACUUM FULL ever if you vacuum often
enough, that's why I suggest you use pg_autovacuum.  You can find it in the
contrib modules for 8.0.x or builtin to the 8.1.x server.

>
>
> Jeff Frost wrote:
>> On Wed, 30 Aug 2006, Joe McClintock wrote:
>>
>>> I ran a vacuum, analyze and reindex on the database with no change in
>>> performance, query time was still 37+ sec, a little worse. On our test
>>> system I found that a db_dump from production and then restore brought the
>>> database back to full performance. So in desperation I shut down the
>>> production application, backed up the production database, rename the
>>> production db, create a new empty production db and restored the
>>> production backup to the empty db. After a successful db restore and
>>> restart of the web application, everything was then up and running like a
>>> top.
>>
>> Joe,
>>
>> I would guess that since the dump/restore yielded good performance once
>> again, a VACUUM FULL would have also fixed the problem.  How are your FSM
>> settings in the conf file?  Can you run VACUUM VERBOSE and send us the last
>> 10 or so lines of output?
>>
>> A good article on FSM settings can be found here:
>>
>> http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10087&cNode=5K1C3W
>>
>> You probably should consider setting up autovacuum and definitely should
>> upgrade to at least 8.0.8 if not 8.1.4 when you get the chance.
>>
>> When you loaded the new data did you delete or update old data or was it
>> just a straight insert?
>>
>
>
>

--
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954

Re: Database Optimization and Peformance

От
Chris Browne
Дата:
joe.mcclintock@antics.com (Joe McClintock) writes:
> My concern is this, doing a backup and restore does not seem an
> appropriate way manage database fragmentation and performance. The
> documentation I have read indicates that vacuum, analyze reindex are
> the tools to use to de-fragment and optimize the database. In my
> case they did not work and reindexing made query performance
> slightly worse. Am I missing something? As the database grows, will
> I need to essentially rebuild the db on a regular basis?

It oughtn't be necessary.

It seems quite plausible that there are a few tables that are not
being vacuumed nearly often enough.

If you have tables where large portions are modified
(updated/deleted), then you need to run VACUUM quite frequently,
otherwise such tables will be overrun with dead space.

We have tables that we run VACUUM on every five minutes because they
change really heavily.  (200-300 tuples, where we do an update to a
tuple every time a customer is billed.)
--
"cbbrowne","@","ntlug.org"
http://cbbrowne.com/info/finances.html
Why does sour cream have an expiration date?

Re: Database Optimization and Peformance

От
Joe McClintock
Дата:
Thanks Jeff,

Obviously I have a bit to learn about Postgres admin.

 > How are your FSM settings in the conf file?

Looking at the postgresql.config the lines for max_fsm_pages and
max_fsm_relations are commented out so they have the default 20000 pages
and 1000 relations.  Should we up these defaults?

 >When you loaded the new data did you delete or update old data or was
it just a straight insert?

As far as the database load went, it was a straight insert, no updates
were made.

 >Can you run VACUUM VERBOSE and send us the last 10 or so lines of output?

I ran VACUUM FULL VERBOSE on the old production DB and I see we need to
run VACUUM FULL on a regular basis.  I took about hour and a half to
complete but afterwards the problem tables and queries are running
well.  Query time is  back to 1.08 mill sec.

Here are the last 10 lines of VACUUM FULL VERBOSE

Total free space (including removable row versions) is 0 bytes.
0 pages are or will become empty, including 0 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "pg_toast_9187147_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  free space map: 1000 relations, 1111 pages stored; 19648 total
pages needed
DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 223 kB
shared memory.
VACUUM
erm_slow=>

Thanks again for your quick response

Regards
--Joe



Jeff Frost wrote:
> On Wed, 30 Aug 2006, Joe McClintock wrote:
>
>> I ran a vacuum, analyze and reindex on the database with no change in
>> performance, query time was still 37+ sec, a little worse. On our
>> test system I found that a db_dump from production and then restore
>> brought the database back to full performance. So in desperation I
>> shut down the production application, backed up the production
>> database, rename the production db, create a new empty production db
>> and restored the production backup to the empty db. After a
>> successful db restore and restart of the web application, everything
>> was then up and running like a top.
>
> Joe,
>
> I would guess that since the dump/restore yielded good performance
> once again, a VACUUM FULL would have also fixed the problem.  How are
> your FSM settings in the conf file?  Can you run VACUUM VERBOSE and
> send us the last 10 or so lines of output?
>
> A good article on FSM settings can be found here:
>
> http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10087&cNode=5K1C3W
>
>
> You probably should consider setting up autovacuum and definitely
> should upgrade to at least 8.0.8 if not 8.1.4 when you get the chance.
>
> When you loaded the new data did you delete or update old data or was
> it just a straight insert?
>


Re: Database Optimization and Peformance

От
Ray Stell
Дата:
I'm an ex-ora guy, also.  We will see more and more of us here
as oracle corp becomes less cooperative in their pricing demands.
What I'm saying here is directly from my oracle performance tuning
history and Milsap and Holt.

With all due respect (seriously), I'm playing catch up here with postgresql,
and I consider any help a benefit.  However, during the 1990s it was very popular
to "tune" oracle db performance as you have described.  People who
know cool knobs to turn were considered wizards and knocked down
big consulting bucks in this process:

see slow app
change stuff (how did they know about that knob?)
run app again
repeat as needed

I'm not saying what you advise will not help performance.
Lots of ad hoc kinds of changes "might" be helpful.  However, we
learned years ago from great db engineering minds a better method.
The wisdom goes something like this:

"Reliable problem diagnosis cannot proceed unless the data collection
phase produces response time data for exactly the right time scope
and exactly the right action scope." - Millsap, Holt

and

"If you can't measure it, you can't manage it. - Peter Drucker

That said, I've yet to find a trace tool within pg that provides system
wide response time data, ala the oracle wait interface.  I don't
know if the internal instrumentation is included in the pg kernel.
A. Kretschmer was kind enough to point to "explain analyze," that is
like ora explain plan including "total elapsed time expended within each
plan node."  Better than the nothing I had last week.  Far from
what I've been spoiled by.

Perhaps there is some way to hook strace into the db kernel, but
the results,  how you would profile that is way beyond me.
If on solaris 10, dtrace has great os level instrumentation
that might come in handy.

One underlying concept in the method is:

"Work first to reduce the biggest response time component of a business' most
important user action."  - Millsap, Holt

Any help on how to accomplish the detail of this method would be greatly
appreciated.

Thanks.


On Wed, Aug 30, 2006 at 11:45:06AM -0700, Jeff Frost wrote:
> On Wed, 30 Aug 2006, Joe McClintock wrote:
>
> >I ran a vacuum, analyze and reindex on the database with no change in
> >performance, query time was still 37+ sec, a little worse. On our test
> >system I found that a db_dump from production and then restore brought the
> >database back to full performance. So in desperation I shut down the
> >production application, backed up the production database, rename the
> >production db, create a new empty production db and restored the
> >production backup to the empty db. After a successful db restore and
> >restart of the web application, everything was then up and running like a
> >top.
>
> Joe,
>
> I would guess that since the dump/restore yielded good performance once
> again, a VACUUM FULL would have also fixed the problem.  How are your FSM
> settings in the conf file?  Can you run VACUUM VERBOSE and send us the last
> 10 or so lines of output?
>
> A good article on FSM settings can be found here:
>
> http://www.pervasive-postgres.com/instantkb13/article.aspx?id=10087&cNode=5K1C3W
>
> You probably should consider setting up autovacuum and definitely should
> upgrade to at least 8.0.8 if not 8.1.4 when you get the chance.
>
> When you loaded the new data did you delete or update old data or was it
> just a straight insert?
>
> --
> Jeff Frost, Owner     <jeff@frostconsultingllc.com>
> Frost Consulting, LLC     http://www.frostconsultingllc.com/
> Phone: 650-780-7908    FAX: 650-649-1954
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match

Re: Database Optimization and Peformance

От
Ray Stell
Дата:
On Wed, Aug 30, 2006 at 05:29:25PM -0400, Chris Browne wrote:
> joe.mcclintock@antics.com (Joe McClintock) writes:
> > My concern is this, doing a backup and restore does not seem an
> > appropriate way manage database fragmentation and performance. The
> > documentation I have read indicates that vacuum, analyze reindex are
> > the tools to use to de-fragment and optimize the database. In my
> > case they did not work and reindexing made query performance
> > slightly worse. Am I missing something? As the database grows, will
> > I need to essentially rebuild the db on a regular basis?



What data tells you that you have a fragmentation problem?  Have you
baselined disk i/o?  You appear to have determined how to fix the problem,
vacuum, based on what?  I have a tool that looks like a hammer, therefore
the problem must be nails.

Reasonable DB performance tuning requires the ability to be able to
determine where the system is spending its time.  Only then can a
reasonable solution be proposed.  You are doing the opposite and
hoping to walk in the right door.  We did this all through the 1990s with
oracle and it sucks.  Not having methods of measuring what where the time
is being spent sucks and we will continue to feel suckage until we find
the tools that will show us what the system is doing that makes it "slow."

I'd love to hear wise pg people tell me what tools/methods are available
to provide response time data.

Again, thanks for all your help in my sharp pg learning curve.




> It oughtn't be necessary.
>
> It seems quite plausible that there are a few tables that are not
> being vacuumed nearly often enough.
>
> If you have tables where large portions are modified
> (updated/deleted), then you need to run VACUUM quite frequently,
> otherwise such tables will be overrun with dead space.
>
> We have tables that we run VACUUM on every five minutes because they
> change really heavily.  (200-300 tuples, where we do an update to a
> tuple every time a customer is billed.)
> --
> "cbbrowne","@","ntlug.org"
> http://cbbrowne.com/info/finances.html
> Why does sour cream have an expiration date?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

--