Обсуждение: strange performance regression between 7.4 and 8.1

Поиск
Список
Период
Сортировка

strange performance regression between 7.4 and 8.1

От
"Alex Deucher"
Дата:
Hello,

I have noticed a strange performance regression and I'm at a loss as
to what's happening.  We have a fairly large database (~16 GB).  The
original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
of ram running Solaris on local scsi discs.  The new server is a sun
Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
(AMD64) on a 4 Gbps FC SAN volume.  When we created the new database
it was created from scratch rather than copying over the old one,
however the table structure is almost identical (UTF8 on the new one
vs. C on the old). The problem is queries are ~10x slower on the new
hardware.  I read several places that the SAN might be to blame, but
testing with bonnie and dd indicates that the SAN is actually almost
twice as fast as the scsi discs in the old sun server.  I've tried
adjusting just about every option in the postgres config file, but
performance remains the same.  Any ideas?

Thanks,

Alex

Re: strange performance regression between 7.4 and 8.1

От
Guido Neitzer
Дата:
On 01.03.2007, at 13:40, Alex Deucher wrote:

> I read several places that the SAN might be to blame, but
> testing with bonnie and dd indicates that the SAN is actually almost
> twice as fast as the scsi discs in the old sun server.  I've tried
> adjusting just about every option in the postgres config file, but
> performance remains the same.  Any ideas?

As mentioned last week:

Did you actually try to use the local drives for speed testing? It
might be that the SAN introduces latency especially for random access
you don't see on local drives.

cug

Re: strange performance regression between 7.4 and 8.1

От
"Alex Deucher"
Дата:
On 3/5/07, Guido Neitzer <lists@event-s.net> wrote:
> On 01.03.2007, at 13:40, Alex Deucher wrote:
>
> > I read several places that the SAN might be to blame, but
> > testing with bonnie and dd indicates that the SAN is actually almost
> > twice as fast as the scsi discs in the old sun server.  I've tried
> > adjusting just about every option in the postgres config file, but
> > performance remains the same.  Any ideas?
>
> As mentioned last week:
>
> Did you actually try to use the local drives for speed testing? It
> might be that the SAN introduces latency especially for random access
> you don't see on local drives.

Yes, I started setting that up this afternoon.  I'm going to test that
tomorrow and post the results.

Alex

>
> cug
>

Re: strange performance regression between 7.4 and 8.1

От
Guido Neitzer
Дата:
On 05.03.2007, at 19:56, Alex Deucher wrote:

> Yes, I started setting that up this afternoon.  I'm going to test that
> tomorrow and post the results.

Good - that may or may not give some insight in the actual
bottleneck. You never know but it seems to be one of the easiest to
find out ...

cug

Re: strange performance regression between 7.4 and 8.1

От
"Alex Deucher"
Дата:
On 3/5/07, Guido Neitzer <lists@event-s.net> wrote:
> On 05.03.2007, at 19:56, Alex Deucher wrote:
>
> > Yes, I started setting that up this afternoon.  I'm going to test that
> > tomorrow and post the results.
>
> Good - that may or may not give some insight in the actual
> bottleneck. You never know but it seems to be one of the easiest to
> find out ...
>

Well, the SAN appears to be the limiting factor.  I set up the DB on
the local scsi discs (software RAID 1) and performance is excellent
(better than the old server).  Thanks for everyone's help.

Alex

Re: strange performance regression between 7.4 and 8.1

От
Ron
Дата:
At 10:25 AM 3/6/2007, Alex Deucher wrote:
>On 3/5/07, Guido Neitzer <lists@event-s.net> wrote:
>>On 05.03.2007, at 19:56, Alex Deucher wrote:
>>
>> > Yes, I started setting that up this afternoon.  I'm going to test that
>> > tomorrow and post the results.
>>
>>Good - that may or may not give some insight in the actual
>>bottleneck. You never know but it seems to be one of the easiest to
>>find out ...
>
>Well, the SAN appears to be the limiting factor.  I set up the DB on
>the local scsi discs (software RAID 1) and performance is excellent
>(better than the old server).  Thanks for everyone's help.
>
>Alex

What kind of SAN is it and how many + what kind of HDs are in it?
Assuming the answers are reasonable...

Profile the table IO pattern your workload generates and start
allocating RAID sets to tables or groups of tables based on IO pattern.

For any table or group of tables that has a significant level of
write IO, say >= ~25% of the IO mix, try RAID 5 or 6 first, but be
prepared to go RAID 10 if performance is not acceptable.

Don't believe any of the standard "lore" regarding what tables to put
where or what tables to give dedicated spindles to.
Profile, benchmark, and only then start allocating dedicated resources.
For instance, I've seen situations where putting pg_xlog on its own
spindles was !not! the right thing to do.

Best Wishes,
Ron Peacetree


Re: strange performance regression between 7.4 and 8.1

От
"Alex Deucher"
Дата:
On 3/6/07, Ron <rjpeace@earthlink.net> wrote:
> At 10:25 AM 3/6/2007, Alex Deucher wrote:
> >On 3/5/07, Guido Neitzer <lists@event-s.net> wrote:
> >>On 05.03.2007, at 19:56, Alex Deucher wrote:
> >>
> >> > Yes, I started setting that up this afternoon.  I'm going to test that
> >> > tomorrow and post the results.
> >>
> >>Good - that may or may not give some insight in the actual
> >>bottleneck. You never know but it seems to be one of the easiest to
> >>find out ...
> >
> >Well, the SAN appears to be the limiting factor.  I set up the DB on
> >the local scsi discs (software RAID 1) and performance is excellent
> >(better than the old server).  Thanks for everyone's help.
> >
> >Alex
>
> What kind of SAN is it and how many + what kind of HDs are in it?
> Assuming the answers are reasonable...
>

It's a Hitachi WMS/Tagmastore.  105 hitachi SATA drives; 4 Gbps FC.

> Profile the table IO pattern your workload generates and start
> allocating RAID sets to tables or groups of tables based on IO pattern.
>
> For any table or group of tables that has a significant level of
> write IO, say >= ~25% of the IO mix, try RAID 5 or 6 first, but be
> prepared to go RAID 10 if performance is not acceptable.
>

Right now it's designed for max capacity: big RAID 5 groups.  I expect
I'll probably need RAID 10 for decent performance.

> Don't believe any of the standard "lore" regarding what tables to put
> where or what tables to give dedicated spindles to.
> Profile, benchmark, and only then start allocating dedicated resources.
> For instance, I've seen situations where putting pg_xlog on its own
> spindles was !not! the right thing to do.
>

Right.  Thanks for the advice.  I'll post my results when I get around
to testing some new SAN configurations.

Alex

Re: strange performance regression between 7.4 and 8.1

От
"Rodrigo Madera"
Дата:
I would just like to note here that this is an example of inefficient strategy.

We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life.

I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more.

In the future, an investment on memory for a (let's say) rather small database should be your first attempt.

Yours,
Rodrigo Madera

On 3/6/07, Alex Deucher <alexdeucher@gmail.com> wrote:
On 3/6/07, Ron <rjpeace@earthlink.net> wrote:
> At 10:25 AM 3/6/2007, Alex Deucher wrote:
> >On 3/5/07, Guido Neitzer < lists@event-s.net> wrote:
> >>On 05.03.2007, at 19:56, Alex Deucher wrote:
> >>
> >> > Yes, I started setting that up this afternoon.  I'm going to test that
> >> > tomorrow and post the results.
> >>
> >>Good - that may or may not give some insight in the actual
> >>bottleneck. You never know but it seems to be one of the easiest to
> >>find out ...
> >
> >Well, the SAN appears to be the limiting factor.  I set up the DB on
> >the local scsi discs (software RAID 1) and performance is excellent
> >(better than the old server).  Thanks for everyone's help.
> >
> >Alex
>
> What kind of SAN is it and how many + what kind of HDs are in it?
> Assuming the answers are reasonable...
>

It's a Hitachi WMS/Tagmastore.  105 hitachi SATA drives; 4 Gbps FC.

> Profile the table IO pattern your workload generates and start
> allocating RAID sets to tables or groups of tables based on IO pattern.
>
> For any table or group of tables that has a significant level of
> write IO, say >= ~25% of the IO mix, try RAID 5 or 6 first, but be
> prepared to go RAID 10 if performance is not acceptable.
>

Right now it's designed for max capacity: big RAID 5 groups.  I expect
I'll probably need RAID 10 for decent performance.

> Don't believe any of the standard "lore" regarding what tables to put
> where or what tables to give dedicated spindles to.
> Profile, benchmark, and only then start allocating dedicated resources.
> For instance, I've seen situations where putting pg_xlog on its own
> spindles was !not! the right thing to do.
>

Right.  Thanks for the advice.  I'll post my results when I get around
to testing some new SAN configurations.

Alex

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: strange performance regression between 7.4 and 8.1

От
"Craig A. James"
Дата:
Rodrigo Madera wrote:
> I would just like to note here that this is an example of inefficient
> strategy.
>
> We could all agree (up to a certain economical point) that Alex saved
> the most expensive one thousand dollars of his life.
>
> I don't know the financial status nor the size of your organization, but
> I'm sure that you have selected the path that has cost you more.
>
> In the future, an investment on memory for a (let's say) rather small
> database should be your first attempt.

Alex may have made the correct, rational choice, given the state of accounting at most corporations.  Corporate
accountingpractices and the budgetary process give different weights to cash and labor.  Labor is fixed, and can be
grosslywasted without (apparently) affecting the quarterly bottom line.  Cash expenditures come directly off profits. 

It's shortsighted and irrational, but nearly 100% of corporations operate this way.  You can waste a week of your time
andnobody complains, but spend a thousand dollars, and the company president is breathing down your neck. 

When we answer a question on this forum, we need to understand that the person who needs help may be under irrational,
butreal, constraints, and offer appropriate advice.  Sure, it's good to fight corporate stupidity, but sometimes you
justwant to get the system back online. 

Craig

Re: strange performance regression between 7.4 and 8.1

От
Ron
Дата:
At 01:34 PM 3/8/2007, Craig A. James wrote:
>Rodrigo Madera wrote:
>>I would just like to note here that this is an example of
>>inefficient strategy.
>>We could all agree (up to a certain economical point) that Alex
>>saved the most expensive one thousand dollars of his life.
>>I don't know the financial status nor the size of your
>>organization, but I'm sure that you have selected the path that has
>>cost you more.
>>In the future, an investment on memory for a (let's say) rather
>>small database should be your first attempt.
>
>Alex may have made the correct, rational choice, given the state of
>accounting at most corporations.  Corporate accounting practices and
>the budgetary process give different weights to cash and
>labor.  Labor is fixed, and can be grossly wasted without
>(apparently) affecting the quarterly bottom line.  Cash expenditures
>come directly off profits.
>
>It's shortsighted and irrational, but nearly 100% of corporations
>operate this way.  You can waste a week of your time and nobody
>complains, but spend a thousand dollars, and the company president
>is breathing down your neck.
>
>When we answer a question on this forum, we need to understand that
>the person who needs help may be under irrational, but real,
>constraints, and offer appropriate advice.  Sure, it's good to fight
>corporate stupidity, but sometimes you just want to get the system back online.
>
>Craig
All good points.

However, when we allow or help (even tacitly by "looking the other
way") our organizations to waste IT dollars we increase the risk that
we are going to be paid less because there's less money.  Or even
that we will be unemployed because there's less money (as in "we
wasted enough money we went out of business").

The correct strategy is to Speak Their Language (tm) to the
accounting and management folks and give them the information needed
to Do The Right Thing (tm) (or at least authorize you doing it ;-)
).  They may still not be / act sane, but at that point your hands are clean.
(...and if your organization has a habit of Not Listening to Reason
(tm), strongly consider finding a new job before you are forced to by
their fiscal or managerial irresponsibility.)

Cap Ex may not be the same as Discretionary Expenses, but at the end
of the day dollars are dollars.
Any we spend in one place can't be spent in any other place; and
there's a finite pile of them.

Spending 10x as much in labor and opportunity costs (you can only do
one thing at a time...) as you would on CapEx to address a problem is
simply not smart money management nor good business.  Even spending
2x as much in that fashion is probably not.

  Cheers,
Ron Peacetree






Re: strange performance regression between 7.4 and 8.1

От
Carlos Moreno
Дата:
>> I would just like to note here that this is an example of inefficient
>> strategy.
>> [ ... ]
>
>
> Alex may have made the correct, rational choice, given the state of
> accounting at most corporations.  Corporate accounting practices and
> the budgetary process give different weights to cash and labor.  Labor
> is fixed, and can be grossly wasted without (apparently) affecting the
> quarterly bottom line.  Cash expenditures come directly off profits.
>
> It's shortsighted and irrational, but nearly 100% of corporations
> operate this way.  You can waste a week of your time and nobody
> complains, but spend a thousand dollars, and the company president is
> breathing down your neck.
>
> When we answer a question on this forum, we need to understand that
> the person who needs help may be under irrational, but real,
> constraints, and offer appropriate advice.  Sure, it's good to fight
> corporate stupidity, but sometimes you just want to get the system
> back online.


Another thing --- which may or may not apply to Alex's case and to the
particular state of the thread, but it's still related and IMHO
important to
take into account:

There may be other consrtaints that makes it impossible to even consider
a memory upgrade --- for example, us (our project).  We *rent* the servers
from a Web hoster (dedicated servers).  This particular hoster does not
even offer the possibility of upgrading the hardware --- 2GB of RAM,
take it r leave it.  Period.

In other cases, the memory upgrade has a *monthly* cost  (and quite
often I find it excessive --- granted, that may be just me).  So, $50 or
$100 per month *additional* expenses may be considerable.

Now, yet another thing that you (Craig) seem to be missing:  you're
simply putting the expense of all this time under the expenses column
in exchange for solving the particular problem --- gaining the insight
on the internals and performance tuning techniques for PG may well
be worth tens of thousands of dollars for his company in the future.

The "quick and dirty" solution is not giving a damn about knowledge
but to the ability to solve the problem at hand *now*, at whatever
"petty cash cost" because it looks more cost effective (when seen
from the non-irrational accounting point of view, that is) --- but isn't
going for the "quick and dirty" solution without learning anything
from the experience also shortsighted ???

Carlos
--


Re: strange performance regression between 7.4 and 8.1

От
Carlos Moreno
Дата:
Ron wrote:

>
> Speak Their Language (tm)  [ ... ]  Do The Right Thing (tm)
> [...] Not Listening to Reason (tm),
> [...]
>
> fiscal or managerial irresponsibility.)


And *here*, of all the instances, you don't put a (TM) sign ??!!!!
Tsk-tsk-tsk

:-)

Carlos
--


Re: strange performance regression between 7.4 and 8.1

От
"Craig A. James"
Дата:
Carlos,
> Now, yet another thing that you (Craig) seem to be missing:  you're
> simply putting the expense of all this time under the expenses column
> in exchange for solving the particular problem...

More like I was trying to keep my response short ;-).  I think we're all in agreement on pretty much everything:

  1. Understand your problem
  2. Find potential solutions
  3. Find the technical, economic AND situational tradeoffs
  4. Choose the best course of action

My original comment was directed at item #3.  I was trying to remind everyone that a simple cost analysis may point to
solutionsthat simply aren't possible, given business constraints. 

I know we also agree that we should constantly fight corporate stupidity and short-sighted budgetary oversight.  But
that'sa second battle, one that goes on forever.  Sometimes you just have to get the job done within the current
constraints.

'Nuff said.

Craig