Thread: Major Linux performance regression; shouldn't we be worried about RHEL6?

Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Josh Berkus
Date:
All,

Domas (of Facebook/Wikipedia, MySQL geek) pointed me to this report:

http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1
http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=6

The serious problems with this appear to be (a) that Linux/Ext4 PG
performance still hasn't fully recovered, and, (b) that RHEL6 is set to
ship with kernel 2.6.32, which means that we'll have a whole generation
of RHEL which is off-limits to PostgreSQL.

Tom, any word from your coworkers on this?

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Josh Berkus
Date:
> The serious problems with this appear to be (a) that Linux/Ext4 PG
> performance still hasn't fully recovered, and, (b) that RHEL6 is set to
> ship with kernel 2.6.32, which means that we'll have a whole generation
> of RHEL which is off-limits to PostgreSQL.

Oh.  Found some other information on the issue.  Looks like the problem
is fixed in later kernels.  So the only real issue is: is RHEL6 shipping
with 2.6.32?

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Scott Marlowe
Date:
On Fri, Nov 5, 2010 at 2:15 PM, Josh Berkus <josh@agliodbs.com> wrote:
> All,
>
> Domas (of Facebook/Wikipedia, MySQL geek) pointed me to this report:
>
> http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1
> http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=6
>
> The serious problems with this appear to be (a) that Linux/Ext4 PG
> performance still hasn't fully recovered, and, (b) that RHEL6 is set to
> ship with kernel 2.6.32, which means that we'll have a whole generation
> of RHEL which is off-limits to PostgreSQL.

Why would it be off limits?  Is it likely to lose data due to power failure etc?

Are you referring to improvements due to write barrier support getting
fixed up for ext4 to run faster but still be safe?  I would assume that
any major patches that increase performance with write barriers
without being dangerous for your data would get back ported by RH as
usual.

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Josh Berkus
Date:
> Why would it be off limits?  Is it likely to lose data due to power failure etc?

If fsyncs are taking 5X as long, people can't use PostgreSQL on that
platform.

> Are you referring to improvements due to write barrier support getting
> fixed up for ext4 to run faster but still be safe?  I would assume that
> any major patches that increase performance with write barriers
> without being dangerous for your data would get back ported by RH as
> usual.

Hopefully, yes.  I wouldn't mind confirmation of this, though; it
wouldn't be the first time RH shipped with known-bad IO performance.

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Andres Freund
Date:
On Friday 05 November 2010 21:15:20 Josh Berkus wrote:
> All,
>
> Domas (of Facebook/Wikipedia, MySQL geek) pointed me to this report:
>
> http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1

I guess that's the O_DSYNC thingy.  See the "Defaulting wal_sync_method to
fdatasync on Linux for 9.1?" (performance) and "Revert default wal_sync_method
to fdatasync on Linux 2.6.33+" threads on hackers.

O_DSYNC finally got properly implemented on Linux with 2.6.33 (and thus
2.6.32-rc1).

> http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=6

That one looks pretty uninteresting.  Barriers are slower than no barriers.  No
surprise there.

Andres
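
A minimal sketch, not from the thread, of what the wal_sync_method
choices mentioned above compare.  Loosely in the spirit of PostgreSQL's
pg_test_fsync utility, it times an 8 kB write made durable by fsync, by
fdatasync, or by writing through a descriptor opened with O_DSYNC.  The
function names, block size and write count are assumptions chosen for
illustration.  Results depend heavily on kernel, file system and
hardware, and the temporary file may land on tmpfs, where syncing is
nearly free, so set TMPDIR to a directory on the database volume before
reading anything into the numbers.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # one 8 kB block (assumed size, matches PostgreSQL's default page)


def time_sync_writes(sync_fn, extra_flags=0, writes=50):
    """Return average seconds per write of BLOCK.

    sync_fn is called on the descriptor after each write; pass None when
    the open flags (for example O_DSYNC) already make each write synchronous.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    fd = os.open(path, os.O_WRONLY | extra_flags)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, BLOCK)
            if sync_fn is not None:
                sync_fn(fd)
        return (time.perf_counter() - start) / writes
    finally:
        os.close(fd)
        os.unlink(path)


def compare_sync_methods(writes=50):
    """Time each sync method that this platform exposes."""
    results = {"fsync": time_sync_writes(os.fsync, writes=writes)}
    if hasattr(os, "fdatasync"):
        results["fdatasync"] = time_sync_writes(os.fdatasync, writes=writes)
    if hasattr(os, "O_DSYNC"):
        results["open_datasync"] = time_sync_writes(None, os.O_DSYNC, writes=writes)
    return results


if __name__ == "__main__":
    for name, secs in compare_sync_methods().items():
        print(f"{name:14s} {secs * 1e6:12.1f} us per write")
```

If one method is several times slower than the others on a given
kernel, that is the kind of gap the threads cited above are about.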

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Scott Marlowe
Date:
On Fri, Nov 5, 2010 at 2:32 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Why would it be off limits?  Is it likely to lose data due to power failure etc?
>
> If fsyncs are taking 5X as long, people can't use PostgreSQL on that
> platform.

I was under the impression that from 2.6.28 through 2.6.31 or so the
Linux kernel just forgot how to fsync, and they turned it back on in
2.6.32, and that's why we saw the big slowdown.  Dropping from
thousands of transactions per second to 150 to 175 seems a reasonable
change when that happens.

>> Are you referring to improvements due to write barrier support getting
>> fixed up for ext4 to run faster but still be safe?  I would assume that
>> any major patches that increase performance with write barriers
>> without being dangerous for your data would get back ported by RH as
>> usual.
>
> Hopefully, yes.  I wouldn't mind confirmation of this, though; it
> wouldn't be the first time RH shipped with known-bad IO performance.

True, very true.  I will say that with my 2.6.32-based Ubuntu 10.04.1
LTS servers, running pgsql on an LSI 8888 controller can pull off 7500
tps quite easily.  And quite safely, having survived power-off tests
quite well.  That's on ext3 though.  I haven't tested them with ext4,
as when I set them up I still didn't consider it stable enough for
production.
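
A back-of-the-envelope check, not from the thread, on why 150 to 175
tps is a plausible figure once fsync is honest.  It assumes a single
client, a rotating disk with no battery-backed write cache, and that
each commit has to wait about one full platter rotation before its WAL
write is on disk.  Under those assumptions the commit rate is bounded
by the rotation rate; the function name is hypothetical.

```python
def max_commits_per_second(rpm):
    """Upper bound on serial commits/s if each commit waits one full rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000, 15000):
        bound = max_commits_per_second(rpm)
        print(f"{rpm:6d} rpm: at most {bound:6.1f} commits/s per client")
```

A 10,000 rpm drive comes out at roughly 167 commits per second, inside
the range quoted above.  Rates in the thousands on a single rotating
disk therefore suggest that something between PostgreSQL and the
platter is acknowledging writes early, whether safely (a battery-backed
controller cache) or not.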

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Greg Smith
Date:
Josh Berkus wrote:
> Domas (of Facebook/Wikipedia, MySQL geek) pointed me to this report:
>
> http://www.phoronix.com/scan.php?page=article&item=linux_perf_regressions&num=1
> http://www.phoronix.com/scan.php?page=article&item=ext4_then_now&num=6
>
The main change here was discussed back in January:

http://archives.postgresql.org/message-id/4B512D0D.4030909@2ndquadrant.com

What I've been doing about this is the writing leading up to
http://wiki.postgresql.org/wiki/Reliable_Writes so that when RHEL6 does
ship, we have a place to point people toward that makes it better
documented that the main difference here is a reliability improvement
rather than a performance regression.  I'm not sure what else we can do
here, other than organizing more testing for kernel bugs in this area on
RHEL6.  The only way to regain the majority of the "lost" performance
here is to turn off synchronous_commit in the default config.
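
A rough illustration, not from the thread, of why turning off
synchronous_commit recovers most of the throughput: commits no longer
wait for their own flush, so many commits share one flush, at the cost
of possibly losing the most recent transactions in a crash.  The sketch
below is only an analogy and not how PostgreSQL implements it.  It
appends fixed-size records to a file and calls fsync after every N
records; the record size, function name and batch sizes are assumptions
made for illustration.

```python
import os
import tempfile
import time

RECORD = b"x" * 128  # hypothetical fixed-size log record


def write_log(records, flush_every):
    """Append records to a temp file, fsyncing after every flush_every records.

    Returns (elapsed_seconds, fsync_calls).
    """
    fd, path = tempfile.mkstemp()
    fsyncs = 0
    try:
        start = time.perf_counter()
        for i in range(1, records + 1):
            os.write(fd, RECORD)
            if i % flush_every == 0:
                os.fsync(fd)
                fsyncs += 1
        if records % flush_every != 0:
            os.fsync(fd)  # final flush for the trailing partial batch
            fsyncs += 1
        return time.perf_counter() - start, fsyncs
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    for flush_every in (1, 10, 100):
        elapsed, fsyncs = write_log(200, flush_every)
        print(f"flush every {flush_every:3d} records: "
              f"{fsyncs:3d} fsyncs, {elapsed * 1000:8.2f} ms")
```

On storage where fsync really reaches the disk, elapsed time tracks the
number of fsync calls far more closely than the number of records
written.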

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Josh Berkus
Date:
> The main change here was discussed back in January:
> http://archives.postgresql.org/message-id/4B512D0D.4030909@2ndquadrant.com
>
> What I've been doing about this is the writing leading up to
> http://wiki.postgresql.org/wiki/Reliable_Writes so that when RHEL6 does
> ship, we have a place to point people toward that makes it better
> documented that the main difference here is a reliability improvement
> rather than a performance regression.  I'm not sure what else we can do
> here, other than organizing more testing for kernel bugs in this area on
> RHEL6.  The only way to regain the majority of the "lost" performance
> here is to turn off synchronous_commit in the default config.

Yeah, I was looking at that.  However, there seem to be some
indications that there was a drop in performance specifically in 2.6.32
which went beyond fixing the reliability:

http://www.phoronix.com/scan.php?page=article&item=linux_2636_btrfs&num=1

However, Phoronix doesn't say what sync option they're using; quite
likely it's O_DSYNC.  Unfortunately, the fact that users now need to be
aware of wal_sync_method again, after having it set automatically for
them for the last 4 years, is a usability regression for *us*.  Is there
anything that's reasonable for us to do about it?

Re: Major Linux performance regression; shouldn't we be worried about RHEL6?

From
Scott Carey
Date:
On Nov 5, 2010, at 1:19 PM, Josh Berkus wrote:

>
>> The serious problems with this appear to be (a) that Linux/Ext4 PG
>> performance still hasn't fully recovered, and, (b) that RHEL6 is set to
>> ship with kernel 2.6.32, which means that we'll have a whole generation
>> of RHEL which is off-limits to PostgreSQL.
>
> Oh.  Found some other information on the issue.  Looks like the problem
> is fixed in later kernels.  So the only real issue is: is RHEL6 shipping
> with 2.6.32?
>

No, RHEL 6 is not on any specific upstream kernel version.  It's
2.6.32++, with many changes from .33 to .35 also in there.  You can
probably assume that ALL ext4 changes are in there (since Red Hat
develops and contributes that).  Plus, it will have several features
that Red Hat has not gotten pushed upstream completely yet, such as the
automatic huge pages stuff.

The likelihood that file system related fixes from the .33 to .35 range
did not make RHEL 6 is very low.

