Discussion: Recommended File System Configuration

Recommended File System Configuration

From
James Thornton
Date:
Back in 2001, there was a lengthy thread on the PG Hackers list about PG
and journaling file systems
(http://archives.postgresql.org/pgsql-hackers/2001-05/msg00017.php), but
there was no decisive conclusion regarding what FS to use. At the time
the fly in the XFS ointment was that deletes were slow, but this was
improved with XFS 1.1.

I think a journaling FS is needed for PG data since large DBs could
take hours to recover on a non-journaling FS, but what about WAL files?

--

  James Thornton
______________________________________________________
Internet Business Consultant, http://jamesthornton.com



Re: Recommended File System Configuration

From
Chris Browne
Date:
james@jamesthornton.com (James Thornton) writes:
> Back in 2001, there was a lengthy thread on the PG Hackers list about
> PG and journaling file systems
> (http://archives.postgresql.org/pgsql-hackers/2001-05/msg00017.php),
> but there was no decisive conclusion regarding what FS to use. At the
> time the fly in the XFS ointment was that deletes were slow, but this
> was improved with XFS 1.1.
>
> I think a journaling FS is needed for PG data since large DBs could
> take hours to recover on a non-journaling FS, but what about WAL files?

If the WAL files are on a small filesystem, it presumably won't take
hours for that filesystem to recover at fsck time.
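Whether the WAL directory really does sit on its own filesystem can be checked by comparing device IDs. The following is a minimal sketch, assuming a POSIX system; the function name and the example paths are hypothetical and should be replaced with the actual data and WAL locations.

```python
import os


def on_separate_filesystems(path_a: str, path_b: str) -> bool:
    """Return True if the two paths live on different mounted filesystems.

    Compares the st_dev field reported by stat(); two paths on the same
    mounted filesystem share the same device ID.
    """
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


# Hypothetical usage (paths are assumptions, not taken from this thread):
#   on_separate_filesystems("/var/lib/pgsql/data", "/var/lib/pgsql/data/pg_xlog")
```

If this returns False, the WAL files share a filesystem with the data, so any fsck of that filesystem covers both.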

The results have not been totally conclusive...

 - Several have found JFS to be a bit faster than anything else on
   Linux, but some data loss problems have been experienced;

 - ext2 has the significant demerit that with big filesystems, fsck
   will "take forever" to run;

 - ext3 appears to be the slowest option out there, and there are some
   stories of filesystem corruption;

 - ReiserFS was designed to be real fast with tiny files, which is not
   the ideal "use case" for PostgreSQL; the designers there are
   definitely the most aggressive at pushing out "bleeding edge" code,
   which isn't likely the ideal;

 - XFS is neither fastest nor slowest, but there has been a lack of
   reports of "spontaneous data loss" under heavy load, which is a
   good thing.  It's not part of "official 2.4" kernels, requiring
   backports, but once 2.6 gets more widely deployed, this shouldn't
   be a demerit anymore...

I think that provides a reasonable overview of what has been seen...
--
output = reverse("gro.gultn" "@" "enworbbc")
http://cbbrowne.com/info/oses.html
Donny: Are these the Nazis, Walter?
Walter: No, Donny, these men are nihilists. There's nothing to be
afraid of.  -- The Big Lebowski

Re: Recommended File System Configuration

From
James Thornton
Date:
Chris Browne wrote:

> The results have not been totally conclusive...
>
>  - Several have found JFS to be a bit faster than anything else on
>    Linux, but some data loss problems have been experienced;
>
>  - ext2 has the significant demerit that with big filesystems, fsck
>    will "take forever" to run;
>
>  - ext3 appears to be the slowest option out there, and there are some
>    stories of filesystem corruption;


In an Oracle paper entitled "Tuning an Oracle8i Database Running Linux"
(http://otn.oracle.com/oramag/webcolumns/2002/techarticles/scalzo_linux02.html),
Dr. Bert Scalzo says, "The trouble with these tests -- for example, Bonnie,
Bonnie++, Dbench, Iobench, Iozone, Mongo, and Postmark -- is that they are
basic file system throughput tests, so their results generally do not
pertain in any meaningful fashion to the way relational database systems
access data files." Instead he suggests users benchmarking filesystems
for database applications should use these two well-known and widely
accepted database benchmarks:

AS3AP (http://www.benchmarkresources.com/handbook/5.html): a scalable,
portable ANSI SQL relational database benchmark that provides a
comprehensive set of tests of database-processing power; has built-in
scalability and portability for testing a broad range of systems;
minimizes human effort in implementing and running benchmark tests; and
provides a uniform, metric, straightforward interpretation of the results.

TPC-C (http://www.tpc.org/): an online transaction processing (OLTP)
benchmark that involves a mix of five concurrent transactions of various
types that are either executed completely online or queued for deferred
execution. The database comprises nine types of tables, having a wide
range of record and population sizes. This benchmark reports throughput
in transactions per minute (tpmC).
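To make the idea of a transaction-throughput metric concrete, here is a minimal timing sketch. It is not TPC-C or AS3AP, only an illustration of the measurement; the function names are hypothetical, and `transaction` stands in for whatever unit of database work is being timed.

```python
import time
from typing import Callable


def throughput(completed: int, elapsed_seconds: float, per_seconds: float = 1.0) -> float:
    """Completed transactions per time unit (per_seconds=60.0 gives a per-minute rate)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed * per_seconds / elapsed_seconds


def run_benchmark(transaction: Callable[[], None], count: int, per_seconds: float = 1.0) -> float:
    """Run `transaction` `count` times back to back and return the observed rate."""
    start = time.perf_counter()
    for _ in range(count):
        transaction()
    elapsed = time.perf_counter() - start
    return throughput(count, elapsed, per_seconds)
```

Running the same workload against the same database on each candidate filesystem, and comparing the resulting rates, is closer to what the paper recommends than a raw file-throughput test.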

I encourage you to read the paper -- Dr. Scalzo's results will surprise
you; however, while he benchmarked ext2, ext3, ReiserFS, JFS, and RAW,
he did not include XFS.

SGI and IBM did a more detailed study on Linux filesystem performance,
which included XFS, ext2, ext3 (various modes), ReiserFS, and JFS, and
the results are presented in a paper entitled "Filesystem Performance
and Scalability in Linux 2.4.17"
(http://oss.sgi.com/projects/xfs/papers/filesystem-perf-tm.pdf). This
paper goes over the details on how to properly conduct a filesystem
benchmark and addresses scaling and load more so than Dr. Scalzo's tests.

For further study, I have compiled a list of Linux filesystem resources
at: http://jamesthornton.com/hotlist/linux-filesystems/.

--

  James Thornton
______________________________________________________
Internet Business Consultant, http://jamesthornton.com