Discussion: Raw devices vs. Filesystems

Raw devices vs. Filesystems

From
"Jaime Casanova"
Date:
Can you tell me (or at least guide me to a place where I can find the
answer) what are the benefits of filesystems over raw devices?

And what filesystem is the best for postgresql performance?


Re: Raw devices vs. Filesystems

From
Terry Hampton
Date:
    Hello Jaime,

    I think you're on the right track but may have gotten some
    concepts confused.

    As I remember, the original email asked if Postgres
    could be run in a "raw" mode. Another submitter
    told us that it cannot. (Did I read that correctly, everyone?)

    This means running Postgres in a raw or character mode versus
    the "standard" block mode, as you would handle other files
    in a filesystem. The primary advantage of a raw mode
    is speed. ORACLE allows this method of operation.

    So ... your last question is now rather moot. For best
    performance, ensure your setup parameters and Linux
    kernel parameters are optimized. Help with that
    is a few keystrokes from anywhere.

                Terry



Jaime Casanova wrote:
> Can you tell me (or at least guide me to a place where I can find the
> answer) what are the benefits of filesystems over raw devices?
>
> And what filesystem is the best for postgresql performance?
>


--
Terry L. Hampton
Project Manager
LimaCorp, LLC   www.limacorp.com
513.587.1874



Re: Raw devices vs. Filesystems

From
Christopher Browne
Date:
After takin a swig o' Arrakan spice grog, el_vigia_ec@hotmail.com ("Jaime Casanova") belched out:
> Can you tell me (or at least guide me to a place where I can find the
> answer) what are the benefits of filesystems over raw devices?

For PostgreSQL, filesystems have the merit that you can actually use
them.  PostgreSQL doesn't support use of "raw devices."

Two major benefits of using filesystems as opposed to raw devices are
that:

a) The use of raw devices is dramatically non-portable; you have to
   reimplement data access on every platform you are trying to
   support;

b) The use of raw devices essentially mandates that you implement
   some form of generic filesystem on top of them, which adds
   considerable complexity to your code.

Two benefits to raw devices are claimed...

c) It's faster.  But that assumes that the "cooked" filesystems are
   implemented fairly badly.  That was typically true, a dozen
   years ago, but it isn't so typical now, particularly with a
   fancy caching controller.

d) It guarantees application control of update ordering.  Of course,
   with a caching controller, or disk drives that lie to one degree
   or another, those guarantees might be gone anyways.

There are other filesystem advantages, such as

e) Shifting "cooked" data around may be as simple as a "mv," whereas
   reorganizing on raw disk requires DB-specific tools...
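
To make point (d) a little more concrete: on a cooked filesystem an application can still impose an order between two writes by forcing the first one to stable storage before it issues the second. The following is only a minimal Python sketch of that idea, assuming a POSIX-like system. The helper names `durable_append` and `ordered_update` and the file names are made up for illustration, and, as noted above, a caching controller or a drive that lies about completion can still defeat it.

```python
import os
import tempfile


def durable_append(path, data):
    """Append data to path, then ask the OS to push it to stable storage."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()              # move Python's buffer into the kernel
        os.fsync(f.fileno())   # block until the kernel reports the write as done


def ordered_update(log_path, data_path, record):
    """Write an intent record first, and the data only after the log is synced."""
    # The fsync inside durable_append acts as the ordering barrier:
    # the second write is not even issued until the first has been flushed.
    durable_append(log_path, b"intent: " + record + b"\n")
    durable_append(data_path, record + b"\n")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    ordered_update(os.path.join(workdir, "log"),
                   os.path.join(workdir, "data"),
                   b"row 1")
```

Whether the data really is on the platters when fsync returns depends on everything below the filesystem, which is exactly the caveat in (d).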

> And what filesystem is the best for postgresql performance?

That would depend, assortedly, on what OS you are using, what kind of
hardware you are running on, what kind of usage patterns you have, as
well as on how you define the notion of "best."

Absent any indication of any of those things, the best that can be
said is "that depends..."
--
(format nil "~S@~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/languages.html
TTY Message from The-XGP at MIT-AI:
The-XGP@AI 02/59/69 02:59:69
Your XGP output is startling.

Re: Raw devices vs. Filesystems

From
"Gregory S. Williamson"
Date:
No point to beating a dead horse (other than the sheer joy of the thing) since postgres does not have raw device
support, but ...

raw devices, at least on solaris, are about 10 times as fast as cooked file systems for Informix. This might still be a
gain for postgres' performance, but the portability issues remain.

raw device use in Informix is safer in terms of data because Informix does not ever have to use the regular file system
and so issues of buffering and so on go away. My understanding -- fortunately not ever tried in the real world -- is
that postgres' WAL log system is as reliable as Informix writing to raw devices.

raw devices can't be copied or tampered with with regular file tools (mv, cp etc.); this changes how backups get done
but also adds a layer of insulation between valuable data and users.

Greg Williamson
DBA
GlobeXplorer LLC
-----Original Message-----
From:    Christopher Browne [mailto:cbbrowne@acm.org]
Sent:    Mon 3/29/2004 10:28 AM
To:    pgsql-admin@postgresql.org
Cc:
Subject:    Re: [ADMIN] Raw devices vs. Filesystems

Re: Raw devices vs. Filesystems

From
Chris Browne
Date:
gsw@globexplorer.com ("Gregory S. Williamson") writes:
> No point to beating a dead horse (other than the sheer joy of the
> thing) since postgres does not have raw device support, but ...  raw
> devices, at least on solaris, are about 10 times as fast as cooked
> file systems for Informix. This might still be a gain for postgres'
> performance, but the portability issues remain.

That claim seems really rather remarkable.

It implies an entirely stunning degree of inefficiency in the
implementation of filesystems on Solaris.

The amount of indirection involved in walking through i-nodes and such
is something I would expect to introduce some percentage of
performance loss, but for it to introduce overhead of over 900%
presumably implies that Sun (and/or Veritas) got something really
horribly wrong.
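
The 900% figure follows from the "10 times as fast" claim by simple arithmetic: if the raw path takes time t and the cooked path takes 10t, the extra cost is 9t, which is 900% of the raw time. A quick check in Python (the helper name is invented for illustration):

```python
def overhead_percent(speedup):
    """Extra time of the slower path, as a percentage of the faster path's time."""
    return (speedup - 1) * 100


print(overhead_percent(10))  # 900
```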
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://www.ntlug.org/~cbbrowne/nonrdbms.html
Rules of the Evil Overlord #1. "My Legions of Terror will have helmets
with   clear    plexiglass   visors,   not    face-concealing   ones."
<http://www.eviloverlord.com/>

Re: Raw devices vs. Filesystems

From
"Gregory S. Williamson"
Date:
Remarkable, perhaps, to you. Not in the Informix world. But irrelevant to postgres, no ?

-----Original Message-----
From: Chris Browne [mailto:cbbrowne@acm.org]
Sent: Tuesday, April 06, 2004 1:57 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Raw devices vs. Filesystems



Re: Raw devices vs. Filesystems

From
"scott.marlowe"
Date:
Note that the inefficiency could well lie with Informix's file system
interfacing as easily as it could lie with the operating system.  Do they
charge extra for being able to access raw devices or somehow make more
money by supporting them?  If so, there could be a clear business case for
lots of uwaits() in the code path that handles file systems.

I'm just saying it's a possibility.

On Tue, 6 Apr 2004, Gregory S. Williamson wrote:

> Remarkable, perhaps, to you. Not in the Informix world. But irrelevant to postgres, no ?


Re: Raw devices vs. Filesystems

From
Marsh Ray
Date:
 > gsw@globexplorer.com ("Gregory S. Williamson") writes:
 >>No point to beating a dead horse (other than the sheer joy of the
 >>thing) since postgres does not have raw device support, but ...  raw
 >>devices, at least on solaris, are about 10 times as fast as cooked
 >>file systems for Informix. This might still be a gain for postgres'
 >>performance, but the portability issues remain.

 > From: Chris Browne [mailto:cbbrowne@acm.org]
 > That claim seems really rather remarkable.
 > It implies an entirely stunning degree of inefficiency in the
 > implementation of filesystems on Solaris.
 > The amount of indirection involved in walking through i-nodes and such
 > is something I would expect to introduce some percentage of
 > performance loss, but for it to introduce overhead of over 900%
 > presumably implies that Sun (and/or Veritas) got something really
 > horribly wrong.

Gregory S. Williamson wrote:
> Remarkable, perhaps, to you. Not in the Informix world. But
 > irrelevant to postgres, no ?

I too am a little surprised by those numbers, but I think the potential
for a performance gain of that order is relevant.

As I once heard someone remark: "When you show up at a pool hall talking
those kind of odds, well, people start making phone calls."

- Marsh


Re: Raw devices vs. Filesystems

From
Tom Lane
Date:
Chris Browne <cbbrowne@acm.org> writes:
> That claim seems really rather remarkable.
> It implies an entirely stunning degree of inefficiency in the
> implementation of filesystems on Solaris.

Solaris has a reputation for having stunning degrees of inefficiency
in a number of places :-(.  On the other hand I've also heard it praised
for its ability to survive partial hardware failures (eg, N out of M
CPUs down), so maybe that's the price you gotta pay.

But to get back to the point of this discussion: to allow PG to use raw
devices instead of filesystems, we'd first have to do a ton of
portability work (since raw disk access is nowhere standard), and
abandon our principle that Postgres does not run as root (since raw disk
access is not permitted to non-root processes by any sane sysadmin).
But that last is a mighty comforting principle to have, anytime someone
complains that their el cheapo whitebox PC locks up as soon as they
start to stress the database.  I know I'd have wasted a lot more time
chasing random hardware breakages if I couldn't say "system freezes and
filesystem corruption are Clearly Not Our Fault".

After that, we get to implement our own filesystem-equivalent management
of disk space allocation, disk I/O scheduling, etc.  Are we really
smarter than all those kernel hackers doing this for a living?  I doubt it.

After that, we get to re-optimize all the existing Postgres behaviors
that are designed to sit on top of a standard Unix buffering filesystem
layer.

After that, we might reap some performance benefits.  Or maybe not.
There's not a heck of a lot of hard evidence that we would --- and
what there is traces to twenty-year-old assumptions about disk drive
and OS behavior, which are quite unlikely to still apply today.

Personally, I have a lot of more-promising projects to pursue...

            regards, tom lane

Re: Raw devices vs. Filesystems

From
Grega Bremec
Date:
...and on Wed, Apr 07, 2004 at 01:26:02AM -0400, Tom Lane used the keyboard:
>
> After that, we get to implement our own filesystem-equivalent management
> of disk space allocation, disk I/O scheduling, etc.  Are we really
> smarter than all those kernel hackers doing this for a living?  I doubt it.
>
> After that, we get to re-optimize all the existing Postgres behaviors
> that are designed to sit on top of a standard Unix buffering filesystem
> layer.
>
> After that, we might reap some performance benefits.  Or maybe not.
> There's not a heck of a lot of hard evidence that we would --- and
> what there is traces to twenty-year-old assumptions about disk drive
> and OS behavior, which are quite unlikely to still apply today.
>
> Personally, I have a lot of more-promising projects to pursue...
>

Has anyone tried PostgreSQL on top of OCFS? Personally, I'm not sure it
would even work, as Oracle clearly state that OCFS was _never_ meant to
be a fully fledged UNIX filesystem with POSIX features such as correct
timestamp updates, inode changes, etc., but OCFSv2 brings some features
that might lead one into thinking they're about to make it suitable for
uses beyond that of just having Oracle databases sitting on top of it.

Furthermore, this filesystem would be a blazing one stop solution for
all replication issues PostgreSQL currently suffers from, as its main
design goal was to present "a consistent file system image across the
servers in a cluster".

Now, if both goals can be achieved in one go, hell, I'm willing to try
it out myself in an attempt to extract off of it, some performance
indicators that could be compared to other database performance tests
sent to both this and the PERFORM mailing list.

So, anyone? :)

Cheers,
--
    Grega Bremec
    Senior Administrator
    Noviforum Ltd., Software & Media
    http://www.noviforum.si/

Re: Raw devices vs. Filesystems

From
Harald Fuchs
Date:
In article <5719.1081315562@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> writes:

> But to get back to the point of this discussion: to allow PG to use raw
> devices instead of filesystems, we'd first have to do a ton of
> portability work (since raw disk access is nowhere standard), and
> abandon our principle that Postgres does not run as root (since raw disk
> access is not permitted to non-root processes by any sane sysadmin).

Why not?  In MySQL/InnoDB, you do a "chown mysql.daemon /dev/raw/raw1"
(or whatever raw disk you want to access), and that's all.
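
The chown works because a device node goes through the same ownership and permission checks as any other file: once the database's account owns the node, no root privilege is needed to open it. Below is a small Python sketch of how one might inspect a node before pointing a database at it. `describe_node` is a made-up helper, not part of any database's tooling, and /dev/raw/raw1 is simply the path from the example above.

```python
import os
import stat


def describe_node(path):
    """Report what kind of file path is, who owns it, and whether we can read and write it."""
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        kind = "block device"
    elif stat.S_ISCHR(st.st_mode):
        kind = "character device"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "kind": kind,
        "owner_uid": st.st_uid,
        # os.access checks against the real uid/gid of the calling process
        "read_write": os.access(path, os.R_OK | os.W_OK),
    }
```

Run as the database user after the chown, `describe_node("/dev/raw/raw1")` would be expected to report `read_write` as true. Whether handing out a device that way is wise is a separate question from whether it is possible.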

> After that, we get to implement our own filesystem-equivalent management
> of disk space allocation, disk I/O scheduling, etc.  Are we really
> smarter than all those kernel hackers doing this for a living?  I doubt it.

Ditto.  I don't have hard numbers for MySQL, but I didn't see any
noticeable improvement when messing with raw disks (at least under
Linux).

Re: [PERFORM] Raw devices vs. Filesystems

From
Josh Berkus
Date:
Grega,

> Furthermore, this filesystem would be a blazing one stop solution for
> all replication issues PostgreSQL currently suffers from, as its main
> design goal was to present "a consistent file system image across the
> servers in a cluster".

Does it work, though?   Without Oracle admin tools?

> Now, if both goals can be achieved in one go, hell, I'm willing to try
> it out myself in an attempt to extract off of it, some performance
> indicators that could be compared to other database performance tests
> sent to both this and the PERFORM mailing list.

Hey, any test you wanna run is fine with us.    I'm pretty sure that OCFS
belongs to Oracle, though, patent & copyright, so we couldn't actually use it
in practice.

If your intention in this test is to show the superiority of raw devices, let
me give you a reality check: barring some major corporate backing getting
involved, we can't possibly implement our own PG-FS for database support.  We
already have a TODO list which is far too long for our developer pool, and
implementing a custom FS either takes a large team (OCFS) or several years of
development (Reiser).

Now, if you know somebody who might pay for one, then great ....

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] Raw devices vs. Filesystems

From
Steve Atkins
Date:
On Wed, Apr 07, 2004 at 09:09:16AM -0700, Josh Berkus wrote:

> If your intention in this test is to show the superiority of raw devices, let
> me give you a reality check: barring some major corporate backing getting
> involved, we can't possibly implement our own PG-FS for database support.  We
> already have a TODO list which is far too long for our developer pool, and
> implementing a custom FS either takes a large team (OCFS) or several years of
> development (Reiser).

Is there any documentation as to what guarantees PostgreSQL requires
from the filesystem, or what POSIX semantics can be relaxed?

Cheers,
  Steve

Re: Raw devices vs. Filesystems

From
"Murthy Kambhampaty"
Date:
On Wednesday, April 07, 2004 1:26 AM Tom Lane wrote:

>
> But to get back to the point of this discussion: to allow PG
> to use raw devices instead of filesystems, we'd first have to do a ton of
> portability work
...

[The following is said in a low, tentative voice :) ]

I wonder if writing the postgresql data structures as HDF5 data structures (http://hdf.ncsa.uiuc.edu/whatishdf5.html)
within a single HDF5 file (perhaps the WAL files would still reside elsewhere), which might improve performance while
allowing HDF5 to handle portability and other useful features, is a better solution than relying on filesystem features.

HDF5 actually provides an added portability advantage that postgresql does not currently enjoy:
"a completely portable file format, so that a file can be written on any system and read on any other"
(See http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf).
The HDF5 "distribution" includes tools for dumping data structures, etc. so if you're hooked on filesystem level
operations, you have the ability to inspect postgresql data structures within the HDF5 file, i.e., "outside postgresql".

HDF5 is also designed for clustered/grid computing systems:
"The HDF5 format and library provide a powerful means of organizing and accessing data in a manner that allows
scientists to share, process, and manipulate data in today's heterogeneous and quickly-evolving high-performance
computational environment, including the emerging computational GRIDs."
(http://hdf.ncsa.uiuc.edu/HDF5/RD100-2002/All_About_HDF5.pdf, p. 3).
So, the main purpose of this post is to suggest that HDF5's design moves a postgresql version built on a HDF5 datastore
that much closer to being ready for cluster-computing environments, with respect to the datastore (there's still the
shared memory, etc., that need to be addressed, but ...).

We're playing with HDF5 from Python (see the pytables project) for our "analytics" work, but that requires moving data
out of postgresql. I suspect that an SQL interface to HDF5 data structures using postgresql would be a lot more
convenient, and that postgresql would gain multiple benefits from having all its data structures in a single HDF5 file.
OTOH, maybe us analytics types are better off with Python over HDF5 and "postgresql on HDF5" is not a net win for
postgresql. Still, there seems to be a great advantage to having rich data structures to operate on rather than just
"files", and allowing the HDF5 library to deal with portability, I/O efficiency, and clustering.

Hope my $0.02 worth was.

Cheers,
    Murthy

Re: [PERFORM] Raw devices vs. Filesystems

From
Grega Bremec
Date:
...and on Wed, Apr 07, 2004 at 09:09:16AM -0700, Josh Berkus used the keyboard:
>
> Does it work, though?   Without Oracle admin tools?

Hello, Josh. :)

Well, as I said, that's why I was asking - I'm willing to give it a go
if nobody can prove me wrong. :)

> > Now, if both goals can be achieved in one go, hell, I'm willing to try
> > it out myself in an attempt to extract off of it, some performance
> > indicators that could be compared to other database performance tests
> > sent to both this and the PERFORM mailing list.
>
> Hey, any test you wanna run is fine with us.    I'm pretty sure that OCFS
> belongs to Oracle, though, patent & copyright, so we couldn't actually use it
> in practice.

I thought you knew - OCFS, OCFS-Tools and OCFSv2 have not only been open-
source for quite a while now - they're released under the GPL.

    http://oss.oracle.com/projects/ocfs/
    http://oss.oracle.com/projects/ocfs-tools/
    http://oss.oracle.com/projects/ocfs2/

I don't know what that means to you (probably nothing good, as PostgreSQL
is released under the BSD license), but it most definitely can be considered
a good thing for the end user, as she can download it, compile, and set it
up on her disks, without the need to pay Oracle royalties. :)

> If your intention in this test is to show the superiority of raw devices, let
> me give you a reality check: barring some major corporate backing getting
> involved, we can't possibly implement our own PG-FS for database support.  We
> already have a TODO list which is far too long for our developer pool, and
> implementing a custom FS either takes a large team (OCFS) or several years of
> development (Reiser).

Not really - I was just thinking about something not-entirely-a-filesystem
and POK!, OCFS sprang to mind. It omits many POSIX features that slow down
a traditional filesystem, yet it does know the concept of inodes and most
of all, it's _really_ heavy on caching. As such, it sounded quite promising
to me, but trial, I think, is the best test.

The question does spring up though, that Steve raised in another post - just
for the record, what POSIX semantics can a postmaster live without in a
filesystem?

Cheers,
--
    Grega Bremec
    Senior Administrator
    Noviforum Ltd., Software & Media
    http://www.noviforum.si/

Re: [PERFORM] Raw devices vs. Filesystems

From
Josh Berkus
Date:
Grega,

> Well, as I said, that's why I was asking - I'm willing to give it a go
> if nobody can prove me wrong. :)

Why not?   If you have time?

> I thought you knew - OCFS, OCFS-Tools and OCFSv2 have not only been open-
> source for quite a while now - they're released under the GPL.

Keen!   Wonder if we can make them regret it.

Seriously, if Oracle opened this stuff, it's probably because they used some
GPL components in it.   It also probably means that it won't work for
anything but Oracle ...

> I don't know what that means to you (probably nothing good, as PostgreSQL
> is released under the BSD license),

Well, it just means that we can't ship OCFS with PostgreSQL.

> The question does spring up though, that Steve raised in another post -
> just for the record, what POSIX semantics can a postmaster live without in
> a filesystem?

You might want to ask that question again on Hackers.  I don't know the
answer, myself.

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [PERFORM] Raw devices vs. Filesystems

From
Christopher Browne
Date:
josh@agliodbs.com (Josh Berkus) wrote:
>> Well, as I said, that's why I was asking - I'm willing to give it a go
>> if nobody can prove me wrong. :)
>
> Why not?   If you have time?

True enough.

>> I thought you knew - OCFS, OCFS-Tools and OCFSv2 have not only been
>> open- source for quite a while now - they're released under the
>> GPL.
>
> Keen!  Wonder if we can make them regret it.
>
> Seriously, if Oracle opened this stuff, it's probably because they
> used some GPL components in it.  It also probably means that it
> won't work for anything but Oracle ...

It could be that the experiment shows that OCFS isn't all that
helpful.  Or that it helps cover inadequacies in certain aspects of
how Oracle accesses filesystems.

If it _does_ show that it is helpful, then that may suggest a
filesystem implementation strategy useful for the BSD folks.

The main "failure case" would be if the exercise shows that using OCFS
is pretty futile.
--
select 'cbbrowne' || '@' || 'acm.org';
http://www3.sympatico.ca/cbbrowne/linux.html
Do you know where your towel is?