Discussion: request for tuning suggestions from PC Week Labs

request for tuning suggestions from PC Week Labs

From: Timothy Dyck
Date:

(This was originally posted on the Admin list where it was suggested it
should be posted to pgsql-hackers.)

Hi all, I'm the database analyst for PC Week and am comparing
PostgreSQL with Inprise's InterBase (which will be open sourced later this
year). I wrote the features list that Marc Fournier is circulating around
the list right now.

As part of this project, I'm running a benchmark with a mix of OLTP
and DSS queries, combined in various proportions and run at a variety of
user loads (up to 100 concurrent users). I'd like your tuning suggestions
for the engine to make sure I'm not missing anything.

My test server is a departmental-type machine with two Pentium III 450MHz
CPUs with 512MB of RAM running RedHat Linux 6.1. PGDATA is pointing to a
RAID 5 array and the database is ~40 MB of data before indexing, and so
will fit entirely into the db cache. The OS swapfile is not used at all. I
will be using ODBC to query the database. I compiled with -m486 and am
using a page size of 2KB instead of 8KB as the benchmark is mostly
OLTP-type queries.
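(For anyone wanting to reproduce the 2KB-page build: the page size is the
compile-time constant BLCKSZ. The file location and option spellings below
are a sketch for PostgreSQL sources of this era and vary by version, so
check your tree before following it.)

```shell
# Sketch only: edit BLCKSZ from 8192 to 2048 before configuring.
$EDITOR src/include/config.h.in   # set: #define BLCKSZ 2048
CFLAGS="-m486 -O2" ./configure
make && make install
initdb                            # a changed BLCKSZ needs a fresh initdb
```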

1. The biggest performance item I've seen in looking through the mailing
lists is the fsync option. I want to leave this enabled as I don't think a
transactional database should ever lose data. My understanding is that
with it on PG checkpoints after every commit. Is there a way to let the
log grow to a certain size before checkpointing? When fsync is off, how is
data loss possible?

2. Can I move the log to a different spindle from the disks the
database data is on? The manuals seem to indicate the log is actually
part of the datafile itself, which would imply it can't be moved
elsewhere.

3. Any other suggestions are much appreciated.

Regards,
Tim Dyck
Senior Analyst, PC Week Labs
timothy_dyck@zd.com
519-746-4241




Re: [HACKERS] request for tuning suggestions from PC Week Labs

From: Tom Lane
Date:

Timothy Dyck <Timothy_Dyck@zd.com> writes:
> 1. The biggest performance item I've seen in looking through the mailing
> lists is the fsync option. I want to leave this enabled as I don't think a
> transactional database should ever lose data. My understanding is that
> with it on PG checkpoints after every commit. Is there a way to let the
> log grow to a certain size before checkpointing? When fsync is off, how is
> data loss possible?

With fsync on, pgsql does fsync() after every write, which essentially
means you get zero overlap of computation and I/O.  Horribly
inefficient.

With fsync off, we don't do the fsync() call.  The data is still
pushed out to the Unix kernel at the same times, but the kernel's disk
scheduler has discretion about what order the disk pages actually get
sent to disk in.  Also, you get fewer physical writes when several
successive transactions modify the same disk page.  On most Unixes this
makes for a vast performance improvement.
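The effect can be seen outside Postgres with a quick shell experiment.
This assumes GNU dd, whose oflag=dsync forces each block to disk before
the next write starts, roughly the cost of an fsync() after every write:

```shell
# Compare buffered writes against per-write synchronous writes.
tmp=$(mktemp -d)

# Buffered: the kernel's disk scheduler decides when pages hit disk.
dd if=/dev/zero of="$tmp/buffered" bs=8k count=128 2>/dev/null

# Synchronous: every 8KB block is forced to disk before dd continues.
dd if=/dev/zero of="$tmp/synced" bs=8k count=128 oflag=dsync 2>/dev/null

buffered_size=$(stat -c %s "$tmp/buffered")
synced_size=$(stat -c %s "$tmp/synced")
echo "buffered=$buffered_size synced=$synced_size"
```

Wrap each dd in `time` and the second run is dramatically slower on most
systems, even though both write identical data: the fsync trade-off in
miniature.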

The risk scenario here is that the pg_log update saying that your
transaction has committed might get physically written out before
the data pages that contain the actual tuples do.  We write the
pg_log page last, of course, but the kernel might reorder the physical
writes.

If the pg_log update gets written, and some but not all of the updated
data pages have been written, and you suffer a system crash, then after
reboot it appears that some but not all of the changes made by your
transaction have "stuck".  That counts as data corruption for most
applications.

Note that I'm talking about an actual system crash: power failure,
hardware failure, or kernel failure.  A crash of the Postgres backend
does *not* create this hazard.  Also note that a crash does not create
a corruption hazard unless pg_log says that the incomplete transaction
committed.

My feeling is that if you have a UPS and a reliable kernel, there is
no meaningful reliability benefit from keeping fsync on --- certainly
not enough to justify the performance hit.  The above risk analysis
ignores non-software risk issues, when in fact there are big risks
at the hardware level.  One of the more obvious ones is that modern
disk drives do a certain amount of traffic reordering themselves.
If your drive acts like that, I don't see that fsync buys anything
at all, except perhaps protection against kernel crashes.  My experience
(on HPUX) is that the kernel's MTBF is little worse than the disk drive's,
so I don't use fsync.  YMMV.
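(For reference, in Postgres of this vintage the switch is made at server
start by passing -F through to each backend; the exact spelling below is a
sketch, so verify it against your version's postmaster man page.)

```shell
# -o passes the following option string to every backend;
# -F disables the per-write fsync() calls.
postmaster -o -F -D /usr/local/pgsql/data
```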

> 2. Can I move the log to a different spindle from the disks the
> database data is on? The manuals seem to indicate the log is actually
> part of the datafile itself, which would imply it can't be moved
> elsewhere.

You could move pg_log to a different drive but it probably wouldn't buy
much.  pg_log only contains a commit/no commit flag for each
transaction, not copies of data, so there's not that much traffic there.

People have reported seeing wins from moving indexes on big tables to
separate drives.  We don't currently have any nice neat GUI for that,
but you can kluge it with a few symbolic links.  Better methods are
under discussion...
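A minimal illustration of the symlink kluge, with temporary directories
standing in for $PGDATA/base/<dbname> and the second drive (with a real
database you would stop the postmaster first and move the index's actual
file; the filename here is hypothetical):

```shell
# Stand-ins for the database directory and a directory on the other drive.
pgdata=$(mktemp -d)
fastdisk=$(mktemp -d)

# Pretend this file holds the on-disk pages of an index.
echo "index pages" > "$pgdata/mytable_pkey"

# Move it to the other spindle and leave a symlink behind; the backend
# keeps opening the old path and the kernel follows the link.
mv "$pgdata/mytable_pkey" "$fastdisk/mytable_pkey"
ln -s "$fastdisk/mytable_pkey" "$pgdata/mytable_pkey"

readlink "$pgdata/mytable_pkey"
cat "$pgdata/mytable_pkey"
```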
        regards, tom lane