Обсуждение: best practices for separating data and logs

Поиск
Список
Период
Сортировка

best practices for separating data and logs

От
"Peter Koczan"
Дата:
Hi all,

I'm planning a lot of changes for migrating to PostgreSQL 8.3, among
them being a better way of separating data and logs (transaction logs,
that is).

Currently, the OS and log data are on one disk system, and the data
(including configs) are on the other disk system. After creating the
database cluster, I copy the pg_xlog directory to the OS system and
symlink it from the database.

So, I'm wondering...

- Are there any best practices, or better practices, than symlinking?

- How do other people have this set up, or recommend setting this up
(e.g. also moving pg_clog or other things as well)?

I searched through the archives and found a few threads regarding
separating data and WAL, but nothing regarding best practices or
specifics.

Thanks,
Peter

P.S. Here are exact details on how I have things set up.

[root@mitchell testing-8.2]# pwd
/scratch.1/postgres/testing-8.2

[root@mitchell testing-8.2]# ls -l
total 48
drwx------ 8 postgres postgres    83 Oct  8 16:57 base
drwx------ 2 postgres postgres  4096 Oct 12 05:07 global
drwx------ 2 postgres postgres    94 Oct  9 14:28 pg_clog
-rw------- 1 postgres postgres  3841 Aug 28 14:16 pg_hba.conf
-rw------- 1 postgres postgres  1460 Aug  8 14:06 pg_ident.conf
drwx------ 4 postgres postgres    34 Aug  8 14:06 pg_multixact
drwx------ 2 postgres postgres    17 Oct 12 00:15 pg_subtrans
drwx------ 2 postgres postgres     6 Aug  8 14:06 pg_tblspc
drwx------ 2 postgres postgres     6 Aug  8 14:06 pg_twophase
-rw------- 1 postgres postgres     4 Aug  8 14:06 PG_VERSION
lrwxrwxrwx 1 root     root        37 Oct  8 16:18 pg_xlog ->
/scratch/postgres/testing-8.2/pg_xlog
-rwxr-xr-x 1 postgres postgres 15212 Oct 11 15:05 postgresql.conf
-rw------- 1 postgres postgres    96 Oct 11 15:55 postmaster.opts
-rw------- 1 postgres postgres    57 Oct 11 15:55 postmaster.pid
-rw-r--r-- 1 postgres postgres  1805 Aug  8 14:06 server.crt
-rw------- 1 postgres postgres  1675 Aug  8 14:06 server.key

[root@mitchell testing-8.2]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1              1019208    434264    532336  45% /
/dev/sda8             56394768    348344  53135448   1% /scratch
/dev/sda6              1019208     34664    931936   4% /tmp
/dev/sda3              4061572    202080   3649844   6% /var
/dev/sda7              1019208    701972    264628  73% /var/vice/cache
/dev/sda2             10154020   5616776   4013124  59% /usr
tmpfs                   451556         0    451556   0% /dev/shm
/dev/sdb1             78110004   9589792  68520212  13% /scratch.1
AFS                    9000000         0   9000000   0% /afs

Re: best practices for separating data and logs

От
Tom Lane
Дата:
"Peter Koczan" <pjkoczan@gmail.com> writes:
> I'm planning a lot of changes for migrating to PostgreSQL 8.3, among
> them being a better way of separating data and logs (transaction logs,
> that is).

> Currently, the OS and log data are on one disk system, and the data
> (including configs) are on the other disk system. After creating the
> database cluster, I copy the pg_xlog directory to the OS system and
> symlink it from the database.

> So, I'm wondering...

> - Are there any best practices, or better practices, than symlinking?

I believe 8.3's initdb has an explicit option for making pg_xlog be a
symlink to someplace.  The results aren't different from doing it
manually, but it saves a step (and a chance for mistake).

> - How do other people have this set up, or recommend setting this up
> (e.g. also moving pg_clog or other things as well)?

I think consensus is that pg_clog is best treated as part of the data.
The point of moving xlog is that the heads on that drive will never have
to move away from the current xlog tip; as soon as you put something
else on that drive, you ruin the performance benefit.

            regards, tom lane

Re: best practices for separating data and logs

От
Chris Browne
Дата:
tgl@sss.pgh.pa.us (Tom Lane) writes:
> "Peter Koczan" <pjkoczan@gmail.com> writes:
>> I'm planning a lot of changes for migrating to PostgreSQL 8.3, among
>> them being a better way of separating data and logs (transaction logs,
>> that is).
>
>> Currently, the OS and log data are on one disk system, and the data
>> (including configs) are on the other disk system. After creating the
>> database cluster, I copy the pg_xlog directory to the OS system and
>> symlink it from the database.
>
>> So, I'm wondering...
>
>> - Are there any best practices, or better practices, than symlinking?
>
> I believe 8.3's initdb has an explicit option for making pg_xlog be a
> symlink to someplace.  The results aren't different from doing it
> manually, but it saves a step (and a chance for mistake).

Yes, indeed.

  -X, --xlogdir=XLOGDIR     location for the transaction log directory

I had not been aware of this one; this seems like a nice feature to
add support for in an init script...

We've been using an init script that offers a whole bunch of options,
originally due to Drew Hammond.

These days, the actions offered are thus:
  [start|reload|stop|env|mkdir|initdb|logtail]

start/stop/reload are pretty traditional.  The other options are
pretty interesting, particularly for cases where you might want to:

 - Frequently create databases from scratch, as when testing
   CVS HEAD
 - Alternatively, to help establish common policies, for the "less
   frequent" cases.

  env:   Sets up PATH, MAN_PATH, PGPORT with the values used by
         the backend in "this" init file

  mkdir: Sets up all the directories required both for DB backend and
         for logging

  logtail:  runs "tail -f" on the last log file for the cluster

  initdb:  Runs initdb, pointing at particular directories, and with
         particular configuration policy.

         I have recently augmented this by making it smart enough to
         rewrite the postgresql.conf file (using sed) to establish
         default values for a dozen or so options that tend to get
         customized with fairly common values.

         In effect, the entire cluster configuration gets set up in
         about a 10-line section near the top of the script.

         Adding in an option to redirect pg_xlog seems like a mighty
         fine idea; I know that on various occasions, I have had the
         irritation of building clusters and having to go to some
         fragile manual effort to shift pg_xlog somewhere else.

         Automating THAT seems like it's a "no-brainer" as far as
         being an excellent idea...

I probably ought to ask around for permission to release this; it
seems like it's probably useful enough (I have been using it a lot)
that it ought to be in a CVS repository somewhere, rather than
languishing on my desktop.
--
(format nil "~S@~S" "cbbrowne" "linuxfinances.info")
http://linuxdatabases.info/info/linuxxian.html
I am not a Church numeral!
I am a free variable!

Re: best practices for separating data and logs

От
Alvaro Herrera
Дата:
Chris Browne wrote:

>  - Alternatively, to help establish common policies, for the "less
>    frequent" cases.
>
>   env:   Sets up PATH, MAN_PATH, PGPORT with the values used by
>          the backend in "this" init file

How does this work?  I have my own script to do things, and one of the
painful bits is setting up the environment.  What my script does is emit
the var definitions to stdout, so I have to run the script on `` to let
the current shell get the definitions.

>   logtail:  runs "tail -f" on the last log file for the cluster

Hmm, interesting.  Perhaps I oughta implement this.  Currently my script
starts the server by launching the postmaster in foreground, so there is
no need for this -- but I have frequently wished for an option to start
it in background, and be able to get at the log separately.

I use this script so frequently that I have also implemented bash
completion for it.  It is truly helpful.  (Of course, it deals with
multiple installations by getting the one to use as an argument, and
autocompletes that as well).

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.