Re: Industrial-Strength Logging

Поиск
Список
Период
Сортировка
От Giles Lean
Тема Re: Industrial-Strength Logging
Дата
Msg-id 16919.960037174@nemeton.com.au
обсуждение исходный текст
Ответ на Industrial-Strength Logging  (Tim Holloway <mtsinc@southeast.net>)
Ответы Re: Re: Industrial-Strength Logging  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
------- Blind-Carbon-Copy

To: pgsql-hackers@postgresql.org
Subject: Re: Industrial-Strength Logging
In-reply-to: <Pine.LNX.4.21.0006021547090.406-100000@localhost.localdomain>
Date: Sat, 03 Jun 2000 22:59:34 +1000
Message-ID: <16919.960037174@nemeton.com.au>
From: Giles Lean <giles@nemeton.com.au>


On Sat, 3 Jun 2000 01:48:33 +0200 (CEST)  Peter Eisentraut wrote:

> Yeah, let's have another logging discussion... :)

Mmm, seems popular.  There was a mention on -ports and -general a
couple of weeks ago, and here we are (were) on -patches.  I'm moving
this discussion to -hackers (hope that's a good choice) since that is
where Tim Holloway's proposals were discussed late last year.

A start point I found in the archives for Tim's proposal is:
http://www.postgresql.org/mhonarc/pgsql-hackers/1999-10/msg00747.html

I'm not proposing anything that advanced. In particular, I'm not
discussing the -content- of log messages at all.  For now it would be
nice to see the logging mechanism improved; changing or improving the
content can be another project.

I don't discuss the current logging implementation except to note that
the backend postgres processes' logging depends on whether the process
is running under postmaster or not, has a controlling terminal or not,
whether a -o option was provided, and whether postgres was compiled to
use syslog.  Maybe that functionality can be simplified a bit ... ;-)

One more thing I don't discuss is how the debug log level is set.
Certainly something more sophisticated and dynamically variable than
the current command line method would be nice, but that too can be a
discussion for another day; it isn't much related to -how- the error
messages are tucked away.

Typical logging methods
=======================

(a)(i)  write to standard error with redirection to a file

    Pro:
    - what the code (mostly) does currently
    - very easy to set up, just redirect standard error at startup
    - efficient and low overhead

    Con:
    - can't rotate log files
    - problematic when there is an I/O error or the filesystem the log
      file is on fills up

(a)(ii) write to standard error, with standard error piped to another
        process

    Pro:
    - administrator chooses between (i) and (ii) and can change this
      via shutdown and restart, no recompilation needed
    - no code changes to backend programs
    - clean separation of functionality
    - choice of backend logging programs
      o Bernstein's logtools
      o Apache's rotatelogs
      o swatch
      o logsurfer
      o ...

    Con:
    - backend can block if the logging process is not reading log
      messages fast enough (can the backends generate enough data for
      this to be a problem in practice?)
    - reliability of message logging is dependent on the log
      process
    - log messages can be lost if the log process aborts, or is not
      started (solution: make portmaster responsible for starting and
      restartin the log process)

(b) write to named log file(s)

    One way to allow rotation of log files is for the backend
    processes to know what log files they write to, and to have them
    open them directly without shell redirection.  There is some
    support for this with the postgres -o option, but no support
    for rotating these files that I have seen so far.

    In the simplest case, the backend processes open the log file at
    when they start and close it when they exit.  This allows rotation
    of the log file by moving it and waiting for all the currently
    running backend processes to finish.

    Pro:
    - relatively simple code change
    - still efficient and low overhead

    Con:
    - backend processes can run for some time, and postmaster runs
      indefinitely, so at least postmaster needs to know about log
      file rotation
    - doesn't help much for I/O errors or full filesystem

    To address these limitations some applications open their log file
    for each message and then close it afterward:

    Pro:
    - nothing holds the log file open for long
    - still efficient and low overhead for the actual writing the log
      file

    Con:
    - all error logging has to be via a log routine.  This would be
      elog(), but there is some use of fprintf(stderr, ...) around the
      place that would want to be changed

    - there will be some efficiency hit for the open() and close()
      calls. This won't be -too- bad since the operating system's
      inode cache (or local equivalent) should contain an entry for
      the log file, but it is still two more system calls.

    Another way to handle rotation with long running processes is to
    signal them to re-open their log file, like syslogd is managed:

    Pro:
    - it's a solution

    Con:
    - more code in the backend processes
    - more communication with the backend processes
    - more complication

(c) log via some logging facility such as syslogd

    Pro:
    - people know syslogd
    - syslogd "looks like" the right answer

    Con:
    - messages are sent to it typically via UDP, so it's easy for them
      to get lost
    - syslogd is not coded robustly, and can hang, stop listening to
      input sources, stop writing to output files, and put wrong
      timestamps on messages
    - using non-UDP transport to syslogd is non-portable (some systems
      allow Unix domain sockets, some named pipes, some neither)
    - syslogd can be hard to secure (e.g. to stop it listening on
      particular network interfaces)
    - do you know what your syslog(3) routine does when it gets a
      write(2) error?
    - it is not supported by all (any?) vendors to have syslogd write
      to a pipe, so on-the-fly processing of error messages is not
      possible; they have to be written to the filesystem
    - Unix specific(?)

(d) log into the database

    Pro:
    - we've got a nice tool, let's use it

    Con:
    - chicken-and-egg problem of logging serious/fatal errors
    - infinite recursion problem of logging messages that cause more errors
    - error messages are predominantly text messages, very susceptible
      to processing with perl, grep and friends.  It isn't crystal
      clear that having them inside the database helps a lot
    - using the database is friendly to a DBA, but quite possibly not
      to the system adminstrator who very possibly knows no SQL but
      who (on a Unix system, at least) has many text log files to look
      after
    - postmaster doesn't access tables according to the previous
      discussion

Recommendations
===============

From here on it's definitely all IMHO!  Your mileage may vary.

To restate the methods I discussed:

    (a)(i)  standard error to file
       (ii) standard error piped to a process
    (b) named log file(s)
    (c) syslogd
    (d) database

I would recommend (a)(ii), with (a)(i) available for anyone who wants
it.  (Someone who has high load 9-5 but who can shut down daily might
be happy writing directly to a log file, for example.)

Using the database is too complex/fragile, using syslogd fails too
often, and having every process involved manage log files as in (b)
seems like more work than is necessary unless/until logging via a
single process turns out to be a bottleneck.

There has been a suggestion to use the Apache "rotatelogs" program to
do the logging.  That program is perhaps a little -too- simplistic:

- - no timestamp option
- - no way to make ad-hoc requests for log file rotation
- - write errors cause it to exit (... oops!)
- - no fsync option

On the other hand the program is only 100 lines.  Writing a robust
version isn't rocket science, and if postmaster is taught to start it,
then pg_ctl looks like a natural place to control log file rotation
and possibly one day log levels as well.

Regards,

Giles

P.S. Yes, I've some coding time to offer.  I'll wait and see what is
liked before I do anything though. ;-)





------- End of Blind-Carbon-Copy

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tatsuo Ishii
Дата:
Сообщение: Re: [GENERAL] PostgreSQL v7.0.1 Released
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: New warning code for missing FROM relations