Обсуждение: logging full queries separately

Поиск
Список
Период
Сортировка

logging full queries separately

От
Charles Sprickman
Дата:
Howdy,

I've been finding that keeping full query logs is quite helpful; I started doing this to be able to run pgbadger each
dayto get a nice overview of what's going on with the db servers.  The one huge downside to this is that the logs are
sonoisy, it's hard to track down errors and stats real-time since the query logging is so noisy. 

Looking at the manual, I don't see any creative solutions to split the query logs off from the rest of the logs.  Is
thereanything I've overlooked to take care of this? 

Right now I'm considering just switching to syslog-ng or rsyslog and seeing if there's a way I can parse the non-query
messagesout and dump them in another logfile…  Any pointers on that?  Without breaking compatibility with pgbadger,
whatlog_line_prefix might make it easier to pick out statements (even when they span multiple lines)? 

Thanks,

Charles

Re: logging full queries separately

От
"ktm@rice.edu"
Дата:
On Sat, Feb 23, 2013 at 12:55:07AM -0500, Charles Sprickman wrote:
> Howdy,
>
> I've been finding that keeping full query logs is quite helpful; I started doing this to be able to run pgbadger each
dayto get a nice overview of what's going on with the db servers.  The one huge downside to this is that the logs are
sonoisy, it's hard to track down errors and stats real-time since the query logging is so noisy. 
>
> Looking at the manual, I don't see any creative solutions to split the query logs off from the rest of the logs.  Is
thereanything I've overlooked to take care of this? 
>
> Right now I'm considering just switching to syslog-ng or rsyslog and seeing if there's a way I can parse the
non-querymessages out and dump them in another logfile…  Any pointers on that?  Without breaking compatibility with
pgbadger,what log_line_prefix might make it easier to pick out statements (even when they span multiple lines)? 
>
> Thanks,
>
> Charles
>

Hi Charles,

Using syslog will work well for this. I would definitely recommend rsyslog
over syslog-ng. We have a system running both and rsyslog is more efficient
with less overhead and is less costly than syslog-ng for some functionality.

Regards,
Ken


Re: logging full queries separately

От
John Rouillard
Дата:
On Sat, Feb 23, 2013 at 12:55:07AM -0500, Charles Sprickman wrote:
>[...]
>The one huge downside to this is that the logs are so noisy, it's
>hard to track down errors and stats real-time since the query logging
>is so noisy.
> [...]
>Right now I'm considering just switching to syslog-ng or rsyslog
>and seeing if there's a way I can parse the non-query messages
>out and dump them in another logfile…  Any pointers on that?

I use SEC for log analysis and I identify query logs by looking for
the duration entry in the first line. Then pull the triplet of:

   hostname, pid and log line id (the 5 in [5-2], [5-3] etc.)

and find all the rest of the lines by matching that triplet.

>Without breaking compatibility with pgbadger, what
>log_line_prefix might make it easier to pick out
>statements (even when they span multiple lines)?

Just for reference I use:

  '%t %u@%d(%p): '

--
                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111


Re: logging full queries separately

От
Charles Sprickman
Дата:
On Feb 25, 2013, at 9:59 AM, John Rouillard wrote:

> On Sat, Feb 23, 2013 at 12:55:07AM -0500, Charles Sprickman wrote:
>> [...]
>> The one huge downside to this is that the logs are so noisy, it's
>> hard to track down errors and stats real-time since the query logging
>> is so noisy.
>> [...]
>> Right now I'm considering just switching to syslog-ng or rsyslog
>> and seeing if there's a way I can parse the non-query messages
>> out and dump them in another logfile…  Any pointers on that?
>
> I use SEC for log analysis and I identify query logs by looking for
> the duration entry in the first line. Then pull the triplet of:
>
>   hostname, pid and log line id (the 5 in [5-2], [5-3] etc.)
>
> and find all the rest of the lines by matching that triplet.

I'm having sort of the opposite problem - for pgbadger, I simply log
everything and that's easy enough.

For real-time human consumption, I want to filter out all the
statements so I'm just left with errors, periodic stats, notices of
temporary tables being created, etc.  But what really screws me up
on that match is the multi-line output:

Feb 19 05:06:16 j15 postgres[32818]: [28-2] db=collects,user=collects STATEMENT:  select count(distinct s.sub_id) as
subscribers,count(distinct ls.id) as subscriptions 
Feb 19 05:06:16 j15 postgres[32818]: [28-3]           from   list l
Feb 19 05:06:16 j15 postgres[32818]: [28-4]             join list_sub ls  on (l.list_id = ls.list_id)

First line, easy enough, if I see "STATEMENT", throw it away.

It's those next two lines.  And rsyslog is doing line-by-line
processing on filters.

I suppose my other option is to find every possible prefix
(STATEMENT, LOG, ERROR, and who knows how many others) and do a
positive match for all of those rather than trying to negate the
statements.

Does any of that make sense?

Here's a rough example in rsyslog's config format:

} else if $fromhost-ip == '10.99.88.22' then {
        local0.*                                /var/log/j15/pgsql-full
        if not $msg contains 'STATEMENT' then   /var/log/j15/pgsql-brief
}

I suspect I'm going to have to instead explicitly match everything
that's not a "STATEMENT".  Just need to figure out what the full
list is.  I think that part of the log line corresponds to "%i" in
log_line_prefix.

Thanks,

Charles

>
>> Without breaking compatibility with pgbadger, what
>> log_line_prefix might make it easier to pick out
>> statements (even when they span multiple lines)?
>
> Just for reference I use:
>
>  '%t %u@%d(%p): '
>
> --
>                 -- rouilj
>
> John Rouillard       System Administrator
> Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin



Re: logging full queries separately

От
John Rouillard
Дата:
On Wed, Feb 27, 2013 at 05:52:49AM -0500, Charles Sprickman wrote:
> On Feb 25, 2013, at 9:59 AM, John Rouillard wrote:
> > On Sat, Feb 23, 2013 at 12:55:07AM -0500, Charles Sprickman wrote:
> >> [...]
> >> The one huge downside to this is that the logs are so noisy, it's
> >> hard to track down errors and stats real-time since the query logging
> >> is so noisy.
> >> [...]
> >> Right now I'm considering just switching to syslog-ng or rsyslog
> >> and seeing if there's a way I can parse the non-query messages
> >> out and dump them in another logfile…  Any pointers on that?
> >
> > I use SEC for log analysis and I identify query logs by looking for
> > the duration entry in the first line. Then pull the triplet of:
> >
> >   hostname, pid and log line id (the 5 in [5-2], [5-3] etc.)
> >
> > and find all the rest of the lines by matching that triplet.
>
> I'm having sort of the opposite problem - for pgbadger, I simply log
> everything and that's easy enough.
>
> For real-time human consumption, I want to filter out all the
> statements so I'm just left with errors, periodic stats, notices of
> temporary tables being created, etc.  But what really screws me up
> on that match is the multi-line output:
>
> Feb 19 05:06:16 j15 postgres[32818]: [28-2] db=collects,user=collects STATEMENT:  select count(distinct s.sub_id) as
subscribers,count(distinct ls.id) as subscriptions 
> Feb 19 05:06:16 j15 postgres[32818]: [28-3]           from   list l
> Feb 19 05:06:16 j15 postgres[32818]: [28-4]             join list_sub ls  on (l.list_id = ls.list_id)
>
> First line, easy enough, if I see "STATEMENT", throw it away.
>
> It's those next two lines.  And rsyslog is doing line-by-line
> processing on filters.

Yeah rsyslog can't do that. Nothing that looks like by line and
doesn't keep state can really do that. That's why I use SEC.

> I suppose my other option is to find every possible prefix
> (STATEMENT, LOG, ERROR, and who knows how many others) and do a
> positive match for all of those rather than trying to negate the
> statements.

You could generate a set of include filters, but some of those
staments (e.g. errors) can also be multi line so you will get only the
first line and miss the rest of the detail. That may be sufficent for
your use case, but I know it wasn't for any of mine.

> Here's a rough example in rsyslog's config format:
>
> } else if $fromhost-ip == '10.99.88.22' then {
>         local0.*                                /var/log/j15/pgsql-full
>         if not $msg contains 'STATEMENT' then   /var/log/j15/pgsql-brief
> }
>
> I suspect I'm going to have to instead explicitly match everything
> that's not a "STATEMENT".  Just need to figure out what the full
> list is.  I think that part of the log line corresponds to "%i" in
> log_line_prefix.

Well in your example above the line with [28-3] is not a STATEMENT, so
you will have to get all the possible labels ERROR, WARNING, stats...
It may be easier if you throw away all lines but the first in any
series of messages. You can use something like:

   if not $msg contains '-1]' then  /dev/null

(or if you can use regexp's use: ': \[\d+-1\]') earlier in your rules
to toss all the continuation lines away. Then you have just a series
of filters that looks for the ERROR, STTEMENT, WARNING keywords that
occur on the first lines. I have not found that to be very helpful in
the past as the important info is often in the continuation lines but
YMMV.

--
                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111