Thread: get current log file

get current log file

From
"Armor"
Date:
Hello,

    I see that getting the current log file name is on the TODO list (for details, please check http://www.postgresql.org/message-id/Pine.GSO.4.64.0811101325260.9276@westnet.com). As it happens, we have already implemented this requirement for one of our customers.
    If the PG community still needs this feature, we would be glad to contribute it.

------------------
Jerry Yu
 

Re: get current log file

From
Alvaro Herrera
Date:
Armor wrote:
> Hello,
> 
> 
>     I find there is a new feature about getting current log file name on the TODO list (for detail please check
> http://www.postgresql.org/message-id/Pine.GSO.4.64.0811101325260.9276@westnet.com). On the other side, we finish a
> ticket to this requirement for our customer.
>     If the PG community still need this feature,  there will be a pleasure for us to make contribution.

Please propose a design and we'll discuss.  There's clearly need for
this feature.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: get current log file

From
"Armor"
Date:
As we know, the name of the current log file depends on the number of seconds since the Epoch at the moment the log file was created (for simplicity, I will call this last_syslogger_file_time below). So the key question for this feature is how the syslogger process passes last_syslogger_file_time to the backend processes.

To pass last_syslogger_file_time, we have 2 solutions:
(1) Add a global variable that records last_syslogger_file_time and is shared by the backends and the syslogger, so backends can read it very easily.
(2) Have the syslogger process send last_syslogger_file_time to the pgstat process whenever it changes, just as other auxiliary processes send stat messages to the pgstat process. The pgstat process then writes last_syslogger_file_time into the stat file, so that a backend can get it by reading this stat file.

Of these 2 solutions, we prefer the latter, because we want to keep the global variable space simpler.
In addition, we need to add a new function named pg_stat_get_log_file_name(), which returns the current log file name derived from last_syslogger_file_time and the log file name format.
If you have any questions, please let me know.
------------------
Jerry Yu
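
To make the reconstruction step in the proposal above concrete, here is a minimal Python sketch of how a caller could derive a log file name from last_syslogger_file_time and a log_filename pattern. It is an illustration only, not the proposed patch: the helper name build_log_file_name is hypothetical, the pattern is PostgreSQL's default log_filename value, and UTC is assumed where the server would apply its log_timezone setting.

```python
import time

# Default value of PostgreSQL's log_filename setting.
DEFAULT_LOG_FILENAME = "postgresql-%Y-%m-%d_%H%M%S.log"


def build_log_file_name(file_time, pattern=DEFAULT_LOG_FILENAME):
    """Expand a strftime-style pattern for the given epoch seconds.

    Hypothetical helper: UTC is assumed here so the result is
    deterministic; the server would use log_timezone instead.
    """
    return time.strftime(pattern, time.gmtime(file_time))


if __name__ == "__main__":
    # Epoch seconds at which an imaginary log file was created.
    print(build_log_file_name(1454408520))
```

The result is only correct if the caller's pattern and time zone match what the syslogger actually used when it created the file.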
 


------------------ Original ------------------
From:  "Alvaro Herrera";<alvherre@2ndquadrant.com>;
Date:  Tue, Feb 2, 2016 06:30 PM
To:  "Armor"<yupengstone@qq.com>;
Cc:  "pgsql-hackers"<pgsql-hackers@postgresql.org>;
Subject:  Re: [HACKERS] get current log file


Re: get current log file

From
Euler Taveira
Date:
On 02-02-2016 10:22, Armor wrote:
>      As we known, the name of current log file depends on the number of
> seconds (for simple, later I will call it last_syslogger_file_time)
> since Epoch when create new log file. So, for this feature, the key is
> how syslogger process pass last_syslogger_file_time to backend processes.
> 
I don't like the name. Let's call it syslogger_file_name. It describes
what the variable is (the actual file name that syslogger is writing to).

>     To pass last_syslogger_file_time, we have 2 solutions: 1, add a
> global variable to record last_syslogger_file_time which shared by
> backends and syslogger, so backends can get last_syslogger_file_time
> very easily; 2 syslogger process send last_syslogger_file_time to pgstat
> process when last_syslogger_file_time changes, just as other auxiliary
> processes send stat  message to pgstat process, and  pgstat process will
> write  last_syslogger_file_time into stat file so that backend can
> get last_syslogger_file_time via reading this stat file.
> 
I prefer (1) because (i) the logfile name is not statistics and (ii) the
stats collector might not respond in certain circumstances (and can even
discard some messages).


-- 
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
 



Re: get current log file

From
Robert Haas
Date:
On Thu, Feb 25, 2016 at 1:15 AM, Euler Taveira <euler@timbira.com.br> wrote:
> On 02-02-2016 10:22, Armor wrote:
>>      As we known, the name of current log file depends on the number of
>> seconds (for simple, later I will call it last_syslogger_file_time)
>> since Epoch when create new log file. So, for this feature, the key is
>> how syslogger process pass last_syslogger_file_time to backend processes.
>>
> I didn't like the name. Let's call it syslogger_file_name. It describes
> what the variable is (actual file name that syslogger is writing on).
>
>>     To pass last_syslogger_file_time, we have 2 solutions: 1, add a
>> global variable to record last_syslogger_file_time which shared by
>> backends and syslogger, so backends can get last_syslogger_file_time
>> very easily; 2 syslogger process send last_syslogger_file_time to pgstat
>> process when last_syslogger_file_time changes, just as other auxiliary
>> processes send stat  message to pgstat process, and  pgstat process will
>> write  last_syslogger_file_time into stat file so that backend can
>> get last_syslogger_file_time via reading this stat file.
>>
> I prefer (1) because (i) logfile name is not statistics and (ii) stats
> collector could not respond in certain circumstances (and even discard
> some messages).

(1) seems like a bad idea, because IIUC, the syslogger process doesn't
currently touch shared memory.  And in fact, shared memory can be
reset after a backend exits abnormally, but the syslogger (alone among
all PostgreSQL processes other than the postmaster) lasts across
multiple such resets.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: get current log file

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Feb 25, 2016 at 1:15 AM, Euler Taveira <euler@timbira.com.br> wrote:
>>> To pass last_syslogger_file_time, we have 2 solutions: 1, add a
>>> global variable to record last_syslogger_file_time which shared by
>>> backends and syslogger, so backends can get last_syslogger_file_time
>>> very easily; 2 syslogger process send last_syslogger_file_time to pgstat
>>> process when last_syslogger_file_time changes, just as other auxiliary
>>> processes send stat  message to pgstat process, and  pgstat process will
>>> write  last_syslogger_file_time into stat file so that backend can
>>> get last_syslogger_file_time via reading this stat file.

>> I prefer (1) because (i) logfile name is not statistics and (ii) stats
>> collector could not respond in certain circumstances (and even discard
>> some messages).

> (1) seems like a bad idea, because IIUC, the syslogger process doesn't
> currently touch shared memory.  And in fact, shared memory can be
> reset after a backend exits abnormally, but the syslogger (alone among
> all PostgreSQL processes other than the postmaster) lasts across
> multiple such resets.

Yes, allowing the syslogger to depend on shared memory is right out.
I don't particularly care for having it assume the stats collector
exists, either -- in fact, given the current initialization order
it's physically impossible for syslogger to send to stats collector
because the former is started before the latter's communication
socket is made.

I haven't actually heard a use-case for exposing the current log file name
anyway.  But if somebody convinced me that there is one, I should think
that the way to implement it is to report the actual *name*, not
components out of which you could reconstruct the name only by assuming
that you know everything about the current syslogger configuration and
the code that builds log file names.  That's obviously full of race
conditions and code-maintenance hazards.
        regards, tom lane



Re: get current log file

From
Robert Haas
Date:
On Fri, Feb 26, 2016 at 8:31 AM, Armor <yupengstone@qq.com> wrote:
> I think I know what you are concerned about. May be I did not explain my
> solution very clearly.
> (i) Using a variable named last_syslogger_file_time replace
> first_syslogger_file_time in syslogger.c. When postmaster initialize logger
> process,   last_syslogger_file_time will be assign the time stamp when
> logger start, then fork the child process for logger. Later logger will
> create a log file based on last_syslogger_file_time . And
> last_syslogger_file_time in the postmaster process will be inherited by
> other  auxiliary processes
> (ii) when pgstat process initialize, it will read  last_syslogger_file_time
> from pg stat file of last time (because pgstat process will write it to pg
> stat file). And then pgstat process will get last_syslogger_file_time
> inherit from postmaster,  if this version of  last_syslogger_file_time is
> larger then that read from the stat file, it means logger create a new log
> file so use it as the latest value; else means pgstat process crashed
> before, so it need to use the value from stat file as the latest.
> (iii) when logger rotate a log file, it will assign time stamp to
> last_syslogger_file_time  and send it to pg_stat process. And pg_stat
> process will write last_syslogger_file_time to stat file so can be read by
> other backends.
> (iiii) Adding a stat function named pg_stat_get_log_file_name, when user
> call it, it will read  last_syslogger_file_time from stat file and construct
> the log file name based on log file name format and
> last_syslogger_file_time, return the log file name eventually.
>
> However, there is a risk for this solution: when logger create a new log
> file and then try to send new last_syslogger_file_time to pg_stat process,
> and pg_stat process crash at this moment, so the new pg_stat process cannot
> get the latest  last_syslogger_file_time. However, I think this case is a
> corner case.

I don't think we're going to accept this feature if it might fail in
corner cases.  And that design seems awfully complex.

The obvious way to implement this, to me at least, seems to be for the
syslogger to write a file someplace in the data directory containing
the name of the current log file.  When it switches log files, it
rewrites that file.  When you want to know what the current logfile
is, you read that file.

But there's one thing I'm slightly baffled about: why would you
actually need this?  I mean, it seems like a good idea to set
log_filename to a pattern that makes the name of the current logfile
pretty well predictable.  If not, maybe you should just fix that.
Also, if not on Windows, if you do get confused about which logfile is
active, you could just use lsof on the log_directory to figure out
which file the syslogger has open.  I just can't really remember
having a problem with this, and I'm wondering why someone would.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
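
The pointer-file approach Robert describes above (the syslogger writes the current log file name to a file in the data directory and rewrites it on rotation) can be sketched in Python as follows. This is a minimal illustration under assumptions, not code from the thread: the file name current_log_name and both function names are hypothetical, and the write-then-rename step is one possible way to keep readers from seeing a half-written name.

```python
import os
import tempfile

# Hypothetical file name; the thread does not name the file.
POINTER_FILE = "current_log_name"


def publish_current_log(data_dir, log_file_name):
    """Record the active log file name, replacing any previous value.

    The name is written to a temporary file in the same directory and
    then renamed over the pointer file, so a reader sees either the old
    name or the new one, never a partial write.
    """
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, prefix=POINTER_FILE + ".")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(log_file_name + "\n")
        os.replace(tmp_path, os.path.join(data_dir, POINTER_FILE))
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        os.unlink(tmp_path)
        raise


def read_current_log(data_dir):
    """Return the published log file name, or None if none was written."""
    try:
        with open(os.path.join(data_dir, POINTER_FILE)) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None
```

A writer would call publish_current_log() each time it switches log files, and any client would call read_current_log() on demand. The name can still become stale between the read and its use.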



Re: get current log file

From
Euler Taveira
Date:
On 26-02-2016 08:03, Robert Haas wrote:
> I don't think we're going to accept this feature if it might fail in
> corner cases.  And that design seems awfully complex.
> 
Agree.

> The obvious way to implement this, to me at least, seems to be for the
> syslogger to write a file someplace in the data directory containing
> the name of the current log file.  When it switches log files, it
> rewrites that file.  When you want to know what the current logfile
> is, you read that file.
> 
That is not an elegant solution, but it is simple. However, it is another
file in PGDATA. I can live with that, but if we have consensus, let's make
it optional.

> But there's one thing I'm slightly baffled about: why would you
> actually need this?
> 
The use case I have in mind is consuming the log file with a tool like
logstash. In this case, logstash accepts patterns, and you can also use
syslog for it.


-- 
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
 



Re: get current log file

From
Tom Lane
Date:
Euler Taveira <euler@timbira.com.br> writes:
> On 26-02-2016 08:03, Robert Haas wrote:
>> But there's one thing I'm slightly baffled about: why would you
>> actually need this?

> The use case I have in mind is consume log file by using a tool like
> logstash. In this case, logstash accepts patterns and you can also use
> syslog for it.

This needs to be explained a lot more clearly than it has been so far,
else we are going to reject this proposed feature as being more code and
more overhead than is justified.  Exactly why would you need a pointer to
the current log file, rather than just configuring whatever tool you use
to vacuum up everything in the pg_log directory?  Why would this use-case
not suffer from nasty race conditions (ie, what happens when current log
file changes immediately before or immediately after you look at the
pointer)?
        regards, tom lane
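
The alternative Tom raises above, having the tool consume everything in the log directory instead of following a pointer, can be sketched in Python like this. It is a minimal illustration under assumptions: read_new_log_data and its offsets dictionary are hypothetical names, name order is assumed to match creation order (true for timestamp-style file names), and log files that get truncated (log_truncate_on_rotation) are not handled.

```python
import os


def read_new_log_data(log_dir, offsets):
    """Return text appended to any file in log_dir since the last call.

    offsets maps file name -> number of bytes already consumed and is
    updated in place, so no pointer to the "current" file is needed.
    """
    chunks = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            # Skip what earlier calls already returned for this file.
            f.seek(offsets.get(name, 0))
            data = f.read()
        if data:
            offsets[name] = offsets.get(name, 0) + len(data)
            chunks.append(data.decode("utf-8", errors="replace"))
    return "".join(chunks)
```

Because every file is checked on each call, a rotation that happens between two calls loses nothing: the tail of the old file and the start of the new one are both picked up on the next call.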



Re: get current log file

From
Euler Taveira
Date:
On 26-02-2016 11:50, Tom Lane wrote:
> This needs to be explained a lot more clearly than it has been so far,
> else we are going to reject this proposed feature as being more code and
> more overhead than is justified.  Exactly why would you need a pointer to
> the current log file, rather than just configuring whatever tool you use
> to vacuum up everything in the pg_log directory?  Why would this use-case
> not suffer from nasty race conditions (ie, what happens when current log
> file changes immediately before or immediately after you look at the
> pointer)?
> 
Those are good concerns. Also, we already have emit_log_hook, which could
grab server log messages. A small extension using the hook (there are
some out there) could be used with a log-consuming tool.


-- 
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
 



Re: get current log file

From
Tom Lane
Date:
Euler Taveira <euler@timbira.com.br> writes:
> Those are good concerns. Also, we already have emit_log_hook that could
> grab server log messages. A small extension using the hook (there are
> some out there) could be use with a log consuming tool.

Hmmm ... emit_log_hook runs in the process calling elog, no?  That would
not have any special visibility into the state of syslogger.
        regards, tom lane



Re: get current log file

From
"Armor"
Date:
I think I know what you are concerned about. Maybe I did not explain my solution very clearly.
(i) Replace first_syslogger_file_time in syslogger.c with a variable named last_syslogger_file_time. When the postmaster initializes the logger process, last_syslogger_file_time is assigned the timestamp of the logger start, and then the logger child process is forked. The logger later creates a log file based on last_syslogger_file_time. The postmaster's copy of last_syslogger_file_time is inherited by the other auxiliary processes.
(ii) When the pgstat process initializes, it reads last_syslogger_file_time from the stat file left by the previous run (because the pgstat process writes it there). It then compares that value with the last_syslogger_file_time inherited from the postmaster. If the inherited value is larger than the one read from the stat file, the logger has created a new log file, so the inherited value is used as the latest. Otherwise the pgstat process crashed earlier, so the value from the stat file is used as the latest.
(iii) When the logger rotates a log file, it assigns the new timestamp to last_syslogger_file_time and sends it to the pgstat process. The pgstat process writes last_syslogger_file_time to the stat file so that it can be read by other backends.
(iv) Add a stat function named pg_stat_get_log_file_name. When a user calls it, it reads last_syslogger_file_time from the stat file, constructs the log file name from the log file name format and last_syslogger_file_time, and returns that name.

However, this solution carries a risk: if the logger creates a new log file and tries to send the new last_syslogger_file_time to the pgstat process, and the pgstat process crashes at that moment, the new pgstat process cannot get the latest last_syslogger_file_time. However, I think this is a corner case.
------------------
Jerry Yu
https://github.com/scarbrofair
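
The startup comparison in step (ii) above, and the corner case that follows from it, can be sketched in Python as below. This is only an illustration of the described logic: the function name reconcile_file_time is hypothetical, and treating a missing stat file as None is an assumption not stated in the message.

```python
def reconcile_file_time(inherited_time, stat_file_time):
    """Pick the value a restarted pgstat process would treat as latest.

    inherited_time: value copied from the postmaster at fork.
    stat_file_time: value read from the previous stat file, or None
    if no stat file exists (assumption).
    """
    if stat_file_time is None or inherited_time > stat_file_time:
        # The logger started a new file after the stat file was written.
        return inherited_time
    # Otherwise pgstat is assumed to have crashed; trust the stat file.
    return stat_file_time


if __name__ == "__main__":
    # Corner case: the logger rotated at time 300, but pgstat crashed
    # before receiving the message. The stat file still says 200 and
    # the postmaster's copy says 100, so the stale 200 is returned.
    print(reconcile_file_time(100, 200))
```

Neither input to the comparison knows about the rotation at time 300, so the reported name stays stale until the next rotation message arrives.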
 


------------------ Original ------------------
From:  "Tom Lane";<tgl@sss.pgh.pa.us>;
Date:  Thu, Feb 25, 2016 10:47 PM
To:  "Robert Haas"<robertmhaas@gmail.com>;
Cc:  "Euler Taveira"<euler@timbira.com.br>; "Armor"<yupengstone@qq.com>; "Alvaro Herrera"<alvherre@2ndquadrant.com>; "Pgsql Hackers"<pgsql-hackers@postgresql.org>;
Subject:  Re: [HACKERS] get current log file


Re: get current log file

From
"Armor"
Date:
Yes, if we cannot find a perfect solution, we will need to wait.
Actually, our customer needed a unified interface for accessing the status of the database, so we implemented it.

------------------
Jerry Yu
https://github.com/scarbrofair
 


------------------ Original ------------------
From:  "Robert Haas";<robertmhaas@gmail.com>;
Date:  Fri, Feb 26, 2016 07:33 PM
To:  "Armor"<yupengstone@qq.com>;
Cc:  "Tom Lane"<tgl@sss.pgh.pa.us>; "Euler Taveira"<euler@timbira.com.br>; "Alvaro Herrera"<alvherre@2ndquadrant.com>; "pgsql-hackers"<pgsql-hackers@postgresql.org>;
Subject:  Re: [HACKERS] get current log file
